[time 271] Other's thoughts on randomness & statistics

Stephen P. King (stephenk1@home.com)
Mon, 03 May 1999 08:33:29 -0400

Subject: Re: True Randomness & The Law Of Large Numbers
Date: 18 Apr 1999 00:00:00 GMT
From: hrubin@odds.stat.purdue.edu (Herman Rubin)
Organization: Purdue University Statistics Department
Newsgroups: sci.crypt

In article <3714be1c.843122@nntp.ix.netcom.com>,
R. Knauer <rcktexas@ix.netcom.com> wrote:
>On 14 Apr 1999 09:11:32 -0500, hrubin@odds.stat.purdue.edu (Herman
>Rubin) wrote:

>>>I meant for finite sequences. An infinite sequence must be Borel
>>>normal, which is a quantitative entity. (But see below for further

>>I suggest we ignore infinite sequences. We will never observe one.

>It is interesting to note that Billingsley begins his book on
>probability and measure with Borel normality and infinite sequences.
>He seems to be saying that only in such a manner can one determine the
>precise meaning of probability.

This is an attempt to define probability. It is neither necessary
nor useful.

It is most unfortunate that people feel it must be defined. One need
only assume it exists, and has certain properties. Then it follows
that almost all infinite sequences of independent identically
distributed trials will have Borel normality. This holds whether
or not there will be infinite sequences. If it helps you understand
probability, fine; if not, ignore it.
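[Editorial illustration, not part of the original post.] The claim about
Borel normality can be seen numerically on finite prefixes: for independent
fair bits, the relative frequency of every k-bit block drifts toward 2^-k as
the sequence grows. The sketch below assumes Python's `random` module as a
stand-in for independent fair trials; the function names are hypothetical,
and a finite run can only suggest, never establish, the almost-sure limit.

```python
import random
from collections import Counter

def block_frequencies(bits, k):
    """Relative frequency of each overlapping k-bit block in the sequence."""
    total = len(bits) - k + 1
    counts = Counter(tuple(bits[i:i + k]) for i in range(total))
    return {block: c / total for block, c in counts.items()}

def max_deviation(bits, k):
    """Largest gap between an observed block frequency and the ideal 2**-k."""
    freqs = block_frequencies(bits, k)
    ideal = 2.0 ** -k
    # Blocks that never occur have frequency 0 and so deviate by `ideal`.
    missing = ideal if len(freqs) < 2 ** k else 0.0
    return max([abs(f - ideal) for f in freqs.values()] + [missing])

rng = random.Random(1999)  # fixed seed so the run is repeatable
for n in (1_000, 100_000):
    bits = [rng.getrandbits(1) for _ in range(n)]
    # Deviations for block lengths 1, 2 and 3 should shrink as n grows.
    print(n, [round(max_deviation(bits, k), 4) for k in (1, 2, 3)])
```

The deviations are expected to shrink roughly like 1/sqrt(n), which is the
law of large numbers at work on each block frequency.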

>>No, the statement is not at all harsh. Like most people, physicists
>>are comfortable only with the ideas coming from games of chance with
>>equally likely alternatives, so they set up this "ensemble" from
>>which one selects "at random". It is necessary to get away from
>>this restrictive idea.

>Your statement is very interesting, because it constitutes a challenge
>on two fronts:

>1) That physicists are in error when they use their traditional
>notions of probability, including ensembles;

They are postulating something not needed. Where is the ensemble?
This is a real violation of Occam's Razor, far more serious than
the belief in simplicity.

>2) That physicists must adopt new ways of thinking, about which you
>can comment further.

The one new way is to allow the existence of something without
defining it. Distance and time are not really defined any more.
Physicists are quite willing to live with limitations on how
accurately they can be determined, especially from direct observation,
which is rarely being done now. They have yet to come to terms
with the same thing in probability.

>I look forward to those further comments from you.

>Are physicists who are exploring the "physics of information" at
>places like the Santa Fe Institute and others, going in the right
>direction according to your thinking?

I am not sure what they are doing.

>>The concise definition is that any measurable event determined by
>>the entire collection of observations has a probability.

>That is true for orthodox QM. Those probabilities are determined from
>"probability amplitudes", which are related to projections of the wave
>vector in the Hilbert space representation for which the Hamiltonian
>matrix is diagonal.

This is not quite right. Their projections commute, and hence they
can be simultaneously diagonalized.

>Apparently I am missing something from your original comment that
>there are joint probabilities in QM.

For any simultaneous observables, there is a joint probability
distribution of the set of values of all of them. If observables
do not come from commuting operators, it may still be true that
the formal computation of their joint distribution can be a
probability distribution, but this is not always the case.
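[Editorial illustration, not part of the original post.] One standard way
to write the statement above, assuming observables A and B with discrete
spectra, spectral projections E_a and F_b, and a unit state vector psi (all
of this notation is introduced here, not taken from the post):

```latex
% Commuting case: E_a F_b = F_b E_a is itself a projection, so
\[
  P(a,b) \;=\; \langle \psi \,|\, E_a F_b \,|\, \psi \rangle
         \;=\; \lVert E_a F_b\, \psi \rVert^{2} \;\ge\; 0,
  \qquad \sum_{a,b} P(a,b) = 1 .
\]
% Non-commuting case: E_a F_b need not be a projection, and the same
% formal expression can fail to be real or nonnegative, so it does
% not always define a probability distribution.
```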


>>The scale for randomness I am using is similar to the expected number
>>of bits which would have to be changed, knowing all the probabilities,
>>and having a TRNG available, to get a TRNG.

>That appears at first glance to be a kind of entropic measure. IOW, a
>TRNG is 100% random if it has 100% entropy. Is that not just a statement
>of the independence of each of the bits of the sequence, that they can
>be selected equiprobably from the sample space {0,1}? After all, if
>one of those bits is fixed (i.e., it is known), then the entropy drops
>from its maximum by one bit, and the TRNG loses that amount of
>randomness, say to a level of 99.999% or whatever.

It is not merely a matter of total entropy. If one has a discrete
distribution with total Wiener-Shannon information k, a Huffman
coding will produce a result with the expected number of bits
less than k+1, and one can produce a coding so that the probabilities
of the events can be exactly reproduced with the expected number
of perfect random bits used less than k+2. But one need not be able
to produce a single random bit from the observation.
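[Editorial illustration, not part of the original post.] The first bound
above, that a Huffman code has expected length less than k+1 for a
distribution of information k, can be checked directly. The sketch below
builds Huffman codeword lengths with the standard library; the example
distribution and the function names are hypothetical, and the second bound
(k+2 random bits to reproduce the probabilities exactly) is not implemented
here.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return the Huffman codeword length for each probability in `probs`."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, unique tie-breaker, symbol indices in subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

def entropy_bits(probs):
    """Shannon information of the distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

probs = [0.5, 0.2, 0.15, 0.1, 0.05]  # hypothetical example distribution
lengths = huffman_code_lengths(probs)
k = entropy_bits(probs)
expected = sum(p * l for p, l in zip(probs, lengths))
print(f"k = {k:.4f} bits, expected Huffman length = {expected:.4f} bits")
```

For this distribution the expected length lands between k and k+1, as the
bound requires.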

If one has 999,999 perfect random bits, and adds these mod 2 to
produce bit 1,000,000, the information will be 999,999 bits.
But if one is interested in the parity of the number of 1's,
this is useless. My "definition" would not accept this. Now
it would be difficult to test for this; however, I do not consider
this type of failure of randomness to be of much concern for
physical generators, while I certainly would for pseudo-random

>If that is correct, how do you go about measuring this entropic-like
>property quantitatively without having to sample all or a large
>fraction of the possible sequences (which is overwhelmingly greater
>than just measuring the properties of one relatively small sequence)?

You miss the idea of statistical testing. It is necessary to balance
risks, and the fundamental situation in probability is the experiment
which cannot ever be replicated.
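[Editorial illustration, not part of the original post.] One common
frequency ("monobit") test shows what testing a single observed sequence
looks like. It is not a procedure taken from the post. The threshold `alpha`
is an assumed value: it is the accepted risk of rejecting a good generator,
and lowering it raises the risk of passing a bad one.

```python
import math

def monobit_p_value(bits):
    """Two-sided p-value for the hypothesis that bits are fair and independent."""
    n = len(bits)
    # Map 0/1 to -1/+1; under the hypothesis the sum is roughly N(0, n).
    s = abs(sum(2 * b - 1 for b in bits)) / math.sqrt(n)
    return math.erfc(s / math.sqrt(2))

alpha = 0.01  # assumed risk of falsely rejecting a good generator

balanced = [0, 1] * 500   # perfectly balanced counts
stuck = [1] * 1000        # a generator stuck at 1
for name, seq in (("balanced", balanced), ("stuck", stuck)):
    p = monobit_p_value(seq)
    print(name, "reject" if p < alpha else "do not reject")
```

The balanced sequence passes even though it is plainly not random, which is
why a single test balances risks and settles nothing by itself.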

This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558
