**Stephen P. King** (*stephenk1@home.com*)

*Sun, 18 Apr 1999 13:58:41 -0400*

**Messages sorted by:** [ date ] [ thread ] [ subject ] [ author ]
**Next message:** Stephen P. King: "[time 247] Re: [time 237] Direction of time or Free will"
**Previous message:** Stephen P. King: "[time 245] Discussion of Mackey's work with Huw Price"

> Date: Wed, 23 Sep 1998 22:22:25 -0400
> To: Michael C. Mackey <mackey@cnd.mcgill.ca>
> From: Stephen Paul King <spking1@mindspring.com>
> Subject: "Mackey's mistake"
>
> Dear Prof. Mackey,

Return-Path: <mackey@mines.cnd.mcgill.ca>

Date: Thu, 24 Sep 1998 21:55:40 -0400 (EDT)

From: Michael Mackey <mackey@cnd.mcgill.ca>

To: Stephen Paul King <spking1@mindspring.com>

Subject: Re: "Mackey's mistake"

Dear Stephen King,

Thank you for your question. This is not the first time it has been directed at me; the first was last May, when it was pointed out that Mr. Hillman had posted a comment about our (Lasota's and my) work that was certainly open to interpretation. Most people interpreted the comment to mean that we had made obvious mistakes.

I will try to reply to your question about Hillman's comments as clearly and simply as possible, setting out how I view the situation.

As I read the comments that Hillman sent you, there seem to be two areas of disagreement:

1. Hillman objects that we have picked a definition of entropy for a continuous state variable x, namely

H(f) = - \int f(x) log f(x) dx,

that is incompatible with the discrete quantity

H(p) = - sum_j p_j log p_j.

However, if you go to Chapter 9 of our Chaos, Fractals and Noise, we NEVER claimed that this was the case. Nor, as he claims, did we ever claim that our conditional entropy was "an integral analog of ...". Simply not true. We simply DEFINED quantities (entropy and conditional entropy) and then set out to prove the properties that they satisfy.
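The non-analogy between the two definitions is worth making concrete: the Shannon entropy of a binned version of a density f does not converge to the integral expression as the bin width shrinks, but instead behaves like H(f) - log(dx) and diverges (a standard observation, found e.g. in Cover & Thomas). A minimal Python sketch for a Gaussian density -- an editorial illustration, not part of the original correspondence:

```python
import numpy as np

def h_differential(f, x, dx):
    """H(f) = -\\int f log f dx, approximated by a simple Riemann sum."""
    fx = f(x)
    return -np.sum(np.where(fx > 0, fx * np.log(fx), 0.0)) * dx

def H_discrete(f, x, dx):
    """Shannon entropy -sum p_j log p_j of the bin masses p_j ~ f(x_j) dx."""
    p = f(x) * dx
    p = p / p.sum()          # normalize the bin masses
    p = p[p > 0]
    return -np.sum(p * np.log(p))

gauss = lambda x: np.exp(-x * x / 2) / np.sqrt(2 * np.pi)

for dx in (0.1, 0.01, 0.001):
    x = np.arange(-12, 12, dx)
    # H_discrete tracks h_differential - log(dx), so it diverges as dx -> 0
    print(dx, H_discrete(gauss, x, dx), h_differential(gauss, x, dx) - np.log(dx))
```

The discrete entropy grows without bound as the binning is refined, so the integral H(f) is not its limit -- which is exactly why the two quantities cannot be treated as simple analogs.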

2. Hillman's other objection is that we have picked the "incorrect" definitions of entropy and conditional entropy. Incorrect by whose definition?

Our view is the following (one can take any point of view one wishes, but usually one is picked with some justification): the concept of entropy was developed in PHYSICS, not mathematics. We defined the Boltzmann-Gibbs entropy in complete analogy with the definitions of Boltzmann and Gibbs in their attempts to find dynamical analogs of thermodynamic behaviour. Namely, if one consults:

* L. Boltzmann, "Lectures on Gas Theory", reprinted by Dover (1995): there one finds a quantity H defined (page 50) in terms of the single-particle density, which Boltzmann then relates to the entropy in the note on page 133 by "-H = entropy". Rewriting Boltzmann's expressions, one obtains an expression that looks like our definition of entropy (but one must be careful about the interpretation of "f"--see below);

* J.W. Gibbs, "Elementary Principles in Statistical Mechanics", reprinted by Dover (1960): there you will find extensive treatment of what he calls the "index of probability", which, in modern terms, is the log of the density function f, i.e. "index of probability = log f". Gibbs on page 44 says "the average index of probability with its sign reversed corresponds to entropy". Now the average of the index of probability with its sign reversed is simply

- \int f(x) log f(x) dx,

which is precisely the definition of entropy used in Lasota and Mackey, and in my 1992 book "Time's Arrow", published by Springer.
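Gibbs's prescription -- entropy as the average of the index of probability with its sign reversed -- can be checked numerically: for a Gaussian density, the sample average of -log f reproduces the closed form (1/2) log(2 pi e sigma^2). A small illustrative Python sketch (editorial, not Mackey's):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0
samples = rng.normal(0.0, sigma, size=200_000)

# Gibbs's "index of probability" is log f(x); entropy is the average of -log f
log_f = -0.5 * (samples / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))
entropy_mc = -log_f.mean()

# closed-form differential entropy of a Gaussian
closed_form = 0.5 * np.log(2 * np.pi * np.e * sigma**2)
print(entropy_mc, closed_form)   # the two agree to Monte Carlo accuracy
```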

It is important to realize when reading Boltzmann and Gibbs that they have two very different interpretations of what "f" means--and this is the basis of the fundamental philosophical differences between the two approaches. E.T. Jaynes pointed out these differences very clearly in an article in the American Journal of Physics in the early 1960's (I think the reference is AJP (1963), vol. 31, page 66 et seq., but I don't have the paper here at home; it is entitled "Boltzmann versus Gibbs entropy" or something like that--if you can't find it, write and I'll get the exact reference), where he showed that the Gibbs definition of entropy gives correct results for a gas of interacting particles, whereas the Boltzmann result is in error because it neglects the interaction energies between molecules. It is only in the case of non-interacting particles that the two definitions of entropy give identical results.

Numerous other, more recent derivative sources in the physics literature can be consulted that use the same sign conventions for the entropy. It is only when one strays into the mathematics literature that things become altered.

3. The issue of the sign convention in the definition of the conditional entropy follows directly from the convention adopted in the definition of entropy. If you wish to have them agree in obvious cases, then the conditional entropy must also carry a negative sign.
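In the discrete case the sign relationship is easy to verify: the conditional entropy sum_j p_j log(q_j/p_j) is exactly the negative of the standard Kullback-Leibler divergence, and is therefore non-positive. An illustrative Python check (editorial, not from the letter):

```python
import numpy as np

def H_cond(p, q):
    """Lasota-Mackey sign convention: sum p_j log(q_j/p_j), always <= 0."""
    return np.sum(p * np.log(q / p))

def D_kl(p, q):
    """Standard Kullback-Leibler divergence: sum p_j log(p_j/q_j), always >= 0."""
    return np.sum(p * np.log(p / q))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

# the two quantities differ only by sign
print(H_cond(p, q), D_kl(p, q))
```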

This, then, I hope explains what I feel are the differences in outlook between Hillman and ourselves. If any point is unclear, please feel free to write back and I will try to clarify it. Lasota and I have taken a point of view derived from the origins of the entropy concept--namely, the attempt, begun by Helmholtz, Jeans, Boltzmann, Clausius, Gibbs and others, to find a satisfactory dynamical foundation for the laws of thermodynamics.

I want to thank you for writing and asking me to respond to Hillman's comments. I personally find it disturbing to have my work labeled as incorrect or misleading in public (as was the case last May) and in private forums (as in the reply he wrote to you) without being given the chance to respond directly. If you agree, I would like to forward a copy of this note to Hillman.

On another issue, I was intrigued by the comment, at the end of your note of yesterday evening, about your work with Kitada. If and when you feel like sharing it, I would be interested to know what you are doing.

Kind regards,

Michael Mackey

On Wed, 23 Sep 1998, Stephen Paul King wrote:

> Dear Prof. Mackey,
>
> What do you make of this?
>
> ---
> Return-Path: <hillman@math.washington.edu>
> Date: Fri, 27 Mar 1998 11:25:49 -0800 (PST)
> From: Chris Hillman <hillman@math.washington.edu>
> To: Stephen Paul King <spking1@mindspring.com>
> Subject: Re: Mackey's mistake
>
> On Fri, 27 Mar 1998, Stephen Paul King wrote:

>
> > Dear Chris,
> >
> > Could you elaborate on Mackey and Lasota's mistake? I have read most of
> > his papers and have talked to Mackey a little about his ideas, and I would
> > like to know where he is going wrong. I really appreciate your interest in
> > the study of entropy and your thoughtful replies to my queries on the
> > newsgroups.
>
> I like to say that "all theorems are either wrong or misunderstood".
>

> Their error is one of -interpretation-. I am never surprised when a
> mathematician makes an error of interpretation (these are often minor), but
> this one is quite serious and very elementary, which makes it all the more
> astonishing. (I'd prefer to believe it's an error rather than a deliberate
> misrepresentation; possibly one of the authors chose to misrepresent
> something without realizing the gravity of their error of omission.)
>

> We're talking about section 9.2 of Chaos, Fractals and Noise. Definition
> 9.2.1 has the wrong sign. Lasota & Mackey define "conditional entropy"
> to be an integral analog of the quantity
>
> H(p|q) = sum_j p_j log (q_j/p_j)
>
> Let's talk about this discrete quantity for a moment.
>

> First, it should be called something else, to avoid confusion with the
> conditional entropies of Shannon. For instance, Cover & Thomas, Elements
> of Information Theory, Wiley, 1991, call it "relative entropy" and define
> it correctly, as do all the other books I've seen. Other authors call it
> "cross entropy" or "divergence" or "discrimination" or "Kullback-Leibler
> entropy".
>

> Second, it is not obvious, but if defined with the correct sign,
>
> D(p||q) = sum_j p_j log (p_j/q_j) = -H(p|q)
>
> is non-negative and moreover -decreases- (i.e. gets closer to zero from
> above) under various operations. See Cover & Thomas.
>

> Third, D(p||q) was introduced by Kullback in a context where it is clear
> that it should be a positive quantity, with the opposite sign from the
> quantity defined by L&M, and -every- author since then -but- L&M defines it
> with that sign (I must have looked at hundreds of books and papers which
> discuss this very interesting quantity). You can easily check this; the
> book by Kullback on statistics and information theory has just been
> reprinted, and I recently saw it in a Borders Books near Washington, DC.
> The best interpretation (and I know about five) of divergence is in terms of
> the theory of types, and is explained in Cover & Thomas. This
> interpretation is very clear and obviously "correct".
>

> Fourth, notice that, written in the form I've given, if you goof and
> write q/p when you mean p/q, you change the sign.
>

> Now, Lasota & Mackey are talking about
>
> D(f||g) = \int f log (f/g)
>
> (defined with the "wrong" sign). It turns out that
>
> H(f) = - \int f log f
>
> is NOT analogous to
>
> H(p) = - sum_j p_j log p_j
>
> but the integral divergence IS the proper analog of the discrete
> divergence. This is a long, long story I don't have time to go into now,
> but some idea might be gained from looking at the book by Guiasu,
> Information Theory and its Applications.
>

> The point is, if you change the sign, you pass from a positive quantity
> which decreases towards zero under various operations, in particular
>
> D(Pf||Pg) <= D(f||g) (*)
>
> where P is a Perron-Frobenius operator.
>
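The contraction (*) that Hillman cites has an elementary discrete counterpart: Kullback-Leibler divergence cannot increase when both distributions are pushed through the same Markov (column-stochastic) kernel, a finite-dimensional stand-in for the Perron-Frobenius operator. A hedged Python sketch, added editorially and not part of the original message:

```python
import numpy as np

rng = np.random.default_rng(1)

def D_kl(p, q):
    """Kullback-Leibler divergence sum p_j log(p_j/q_j)."""
    return np.sum(p * np.log(p / q))

# A random column-stochastic matrix: columns sum to 1, entries >= 0.
# This plays the role of the operator P in the discrete setting.
M = rng.random((4, 4))
M /= M.sum(axis=0, keepdims=True)

p = rng.random(4); p /= p.sum()
q = rng.random(4); q /= q.sum()

# Data-processing inequality: the divergence contracts under M
print(D_kl(p, q), D_kl(M @ p, M @ q))
```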

> Now, Lasota and Mackey say that their quantity obeys the law
>
> H(Pf||Pg) >= H(f||g)
>
> True, but it is very seriously misleading to fail to point out
> that this is a -negative- quantity increasing towards zero; this leaves
> the impression that this law "justifies" the Second Law. In fact it is
> something quite different. Their sign change obscures the correct
> interpretation of divergence according to the theory of types (which makes
> the law (*) quite intuitive).
>

> There's a lot more to this (I've thought hard, though not for several
> years, about the interpretation of divergence), but for the moment this
> will have to suffice, since I'm working hard to finish my thesis by June
> :-) Time permitting, if I get an academic job, I plan to write up a more
> complete critique and mail it to them.
>
> Chris Hillman
>
> ---
>

> I am having difficulties understanding what to make of these statements. I
> am working with Hitoshi Kitada on a model of time, and am trying to see if
> our model satisfies your "exactness" criterion. Thank you for your time. :)
>
> Kind regards,
>
> Stephen Paul King


*This archive was generated by hypermail 2.0b3 on Sun Oct 17 1999 - 22:31:52 JST*