**Matti Pitkanen** (*matpitka@pcu.helsinki.fi*)

*Thu, 6 May 1999 20:11:33 +0300 (EET DST)*

**Next message:** Hitoshi Kitada: "[time 283] correction on numbering" **Previous message:** Stephen Paul King: "[time 281] Re: Fisher information"

I am enthusiastic about the information-theoretic interpretation of action and have proposed my own TGD-based interpretation. I am, however, not enthusiastic about some features of Fisher information.

a) Each type of measurement defines its own action, and this looks very suspicious: we perform all kinds of measurements, not only position measurements.

b) The use of a specific information measure also seems suspicious.

c) I got the impression that one can derive Laplace equations from Fisher information, but I could not understand how d'Alembertian-type equations are obtained naturally: the problem is the Minkowski signature, Box = partial_t^2 - nabla^2. If I understood correctly (perhaps I didn't), this requires allowing an imaginary value for the parameter appearing in Fisher information (I think it was called theta).

d) The decomposition of the action I-J into a difference of two kinds of information: the total information J contained in the system, and the maximum achievable information. If I understood correctly, this decomposition fails to be General Coordinate Invariant in the case of the Maxwell action, for which I corresponds to magnetic energy and J to electric energy. I might be wrong: any opinions?
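A side remark on point c): the need for an "imaginary value of the parameter" can be read as a Wick rotation, under which the d'Alembertian turns into (minus) a Euclidean Laplacian. This is only a standard signature observation, not a claim about how the Fisher-information derivation discussed above actually proceeds:

```latex
\[
  \Box = \partial_t^2 - \nabla^2 ,
  \qquad t \;\to\; -i\tau
  \;\Longrightarrow\;
  \partial_t^2 \;\to\; -\partial_\tau^2 ,
\]
\[
  \Box \;\longrightarrow\; -\bigl(\partial_\tau^2 + \nabla^2\bigr)
  \;=\; -\Delta_E .
\]
```

So a variational principle that naturally produces elliptic (Laplace-type) equations reaches hyperbolic (wave-type) equations only after some variable is continued to imaginary values.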

[In TGD the negative of the Kahler function (= the absolute minimum of the Kahler action) has an interpretation as an entropy-type measure for the cognitive resources of a 3-surface, measured as the number of degenerate absolute minima going through this surface. Absolute minimization of the Kahler action implies that cognitive resources are maximal. Also the quantum criticality of TGD can be understood in terms of the optimization of cognition: the universe is maximally interesting and intelligent at quantum criticality.]

MP

On Thu, 6 May 1999, Stephen Paul King wrote:

> On Wed, 5 May 1999 21:37:26 -0700, " r e s" <XXrs.1@ix.netcom.com>
> wrote:
>
> > Sometimes it's hard to see the simple "intuitive meaning"
> > of Fisher information in spite of (or because of?) the
> > increasing mathematical interest in it.
> >
> > Here is a sketch in heuristic terms:
> >
> > Suppose that an observation x has a probability density
> > parameterized by c, viz., p(x|c). (Let's look at the
> > case of 1-dimensional c, and note that the following
> > easily generalizes to higher dimensions.)
> >
> > After x is observed, one may ask "what value(s) of c
> > would assign the greatest probability density p(x|c)
> > to the actual x that has been observed?" This naturally
> > leads one to consider the shape of p(x|c), or (as it
> > turns out to be more convenient) of ln p(x|c), as a
> > function of c for the fixed observed x. Thus ln p(x|c)
> > is a curve in the parameter space of c, and one is
> > interested in where its peaks are (i.e. the values of c
> > that maximize ln p(x|c)), and how "sharply peaked" the
> > curve is. The sharper ln p(x|c) is, i.e. the greater
> > the curvature wrt c, the greater is the "information
> > about c" provided by the observation x. Greater
> > curvature of ln p(x|c) wrt c means that more
> > information about c is conveyed by x, because the
> > "most likely" values of c are thereby more sharply
> > discriminated.
> >
> > Now the curvature (wrt c) of ln p(x|c) is -@@ ln p(x|c),
> > where @ denotes the derivative wrt c. This is sometimes
> > called the "observed information" about c provided by the
> > given observation x. The Fisher information is the
> > sampling average of this, namely I(c) = E[-@@ ln p(x|c)].
> >
> > It's also interesting to notice that ln p(x|c), expanded
> > about a maximum at, say, c0, is
> >
> > ln p(x|c) = ln p(x|c0) - (1/2) I(c0) (c - c0)^2 + ...
> > or
> > p(x|c) = p(x|c0) * exp[-(1/2) I(c0) (c - c0)^2 + ...]
> >
> > which, to a Bayesian, opens a door in some circumstances
> > to approximating the posterior distribution of c given x
> > as Normal with mean E(c|x) = c0 and variance var(c|x) = 1/I(c0).
> >
> > BTW,
> >
> > E[-@@ ln p(x|c)] = var[@ ln p(x|c)] = E[(@ ln p(x|c))^2]
> >
> > where the expectation E and variance var are wrt the
> > sampling distribution p(x|c).
> >
> > --
> > r e s (Spam-block=XX)
> >
> > Stephen Paul King <stephenk1@home.com> wrote ...
> > > stephenk1@home.com (Stephen Paul King) wrote:
> > [...]
> > > I have assembled a link page on Fisher information and have a
> > > definition: "The Fisher information about a parameter \theta is
> > > defined to be the expectation of the second derivative of the
> > > log-likelihood."
> > > http://members.home.net/stephenk1/Outlaw/fisherinfo.html
> > > But I am still needing an intuitive grasp of what it means. :)
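The identities quoted above, E[-@@ ln p] = var[@ ln p] = E[(@ ln p)^2], are easy to check numerically. A minimal sketch, assuming a normal model x ~ N(c, sigma^2) with known sigma (my illustrative choice, not an example from the thread), for which I(c) = 1/sigma^2 exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Model (illustrative assumption): x ~ Normal(c, sigma^2), sigma known.
# Then   @ ln p(x|c)  = (x - c) / sigma^2     (the "score"),
# and   -@@ ln p(x|c) = 1 / sigma^2           (observed information),
# so the Fisher information is I(c) = 1 / sigma^2 exactly.
sigma, c = 2.0, 1.5
n = 200_000
x = rng.normal(c, sigma, size=n)

score = (x - c) / sigma**2

I_exact         = 1.0 / sigma**2      # E[-@@ ln p]; constant for this model
I_from_square   = np.mean(score**2)   # E[(@ ln p)^2]
I_from_variance = np.var(score)       # var[@ ln p]  (E[score] = 0 here)

# A Bayesian reading of the same number: the posterior of c given the data
# is approximately Normal with variance 1 / I, per the expansion above.
print(I_exact, I_from_square, I_from_variance)
```

For this particular model the observed information is constant, so observed and Fisher information coincide; for other models E[-@@ ln p] and E[(@ ln p)^2] agree only on average over the sampling distribution.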


This archive was generated by hypermail 2.0b3 on Sun Oct 17 1999 - 22:10:30 JST