[time 282] Re: [time 281] Re: Fisher information

Matti Pitkanen (matpitka@pcu.helsinki.fi)
Thu, 6 May 1999 20:11:33 +0300 (EET DST)

Next message: Hitoshi Kitada: "[time 283] correction on numbering"
Previous message: Stephen Paul King: "[time 281] Re: Fisher information"

I am enthusiastic about the information-theoretic interpretation
of action and have proposed my own TGD-based interpretation.
I am, however, not enthusiastic about some features of
Fisher information.

a) Each type of measurement defines its own action and this
looks very suspicious. We perform all kinds of measurements, not
only position measurements.

b) The use of one specific information measure also seems suspicious.

c) I got the impression that one can derive Laplace equations
from Fisher information, but I could not understand how
d'Alembertian-type equations are obtained naturally. The problem is
the Minkowski signature: Box = partial_t^2 - nabla^2. If I understood
correctly (perhaps I didn't), this requires allowing imaginary values
for the parameter appearing in Fisher information; I think it was
called theta.

d) The action I-J is decomposed into a difference of two kinds of
information: the total information J contained by the system and the
maximum achievable information. If I understood correctly, this
decomposition fails to be general coordinate invariant
in the case of the Maxwell action, for which I corresponds to magnetic
energy and J to electric energy. I might be wrong: any opinions?
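
The signature problem in point c) can be sketched as follows. This is
only an illustration under stated assumptions, not a formula taken from
the thread: the real amplitude q with p = q^2, the coordinates x_mu, the
shift-invariant form of the functional, and the substitution x_0 = i t
(units with c = 1) are all assumed here. With real coordinates the
positive-definite functional gives a Laplace-type equation; the
Minkowski sign appears only once one coordinate is taken imaginary.

```latex
\begin{align}
  % Assumed shift-invariant Fisher-type functional for p = q^2
  I[q] &= 4 \int d^4x \, \sum_{\mu=0}^{3}
          \left( \frac{\partial q}{\partial x_\mu} \right)^2 \\
  % Stationarity, \delta I = 0, with all x_\mu real: 4-dimensional Laplace equation
  \sum_{\mu=0}^{3} \frac{\partial^2 q}{\partial x_\mu^2} &= 0 \\
  % Assumed substitution x_0 = i t, so \partial^2/\partial x_0^2 = -\partial_t^2
  \left( \partial_t^2 - \nabla^2 \right) q &= \Box q = 0
\end{align}
```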

[In TGD the negative of the Kahler function (= absolute minimum of the
Kahler action) can be interpreted as an entropy-type measure of the
cognitive resources of a 3-surface, measured as the number of degenerate
absolute minima going through this surface. Absolute minimization of the
Kahler action implies that cognitive resources are maximal. The quantum
criticality of TGD can also be understood in terms of optimization of
cognition: the universe is maximally interesting and intelligent at
quantum criticality.]

MP

On Thu, 6 May 1999, Stephen Paul King wrote:

> On Wed, 5 May 1999 21:37:26 -0700, " r e s" <XXrs.1@ix.netcom.com>
> wrote:
>
> >Sometimes it's hard to see the simple "intuitive meaning"
> >of Fisher information in spite of (or because of?) the
> >increasing mathematical interest in it.
> >
> >Here is a sketch in heuristic terms:
> >
> >Suppose that an observation x has a probability density
> >parameterized by c, viz., p(x|c). (Let's look at the
> >case of 1-dimensional c, and note that the following
> >easily generalizes to higher dimensions.)
> >
> >After x is observed, one may ask "what value(s) of c
> >would assign the greatest probability density p(x|c)
> >to the actual x that has been observed?" This naturally
> >leads one to consider the shape of p(x|c), or (as it
> >turns out to be more convenient) of ln p(x|c), as a
> >function of c for the fixed observed x. Thus ln p(x|c)
> >is a curve in the parameter space of c, and one is
> >interested in where are its peaks (i.e. the values of c
> >that maximize ln p(x|c)), and how "sharply peaked" is
> >the curve. The sharper is ln p(x|c), i.e. the greater
> >the curvature wrt c, the greater is the "information
> >about c" provided by the observation x. Greater
> >curvature of log p(x|c) wrt c means that more
> >information about c is conveyed by x, because the
> >"most likely" values of c are thereby more sharply
> >discriminated.
> >
> >Now the curvature (wrt c) of ln p(x|c) is -@@ ln p(x|c),
> >where @ denotes derivative wrt c. This is sometimes
> >called the "observed information" about c provided by the
> >given observation x. The Fisher information is now the
> >sampling average of this, namely I(c)=E[-@@ ln p(x|c)].
> >
> >It's also interesting to notice that ln p(x|c) expanded
> >about a maximum at, say c0, is
> >
> >ln p(x|c) = ln p(x|c0) - (1/2)I(c0)(c-c0)^2 + ...
> >or
> >p(x|c) = p(x|c0)*exp[-(1/2)I(c0)(c-c0)^2 + ...]
> >
> >which, to a Bayesian, opens a door in some circumstances
> >for approximating the posterior distribution of c given x
> >as Normal with mean E(c|x)=c0, variance var(c|x)=1/I(c0).
> >
> >BTW,
> >E[-@@ ln p(x|c)] = var[@ ln p(x|c)] = E[(@ ln p(x|c))^2]
> >
> >where the expectation E and variance var are wrt the
> >sampling distribution p(x|c).
> >
> >--
> > r e s (Spam-block=XX)
> >
> >
> >Stephen Paul King <stephenk1@home.com> wrote ...
> >> stephenk1@home.com (Stephen Paul King) wrote:
> >[...]
> >> I have assembled a link page on Fisher information and have a
> >> definition: "The Fisher Information about a parameter \theta is
> >> defined to be the expectation of the second derivative of the
> >> loglikelihood."
> >> http://members.home.net/stephenk1/Outlaw/fisherinfo.html
> >> But I still need an intuitive grasp of what it means. :)
> >
> >
> >
>
>
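
The identity quoted above, E[-@@ ln p(x|c)] = var[@ ln p(x|c)] =
E[(@ ln p(x|c))^2], can be checked numerically. The following is a
minimal sketch, not part of the original posts. It assumes a zero-mean
Normal density whose standard deviation plays the role of c, for which
the exact Fisher information is 2/c^2; the function names score,
observed_information and fisher_estimates are hypothetical and chosen
only for this illustration.

```python
import random
import statistics


def score(x, c):
    """First derivative wrt c of ln p(x|c) for a zero-mean Normal with std c."""
    return -1.0 / c + x * x / c**3


def observed_information(x, c):
    """Negative second derivative wrt c of ln p(x|c) for the same density."""
    return -(1.0 / c**2 - 3.0 * x * x / c**4)


def fisher_estimates(c, n=200_000, seed=0):
    """Monte Carlo estimates of the three expressions for I(c).

    Samples x from p(x|c) and returns the sample versions of
    (E[-@@ ln p], var[@ ln p], E[(@ ln p)^2]).
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, c) for _ in range(n)]
    scores = [score(x, c) for x in xs]
    mean_obs_info = statistics.fmean([observed_information(x, c) for x in xs])
    var_score = statistics.pvariance(scores)
    mean_sq_score = statistics.fmean([s * s for s in scores])
    return mean_obs_info, var_score, mean_sq_score


if __name__ == "__main__":
    c = 1.5
    print("exact I(c) = 2/c^2:", 2.0 / c**2)
    print("Monte Carlo estimates:", fisher_estimates(c))
```

The three estimates should agree with each other and with 2/c^2 up to
sampling noise. Here the observed information depends on x, so the
averaging over the sampling distribution is doing real work.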


This archive was generated by hypermail 2.0b3 on Sun Oct 17 1999 - 22:10:30 JST