Stephen Paul King (email@example.com)
Thu, 06 May 1999 14:40:19 GMT
On Wed, 5 May 1999 21:37:26 -0700, "r e s" <XXrs.firstname.lastname@example.org> wrote:
>Sometimes it's hard to see the simple "intuitive meaning"
>of Fisher information in spite of (or because of?) the
>increasing mathematical interest in it.
>Here is a sketch in heuristic terms:
>Suppose that an observation x has a probability density
>parameterized by c, viz., p(x|c). (Let's look at the
>case of 1-dimensional c, and note that the following
>easily generalizes to higher dimensions.)
>After x is observed, one may ask "what value(s) of c
>would assign the greatest probability density p(x|c)
>to the actual x that has been observed?" This naturally
>leads one to consider the shape of p(x|c), or (as it
>turns out to be more convenient) of ln p(x|c), as a
>function of c for the fixed observed x. Thus ln p(x|c)
>is a curve in the parameter space of c, and one is
>interested in where its peaks are (i.e. the values of c
>that maximize ln p(x|c)), and in how "sharply peaked"
>the curve is there. The sharper the peak of ln p(x|c),
>i.e. the greater its curvature wrt c at the maximum,
>the greater is the "information about c" provided by
>the observation x, because the "most likely" values
>of c are thereby more sharply defined.
>Now the curvature (wrt c) of ln p(x|c) is -@@ ln p(x|c),
>where @ denotes derivative wrt c. This is sometimes
>called the "observed information" about c provided by the
>given observation x. The Fisher information is now the
>sampling average of this, namely I(c)=E[-@@ ln p(x|c)].
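As a rough numerical illustration of the definition above, the sketch below estimates I(c) by averaging the observed information over simulated observations. The model (a zero-mean Normal whose variance is the parameter c), the function names, the finite-difference step h and the sample size are all assumptions chosen for illustration; none of them come from the original post. For this particular model the exact value is I(c) = 1/(2c^2), so the Monte Carlo average can be compared against it.

```python
import math
import random


def log_density(x, c):
    """ln p(x|c) for a zero-mean Normal with variance c (illustrative model)."""
    return -0.5 * math.log(2.0 * math.pi * c) - x * x / (2.0 * c)


def observed_information(x, c, h=1e-3):
    """-@@ ln p(x|c), using a central finite difference in c for @@."""
    second = (
        log_density(x, c + h) - 2.0 * log_density(x, c) + log_density(x, c - h)
    ) / (h * h)
    return -second


def fisher_information_mc(c, n_samples=100_000, seed=0):
    """Monte Carlo estimate of I(c) = E[-@@ ln p(x|c)] with x drawn from p(x|c)."""
    rng = random.Random(seed)
    sd = math.sqrt(c)
    total = 0.0
    for _ in range(n_samples):
        total += observed_information(rng.gauss(0.0, sd), c)
    return total / n_samples


c = 2.0
estimate = fisher_information_mc(c)
exact = 1.0 / (2.0 * c * c)  # closed form for this assumed model only
print("Monte Carlo estimate of I(c):", estimate)
print("closed-form I(c):            ", exact)
```

Note that a single observed_information(x, c) value depends on the particular x drawn and can even be negative away from the maximum; it is the average over the sampling distribution that gives the Fisher information.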
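>It's also interesting to notice that ln p(x|c) expanded
>about a maximum at, say c0, is
>ln p(x|c) = ln p(x|c0) - (1/2)I(c0)(c-c0)^2 + ...
>(the linear term vanishes at a maximum, and the observed
>information -@@ ln p(x|c0) has been approximated by I(c0)),
>so that
>p(x|c) = p(x|c0)*exp[-(1/2)I(c0)(c-c0)^2 + ...]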
>which, to a Bayesian, opens a door in some circumstances
>for approximating the posterior distribution of c given x
>as Normal with mean E(c|x)=c0, variance var(c|x)=1/I(c0).
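The Normal approximation described above can be tried out numerically. The sketch below is only one possible setup, and everything specific in it is an assumption rather than part of the original post: the "observation" x is taken to be n i.i.d. exponential waiting times with rate c, the prior on c is flat, and the function names are made up for the example. For this model ln p(x|c) = n ln c - c*sum(x), the maximum is at c0 = n/sum(x), and -@@ ln p(x|c0) = n/c0^2 happens to coincide with I(c0). The code compares the approximate mean and variance with the same moments computed by brute force on a grid.

```python
import math
import random


def log_likelihood(c, xs):
    """ln p(x|c) for i.i.d. exponential data with rate c (assumed model)."""
    return len(xs) * math.log(c) - c * sum(xs)


def grid_posterior_moments(xs, c0, n_points=4001):
    """Posterior mean and variance of c under a flat prior, via a uniform grid."""
    lo, hi = 0.2 * c0, 1.8 * c0  # wide enough to cover the bulk of the posterior
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    peak = log_likelihood(c0, xs)  # subtracted to avoid overflow in exp
    weights = [math.exp(log_likelihood(c, xs) - peak) for c in grid]
    total = sum(weights)
    mean = sum(w * c for w, c in zip(weights, grid)) / total
    var = sum(w * (c - mean) ** 2 for w, c in zip(weights, grid)) / total
    return mean, var


rng = random.Random(1)
xs = [rng.expovariate(1.5) for _ in range(200)]  # simulated data, true rate 1.5

c0 = len(xs) / sum(xs)        # maximizer of ln p(x|c)
info = len(xs) / c0 ** 2      # -@@ ln p(x|c) evaluated at c0
approx_mean, approx_var = c0, 1.0 / info

post_mean, post_var = grid_posterior_moments(xs, c0)
print("Normal approximation: mean", approx_mean, "variance", approx_var)
print("grid posterior:       mean", post_mean, "variance", post_var)
```

With only a handful of observations the two sets of numbers drift apart, which is the sense in which the approximation holds only "in some circumstances".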
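>Finally, under the usual regularity conditions (which give
>E[@ ln p(x|c)] = 0), one also has the identities
>E[-@@ ln p(x|c)] = var[@ ln p(x|c)] = E[(@ ln p(x|c))^2]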
>where the expectation E and variance var are wrt the
>sampling distribution p(x|c).
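These identities are easy to spot-check by simulation. The sketch below again assumes, purely for illustration, a zero-mean Normal density with variance c, for which the derivatives can be written down by hand: @ ln p(x|c) = -1/(2c) + x^2/(2c^2) and -@@ ln p(x|c) = x^2/c^3 - 1/(2c^2). The function names and the sample size are hypothetical choices, not anything from the original post.

```python
import math
import random


def score(x, c):
    """@ ln p(x|c) for a zero-mean Normal with variance c (assumed model)."""
    return -1.0 / (2.0 * c) + x * x / (2.0 * c * c)


def neg_second_derivative(x, c):
    """-@@ ln p(x|c) for the same assumed model."""
    return x * x / c ** 3 - 1.0 / (2.0 * c * c)


def check_identities(c, n_samples=200_000, seed=0):
    """Return sample estimates of E[@], E[-@@], var[@] and E[@^2]."""
    rng = random.Random(seed)
    sd = math.sqrt(c)
    xs = [rng.gauss(0.0, sd) for _ in range(n_samples)]
    scores = [score(x, c) for x in xs]
    mean_score = sum(scores) / n_samples
    mean_sq = sum(s * s for s in scores) / n_samples
    var_score = mean_sq - mean_score ** 2
    mean_neg_second = sum(neg_second_derivative(x, c) for x in xs) / n_samples
    return mean_score, mean_neg_second, var_score, mean_sq


mean_score, mean_neg_second, var_score, mean_sq = check_identities(2.0)
print("E[@ ln p]      ~", mean_score)       # should be close to 0
print("E[-@@ ln p]    ~", mean_neg_second)  # all three of these should agree
print("var[@ ln p]    ~", var_score)
print("E[(@ ln p)^2]  ~", mean_sq)
```

For c = 2 the three information estimates should all settle near 1/(2c^2) = 0.125, up to Monte Carlo noise.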
> r e s (Spam-block=XX)
>Stephen Paul King <email@example.com> wrote ...
>> firstname.lastname@example.org (Stephen Paul King) wrote:
>> I have assembled a link page on Fisher information and have a
>> definition: "The Fisher Information about a parameter \theta is
>> defined to be the expectation of the second derivative of the
>> log-likelihood."
>> But I am still needing an intuitive grasp of what it means. :)