[time 281] Re: Fisher information

Stephen Paul King (stephenk1@home.com)
Thu, 06 May 1999 14:40:19 GMT

Next message: Matti Pitkanen: "[time 282] Re: [time 281] Re: Fisher information"
Previous message: Stephen P. King: "[time 280] Do Questions Asked Define The Laws Of Physics?"
Next in thread: Matti Pitkanen: "[time 282] Re: [time 281] Re: Fisher information"

On Wed, 5 May 1999 21:37:26 -0700, " r e s" <XXrs.1@ix.netcom.com>
wrote:

>Sometimes it's hard to see the simple "intuitive meaning"
>of Fisher information in spite of (or because of?) the
>increasing mathematical interest in it.
>
>Here is a sketch in heuristic terms:
>
>Suppose that an observation x has a probability density
>parameterized by c, viz., p(x|c). (Let's look at the
>case of 1-dimensional c, and note that the following
>easily generalizes to higher dimensions.)
>
>After x is observed, one may ask "what value(s) of c
>would assign the greatest probability density p(x|c)
>to the actual x that has been observed?" This naturally
>leads one to consider the shape of p(x|c), or (as it
>turns out to be more convenient) of ln p(x|c), as a
>function of c for the fixed observed x. Thus ln p(x|c)
>is a curve over the parameter space of c, and one is
>interested in where its peaks are (i.e. the values of c
>that maximize ln p(x|c)) and in how "sharply peaked"
>the curve is. The sharper the peak of ln p(x|c), i.e. the
>greater its curvature wrt c, the greater is the
>"information about c" provided by the observation x,
>because the "most likely" values of c are thereby more
>sharply discriminated.
>
>Now the curvature (wrt c) of ln p(x|c) is -@@ ln p(x|c),
>where @ denotes derivative wrt c. This is sometimes
>called the "observed information" about c provided by the
>given observation x. The Fisher information is now the
>sampling average of this, namely I(c)=E[-@@ ln p(x|c)].
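
A minimal Python sketch of the two quantities just defined, assuming a
single Bernoulli observation x in {0, 1} with success probability c. The
Bernoulli model, the function names, and the finite-difference step h are
illustrative assumptions, not part of the post.

```python
import math

def log_p(x, c):
    """ln p(x|c) for one Bernoulli observation x in {0, 1}."""
    return x * math.log(c) + (1 - x) * math.log(1 - c)

def observed_information(x, c, h=1e-4):
    """-@@ ln p(x|c), approximated by a central second difference in c."""
    return -(log_p(x, c + h) - 2 * log_p(x, c) + log_p(x, c - h)) / h**2

def fisher_information(c):
    """Sampling average of the observed information over x ~ Bernoulli(c)."""
    return (1 - c) * observed_information(0, c) + c * observed_information(1, c)

c = 0.3
fisher_numeric = fisher_information(c)
fisher_closed_form = 1 / (c * (1 - c))  # known closed form for the Bernoulli model

print(fisher_numeric, fisher_closed_form)
```

The two printed values should agree closely. The information is smallest
at c = 0.5 and grows as c approaches 0 or 1, where a single observation
discriminates more sharply between nearby values of c.
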
>
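>It's also interesting to notice that ln p(x|c) expanded
>about a maximum at, say c0, is
>
>ln p(x|c) = ln p(x|c0) - (1/2)J(c0)(c-c0)^2 + ...
>
>where J(c0) = -@@ ln p(x|c0) is the observed information
>at c0 (the first-order term vanishes at a maximum).
>Replacing J(c0) by its sampling average I(c0) gives
>
>ln p(x|c) ~ ln p(x|c0) - (1/2)I(c0)(c-c0)^2
>or
>p(x|c) ~ p(x|c0)*exp[-(1/2)I(c0)(c-c0)^2]
>
>which, to a Bayesian, opens a door in some circumstances
>for approximating the posterior distribution of c given x
>as Normal with mean E(c|x)=c0, variance var(c|x)=1/I(c0).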
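
A rough numerical check of this Normal approximation, as a sketch under
assumptions that are not in the post: n independent Bernoulli(c) trials
with k successes and a flat prior on c, so that the exact posterior is
Beta(k+1, n-k+1). With n independent trials the information adds up, so
the approximate variance is 1/(n*I(c0)).

```python
# Assumed setup (not in the post): n Bernoulli(c) trials, k successes,
# flat prior on c, so the exact posterior is Beta(k + 1, n - k + 1).
n, k = 100, 30

c0 = k / n                           # maximizer of the log-likelihood
info_per_trial = 1 / (c0 * (1 - c0)) # Fisher information of one trial at c0
total_info = n * info_per_trial      # information adds over independent trials

# Normal approximation: mean c0, variance 1 / (total information)
approx_mean = c0
approx_var = 1 / total_info

# Exact mean and variance of the Beta(a, b) posterior
a, b = k + 1, n - k + 1
exact_mean = a / (a + b)
exact_var = a * b / ((a + b) ** 2 * (a + b + 1))

print(approx_mean, exact_mean)
print(approx_var, exact_var)
```

For these numbers the approximate and exact values agree to within a few
percent. The agreement is expected to get worse for small n or for c0
near 0 or 1, where the posterior is visibly skewed.
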
>
>BTW,
>E[-@@ ln p(x|c)] = var[@ ln p(x|c)] = E[(@ ln p(x|c))^2]
>
>where the expectation E and variance var are wrt the
>sampling distribution p(x|c).
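
The identity can be checked numerically. The sketch below assumes a
Poisson model with mean c (my choice for illustration, not from the
post), for which @ ln p(x|c) = x/c - 1 and -@@ ln p(x|c) = x/c^2. The
variance and the mean square of the score coincide because the score has
sampling mean zero. The sums are truncated at x = 199, where the Poisson
tail is negligible for this c.

```python
import math

def poisson_pmf(x, c):
    """p(x|c) for a Poisson distribution with mean c, computed via logs."""
    return math.exp(-c + x * math.log(c) - math.lgamma(x + 1))

c = 2.5
xs = range(200)  # truncation point; the remaining tail mass is negligible here

# Score @ ln p(x|c) = x/c - 1, averaged over the sampling distribution
mean_score = sum(poisson_pmf(x, c) * (x / c - 1) for x in xs)
mean_sq_score = sum(poisson_pmf(x, c) * (x / c - 1) ** 2 for x in xs)
var_score = mean_sq_score - mean_score ** 2

# Negative curvature -@@ ln p(x|c) = x/c^2, averaged the same way
mean_neg_curvature = sum(poisson_pmf(x, c) * x / c ** 2 for x in xs)

print(mean_score)                                    # close to 0
print(mean_neg_curvature, var_score, mean_sq_score)  # all close to 1/c
```
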
>
>--
> r e s (Spam-block=XX)
>
>
>Stephen Paul King <stephenk1@home.com> wrote ...
>> stephenk1@home.com (Stephen Paul King) wrote:
>[...]
>> I have assembled a link page on Fisher information and have a
>> definition: "The Fisher Information about a parameter \theta is
>> defined to be the expectation of the second derivative of the
>> loglikelihood."
>> http://members.home.net/stephenk1/Outlaw/fisherinfo.html
>> But I still need an intuitive grasp of what it means. :)
>
>
>


This archive was generated by hypermail 2.0b3 on Sun Oct 17 1999 - 22:10:30 JST