[time 293] Re: [time 291] Re: Expressiveness of interactive computing


Peter Wegner (pw@cs.brown.edu)
Sun, 9 May 1999 12:31:34 -0400 (EDT)


Stephen
        My comments on your comments in the text below are indicated by **.
        It is amazing that you have immediately understood exactly what I was trying to say.
        peter

>From stephenk1@home.com Sun May 9 11:03 EDT 1999
Date: Sun, 09 May 1999 11:04:39 -0400
From: "Stephen P. King" <stephenk1@home.com>
Organization: OutLaw Scientific
X-Mailer: Mozilla 4.5 [en]C-AtHome0405 (Win95; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Peter Wegner <pw@cs.brown.edu>
CC: time@kitada.com
Subject: Re: [time 291] Re: Expressiveness of interactive computing
Content-Transfer-Encoding: 7bit

Dear Peter,

        This result goes toward my claim that Local Systems interactively
construct their space-times! :) I have some clarification questions and
comments.

Peter Wegner wrote:
>
> Here is a discussion of expressiveness of interactive computing which you may wish to comment on.
> The idea of defining expressiveness of finite computing agents relative to its observers (environments) rather than in terms of an absolute notion of transformation power captures the idea that expressiveness is meaningful only if it can be exercised in its environment.
> peter
>
> Expressiveness of Interactive Computing Agents
>
> The expressiveness of finite computing agents is a relative notion, defined relative to an environment. It extends the algorithmic notion of expressiveness, defined in terms of transformation power, to a notion that measures their ability to distiguish their environments. The extended notion reduces to the algorithmic notion for environments that are TMs, but differs for computing agents whose environments are SIMs or the physical world W. In particular, the extended notion is applicable to computing agents that aim to model the physical world.
>
> Given two finite agents M1, M2, their behavior BX(M1), BX(M2) is distinguishable relative to an environment (observer, tester) X if the exclusive-or BX(M1) XOR BX(M2) is nonempty. Members of this set are called distinguishability certificates. Finite agents are equivalent for X if their behavior cannot be distinguished by a distinguishability certificate. Agents equivalent relative to a restricted environment X1 may be distinguishable relative to a more complex environment X2, just as for people.

        What is the measure of the information contained in the
distinguishability certificates (DC), e.g. how do we quantitatively
represent the information content of DCs?

** Information content depends on particular processes of observation, though
the length of DCs has information measure log(k).
For sequential observers (SIMs) observations are I/O streams (i1,o1), (i2,o2), ..., where ik may depend on previous outputs and on unpredictable temporal events. Because information can include events that occur during the process of computation, more information is available than for algorithms, and more distinctions can be made. Additional information is particularly useful if it allows us to make local decisions that remain globally valid (see later discussion). For example, driving is an example where this is ususally, though not universally true, and evolution is an example of a natural process where this occurs.
        Let me know if you have more thoughts about information measures.
 
> Expressiveness of finite computing agents, formally defined by observation equivalence, measures their ability to make observational distinctions about their environment or equivalently the ability of external observers to distinguish agent behavior [WG].
>
> Observation equivalence: A class CS of systems and a class CE of environments determines an observation-equivalence relation E(CS, CE) such that systems S, S' in CS are equivalent iff they are indistinguishable for all e in CE. Environments e, e' in CE are equivalent if they are indistinguishable for all S in CS.

        Is there a limit on the cardinality of the CS and CS classes?

** The classes CS and CE can be nonenumerable, having the cardinality of the reals. In particular if CE is the real world environment W, the number of possible worlds is nonenumerable. The fact that the real world has the cardinality of the real numbers is clearly something that was understood by mathematicians when the term "real numbers" was introduced.
 
> Machines induce an equivalence relation on an environment class CE such that machine M1 is more expressive than M2 if the equivalence relation E(M1, CE) induced by M1 on CE is strictly finer than E(M2, CE). Greater expressiveness of machines implies greater distinguishability among environments, which in turn means that more expressive machines can solve larger classes of computational problems.

        Could we extend this into discussions of P=NP, in terms of quantifying
the ratio of expressiveness of a given pair of IMs?

** We can quantify complexity but I am not sure what it means to quantify expressiveness. The following discussion from my paper, which I did not include in my previous message, introduces the notion of competitive ratio, and discusses its limitations as a measure of on-line relative to off-line expressiveness. the book by Borodin and El Yaniv [BE, Cambridge 1999] is worth getting.
[quote from my paper] It is instructive to examine the relation between our notion of interactive computation and on-line algorithms that arise in competitive analysis, which measure algorithm performance relative to best possible performance of an algorithm that has complete knowledge [BE]. The extra cost incurred by on-line algorithms in discovering knowledge about their environment already known to off-line algorithms is measured by the competitive ratio, which compares optimal off-line cost when knowledge is complete with on-line cost. However, such comparisons assume that an off-line algorithm with complete knowledge exists - an assumption that does not hold for inherently interactive problems for which complete knowledge requires knowledge of the future. Interactive problems generally do not have a perfect information model that can serve as a baseline for competitive analysis, and cannot be expressed as combinatorial optimization problems whose solutions are expressible!
 as functions from integers to integers.
The result that interaction reduces complexity at first appears to conflict with the results of competitive analysis, where on-line computations increase complexity compared with complete information off-line computation. However, a careful analysis shows that models of computation are fundamentally different, and that delay of binding time of interactive decisions decreases complexity when interactive information provided by nature provides a clue to the next action. Though such interactive computations converge to local rather than global optima, they are effective in natural processes like evolution and protein folding, as well as in processes like driving, in providing an adequate basis for interactive decision making.
[end quote]

> Environments of computing agents can be machines or the physical world. The following environment classes for finite computing agents that are progressively more demanding in specifying progressively larger classes of interactions that agents can perform:
>
> the class of Turing machines TM (interaction specified by I/O pairs)
> the class of SIMs (interaction specified by I/O streams)
> the class of MIMs (interaction specified by multiple I/O streams)
> the physical world W (interaction at least as demanding as MIMs)
>
> Behavior of agents in a TM environment is expessed by input-output pairs, while behavior in a SIM environment is expressed by sequences of observations (I/O streams) that allow more finely-grained distinctions among behaviors than single observations. Nonequivalence of two SIMs can be detected by a finite distinguishability certificate (observation sequence) but equivalence cannot be finitely detected.

        I see this last as an expression of how a given theory can only be
falsified, never proven. ;)
 
** Absolutely right. This relates to Karl Poppers book on Scientific Discovery (I was a colleague of Karl Popper's at the London School of Economics for 3 years from 1961 to 1964).

> Actually observed behavior BX(S) of a system S and an environment (observer) X is the intersection of system and observer behavior. Sequential interaction can be modeled as a producer/consumer relation, where X is a producer of behavior, S is a consumer of behavior, and the observed behavior BX(S) is the smaller of produced and consumed behavior. If observers or testers of an environment are TMs, then observed behaviors of any class of systems, such as interactive computing agents or the real world, will be limited to TMs.
> When the environment with which a finite agent interacts is a TM, then behavior richer than a TM cannot be detected, just as the intelligence of a human agent cannot be detected if the agent is never given tasks that require intelligence.

        This seems to speak to the limitations on observables given an
experimental apparatus. We see in QM that "what can be seen depends on
how one looks".

** Extending finite agents with observation tools like microsocopes, telescopes, and infra-red sensors is the dual of extending the richeness of environments so that agents can realize their full potential. You cannot tell who will be good at a task unless you test them in the actual job, and you cannot tell who your true friends are until you test them under conditions of adversity or extreme conditions.
 
> Since computational models aim to express properties of real-world environments, W is the most demanding of the environments considered here. Interaction machines that interact with the real world W, have the property that E(SIM, W) is a finer equivalence relation on the real world that E(TM, W). SIMs permit a finer granularity equivalence relation for real-world environments than TMs and MIMs permit a finer granularity relation than SIMs. Agents that model behavior of real-world environments W impose an expressiveness hierarchy such that TMs, SIMs, and MIMs are progressively more expressive finite agents, able to make progressively finer distinctions about the world and solve larger classes of problems.
>
> BW(TMs) < BW(SIMs) < BW(MIMs)
>
> However, for environments that are TMs, the equivalence relation induced by SIMs is no finer than that induced by TMs, while for SIM environments, MIMs are no more expressive than SIMs. To observe a given level of expressiveness of a finite agent it is necessary to have a testing environment with at least that expressiveness. Failure to observe that computers have greater expressive power than TMs was due in large measure to the fact that observation environments (testing environments) were restricted to be TMs.

        We see this occurring in physics in the assumptions of unique
initiality and/or finality of time for all E(CS, CE) since they assume
that the *complete* class of CS has as a *unique* support a *unique*
class of CE and that such is a priori!

** I am not familiar with the models to which you refer and cannot therefore understand this point. Maybe you can refer me to a reference.
 
> Let SIMk be the class of SIMs restricted to no more than k interactions. We can show that:
>
> For all k>0, BW(SIMk) < BW(SIMk+1); and BW(TM) = BW(SIM1)
>
> The distinctions that can be made for real world environments, or environments with at least the expressive > power of SIMs, is greater for SIMk+1 than for SIMk, and TMs are at the bottom of this hierarchy. The behaviors BW(SIMk) of SIMs that can make k observations determine an infinite expression hierarchy with TMs at the bottom (corresponding to behavior BW(SIM1)).
>
> This may be proved by example. Consider an environment of SIMs with buffers of length k that buffer the output of their first k-1 interactions and disgorge their first output only on the kth interaction. For this environment, machines in the class SIMk can distinguish machines with buffers of length k by distinguishability certificates of length k, but cannot distinguish maong machines with bufers of length k+1. As a second example, consider an environment of machines Mk with binary inputs that print a 1 whenever the last k interactive inputs are 1 and a 0 otherwise. The shortest distinguishability certificate between Mk+! and Mk has length k+1. Note that SIMk must have a memory of at least size log(k) to track distinguishability certificates of length k, so that the state for the classes SIMk is not bounded.

        This last gets to my question of the information content of DCs. :) How
would we generically model the "memory" of a SIMk and a MIMk (if there
is such a thing ;) )? Can we speak to thermodynamic and Shannon entropy
in terms of memory "size"?

** Maybe my earlier discussion of the nature of "stream" information can help
you answer this question.
 
> SIMs model interactive question answering. The expressiveness hierarchy for SIMs shows that on-line questioning that makes use of follow-up questioning has greater expressiveness than off-line questioning. Ken Starr can obtain more information about questioned subjects (Clinton) by interactive than noninteractive questioning because later questions can use information not available to noninteractive questioners. Strategies that permit k+1 interactive questions can elicit more information (about Clinton) than k questions, assuming that Clinton has at least the expressive power of a SIM. The strategy used by the Senate of asking a list of 81 predetermined questions was predictably ineffective because it is an off-line strategy that corresponds to asking a single complex question with many parts.

        Is it clear that interactive question answering is such that it can't
be assumed to be an a priori observable? This, IMHO, speaks to how a
branching time formalism allows us to model a flowing time.
 
** To formally prove that interactive Q/A is more expressive we need to express the problem in terms of SIMs and use the "proof by example" given earlier to show that if both Starr and Clinton can be modeled by SIMs, Starr will be able to make distinctions about Clinton not possible by noninteractive questioning.

> Interaction during processes like driving, where looking out of the window can elicit information not available prior to the start of the journey, not only increases driving efficiency but allows a larger class of problems to be solved. When local computation of the next step of an interactive process is globally the correct thing to do, as is usually the case in driving, then NP-complete algorithmic tasks can be reduced to interactive tasks of polynomial (usually linear) complexity [We2].
> However, interaction not only dramatically reduces the complexity for this class of problems but increases the class of problems that can be solved to include temporal domains where complete information is not available prior to the computation. For domains determined by a static map, the problem of driving from A to B can be algorithmically specified but is more simply specified interactively, while dynamic domains with traffic conditions not specified in advance cannot be algorithmically specified, even in principle, and require interactive specification.
>
> Problems like protein folding are NP-complete as combinatorial optimization problems, but nature finds a way of naturally folding DNA strands so they always fold in the same way [Pa2]. We hypothesize that nature does not solve such problem algorithmically, but that it folds DNA strands by a robust interactive process having the property that local small interactive steps yield the same folded configuration for a large (robust) collection of initial configurations. Dynamic programming is an example of a process that yields global optima by sequences of local optimization steps. Its use as a practical model for learning reflects the fact that nature "prefers" processes, like evolution, whose local steps yield stable local optima.

        This is remarkably similar to that I am trying to communicate about
with regards to LSs; that the interactions/communications among LSs
constructs space-times. These are the mutually perceivable environments
within which observers (LSs) frame their perceptions. Al Martel and I
spoke yesterday afternoon about how the increase in knowledge content of
an LS may be related to the perception of an expanding space-time... And
perhaps cross-talk noise is the cause for the background black body
radiation...
        Thank you for this result! :)

** Thanks for appreciating these comments. This encourages me to think they may be appreciated by the larger scholarly community.

Kindest regards,

Stephen



This archive was generated by hypermail 2.0b3 on Sun Oct 17 1999 - 22:10:31 JST