22C3 - 2.2
22nd Chaos Communication Congress
Private Investigations
Speakers | |
---|---|
Victor Eliashberg |
Schedule | |
---|---|
Day | 1 |
Room | Saal 1 |
Start | 21:00 |
Duration | 01:00 |
Info | |
ID | 464 |
Event type | Lecture |
Track | Science |
Language | English |
On working memory and mental imagery
How does the brain learn to think?
A representation of an untrained human brain, call it B(0), is encoded in the human genome, so its size can hardly exceed a few megabytes. In contrast, a representation of a trained brain, B(t), after a sufficiently long time t (say t = 20 years) must be very long (terabytes?), because it must include a representation of the brain's individual experience. How can a "simple" B(0) change into an extremely complex B(t) in the course of learning?
Consider a cognitive system (W,D,B), where W is an external world, D is a set of human-like sensory and motor devices, and B is a control system simulating the work of the human nervous system (for simplicity, B will be referred to as the brain). The system (D,B) can be thought of as a human-like robot. Let us divide (W,D,B) into two subsystems: the brain, B, and the subsystem (W,D), which is the external world, W, as it appears to the brain via the devices D. In this representation, both subsystems can be treated as abstract "machines," the inputs of B being the outputs of (W,D), and vice versa. Let B(t) denote the state of B at moment t, where t=0 is the beginning of learning. The talk promotes the following general propositions:
There must exist a relatively short formal representation of B(0). This representation is encoded, in some form, in the human genome and can be short enough to fit on a single floppy disk (megabytes). Any formal representation of B(t) for large t, say t = 20 years, must be very long (terabytes), because B(t) must include a representation of a very large individual experience.
Let B(t)=(H(t),S(t)), where H(t) is a representation of the “brain hardware” (e.g., in the form of a neural network model), and S(t) is a representation of the “brain software” (e.g., in the form of a set of synaptic gains). The hardware H(t) stays close to H(0); the main difference is between the initial software S(0) and the software S(t) created in the course of learning.
The right methodology should be directed at reverse engineering B(0)=(H(0), S(0)). It is practically impossible to find and understand S(t) without first finding S(0) and understanding the process of learning that transforms S(0) into S(t). To find B(0) one needs to rely on a combination of psychological and neurobiological data. Ignoring psychological data leads to so-called "mindless brains," whereas ignoring neurobiological data leads to so-called "brainless minds." Traditional artificial neural network (ANN) and artificial intelligence (AI) research has fallen prey to this methodological pitfall. To make significant progress in reverse engineering (hacking!) B(0) and, consequently, in simulating and understanding a broad range of nontrivial cognitive phenomena in the system (W,D,B(t)), it is critically important to develop a unified, integrated approach to brain modeling and cognitive modeling!
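The decomposition described above, two abstract machines that feed each other, with the brain split into fixed hardware and learned software, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration only: the class names `Brain` and `World`, the lookup-table "software," the one-dimensional world, and the `teacher` function are hypothetical and are not the speaker's model.

```python
from dataclasses import dataclass, field


@dataclass
class Brain:
    """B(t) = (H, S): a fixed 'hardware' label plus learned 'software'."""
    hardware: str = "lookup-table controller"       # H: does not change here
    software: dict = field(default_factory=dict)    # S(t): grows with experience

    def step(self, sensed, teacher_action=None):
        # Learning: store the (sensed input -> action) association in S.
        if teacher_action is not None:
            self.software[sensed] = teacher_action
        # Acting: look up the stored action, or do nothing if none is known.
        return self.software.get(sensed, "noop")


@dataclass
class World:
    """(W, D): the external world as the brain sees it through its devices."""
    position: int = 0

    def step(self, action):
        # Motor input changes the world; the return value is the sensory output.
        if action == "right":
            self.position += 1
        elif action == "left":
            self.position -= 1
        return "at_goal" if self.position == 3 else "not_at_goal"


def run(brain, world, steps, teacher=None):
    """Couple the two machines: outputs of one are inputs of the other."""
    sensed = world.step("noop")
    for _ in range(steps):
        hint = teacher(sensed) if teacher else None
        action = brain.step(sensed, hint)
        sensed = world.step(action)
    return world.position
```

In this sketch S(0) is the empty table, and the only thing that changes during training is `brain.software`. A realistic S(t) would of course be vastly larger and would not be a simple lookup table.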
The talk discusses the following fundamental problems that must be addressed by the above unified integrated approach:
1. What are working memory and mental imagery? How can our brain learn to imagine a process of writing and erasing symbols on a sheet of paper, or to move chess pieces on an imaginary chess board?
2. Importantly, the behavior from item 1 requires the highest general level of computing power (Chomsky’s type 0). How can a neural network model learn to perform behavior of type 0? It is easy to show that the error minimization learning algorithms employed in traditional neural network models cannot answer this question. (These algorithms cannot be used to learn behavior higher than type 3!)
3. An experienced chess player can mentally play a combinatorial number of different chess games. At the same time, he or she can recall the real chess games he or she has played. How can our brain combine these two properties?
4. The problem of pattern recognition is traditionally treated as a problem of optimal classification. This general approach was called into question by the neurophysiologist Zopf Jr. (1962) in his paper entitled "Attitude and Context." (The paper was largely ignored!) Zopf argued that, in the case of the human brain, there is no such thing as an optimal context-independent classification. The fact is that we can treat a given object as a member of a combinatorial number of different classes depending on our attitude (mental set). What is mental set? How can a computing system with a linearly growing size of knowledge (software) dynamically reconfigure this knowledge to match a combinatorial number of different contexts?
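To make items 1 and 2 concrete, here is a minimal sketch of the standard textbook picture behind them: a finite control that can write and erase symbols on an unbounded "sheet of paper" (a tape) is a Turing machine, the machine class associated with Chomsky type 0. The example rule table recognizes strings of the form a^n b^n, a language that no finite-state (type 3) device can recognize. The function `run_machine`, the rule table `ANBN_RULES`, and the state names are hypothetical and chosen for illustration; this is not the speaker's neural model, and it says nothing about how such rules could be learned.

```python
def run_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a finite control (rules) over an unbounded read/write tape.

    rules maps (state, symbol) -> (next_state, symbol_to_write, move),
    where move is "L" or "R". Returns True on accept, False on reject
    or when no rule applies.
    """
    cells = dict(enumerate(tape))   # sparse tape: the "sheet of paper"
    head = 0
    for _ in range(max_steps):
        if state in ("accept", "reject"):
            return state == "accept"
        symbol = cells.get(head, blank)
        if (state, symbol) not in rules:
            return False            # no applicable rule: reject
        state, write, move = rules[(state, symbol)]
        cells[head] = write         # write (or erase) a symbol
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit reached")


# Rule table for a^n b^n: repeatedly mark one 'a' as X and one 'b' as Y.
ANBN_RULES = {
    ("start", "a"): ("find_b", "X", "R"),   # mark the next unmatched a
    ("start", "Y"): ("check", "Y", "R"),    # no a's left: verify the rest
    ("start", "_"): ("accept", "_", "R"),   # empty input is accepted
    ("find_b", "a"): ("find_b", "a", "R"),
    ("find_b", "Y"): ("find_b", "Y", "R"),
    ("find_b", "b"): ("back", "Y", "L"),    # mark the matching b
    ("back", "a"): ("back", "a", "L"),
    ("back", "Y"): ("back", "Y", "L"),
    ("back", "X"): ("start", "X", "R"),     # return to the first unmarked cell
    ("check", "Y"): ("check", "Y", "R"),
    ("check", "_"): ("accept", "_", "R"),   # only marked b's remained
}
```

For example, `run_machine(ANBN_RULES, "aabb")` accepts and `run_machine(ANBN_RULES, "aab")` rejects. The control table itself is finite; the extra power comes entirely from the external, rewritable tape, which is the role the text assigns to the real or imagined sheet of paper.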