This series begins with The Machine in the Ghost
Back to Roughing Out a Theory of Brain || On to The Brain and Reality
The benefits and risks of abstraction
Individual nerve cells are the building blocks of nervous systems (Hypothesis -1), so to build an artificial nervous system, we need an artificial neuron. There are many different kinds of biological neurons, perhaps even more than we know about. They are all very complex in structure, and the way they work is highly nonlinear. Moreover, each neuron in a network is ultimately unique. To make a theoretical model of a neuron that can generate functional models that we can actually construct, we must examine the structure and function of natural neurons in as much detail as technology allows and then eliminate or blur certain details to get a much simpler and more understandable description that still captures the essential features and functions. This kind of abstraction always means leaving something out, and the success of the model depends on making good decisions about what can be ignored or simplified and what must remain just as it is. The effects of ignoring or modifying details can range from insignificant to fatal.
Is an artificial neuron a real neuron?
That a neuron model can be more or less successful might make us wonder whether artificial neurons can ever be 'real' neurons and, by extension, whether networks of artificial neurons can be real neural networks and artificial intelligences can be real intelligences. 'Artificial' has a strong connotation of 'not real' in modern usage. An artificial plant is not a real plant and an artificial leg is not a real leg. A description of a thing is not the thing, nor is a simulation of a thing the thing. We will escape the mire that a philosophical answer to this question presents by taking a pragmatic, functionalist stance: the proof of the pudding is in the eating.
The value of a theoretical model is its ability to explain and predict, and one good way to test that value is to implement the theoretical model as a functional model and observe its behavior. Implementing a functional neuron model on a general-purpose computer or other non-biological hardware may introduce practical constraints that force even further simplification. In developing a theory by abstraction of biological structures and testing it in functional 'artificial' implementations, it is therefore essential to make all abstraction decisions explicit and revisit them when things go wrong.
Neuron complexity
Biological neurons are not just simple, identical components in a complex network of networks. Each neuron is a complete electro-chemical analog computing system that takes inputs, computes specific mathematical functions with them, and outputs a result. Neurons are classed into many different types according to their shapes, functions and what they connect to. Furthermore, the neurons within types are individually as unique as snowflakes.
A single neuron can have as many as 150,000 inputs and outputs or as few as a few hundred. That is hard to visualize, but consider that you have fewer than 150,000 hairs on your head, even with a full head of hair. Each input to a neuron in a network can have a stronger or weaker effect on the neuron than other inputs for various reasons, and the strength of each input connection can vary over time. That is in fact the basis for neural adaptation and learning. The effect of an input can also be either positive or negative (excitatory or inhibitory in neurological terms). The nerve cell body processes all the input and generates an output based on the result. The output is generally a complex series of voltage spikes, a staccato burst that travels down the neuron's axon to reach many other neurons.
There is so much data on the structure and electrochemistry of neurons, down to the molecular level, that a very fine-grained model of a single cell can be constructed. In fact, there is an impressively ambitious project, the Blue Brain Project, to create a functional model of a 10,000-neuron unit of the cerebral cortex at such a low level. The problem with such fine-grained computational models is the immense amount of processing power and memory they need. The IBM Blue Gene supercomputing system used in the Blue Brain Project has over 8,000 processors and generates hundreds of gigabits of data in even short runs. Scale that up to the 100 billion or so neurons that we have in a human brain and the problem becomes obvious. The only solution is to find a compromise between abstraction and detail that makes implementation practical and provides acceptable performance of the functional model. That is the challenge we face.
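To make the scaling problem concrete, here is a back-of-the-envelope sketch using only the figures quoted above. The function name is hypothetical, and the assumption that processor count grows linearly with neuron count is a deliberate oversimplification made only for illustration. It ignores the cost of the connections between neurons, so the real requirement would probably be higher.

```python
def naive_processor_estimate(target_neurons, modeled_neurons=10_000, processors=8_000):
    """Scale the quoted hardware linearly with neuron count (a crude assumption)."""
    scale_factor = target_neurons // modeled_neurons
    return scale_factor, scale_factor * processors

# Roughly 100 billion neurons in a human brain, as stated in the text.
scale, needed = naive_processor_estimate(100_000_000_000)
print(f"{scale:,} times more neurons, so about {needed:,} processors")
```

Even under this optimistic assumption, the estimate runs to tens of billions of processors.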
A generic computational neuron model
A basic element of our theory is a highly abstract model of the very complex chain of events that unfolds between the input to a neuron and the neuron's output. We will reduce that process down to three stages.
- The input stage describes transfer of signals from sensory cells or other neurons through dendrite synapses. Generally speaking, each synapse can amplify or attenuate the signal coming into it, so the inputs are each multiplied by a weight value. This weighting captures all of the effective variation over the individual inputs up to the integration stage.
- The integration stage describes how the cell body integrates the signals to produce a value that is used to determine whether or not the neuron sends its output signal on to other neurons ('fires').
- The output stage takes the result of the integration stage and generates the neuron's output to other neurons or to effectors.
Each stage of processing is represented as a mathematical function of the stage that comes before it (Fig. 2).
Figure 2. Abstraction of the Neuron
This generic model allows us to represent any kind of neuron and any individual neuron with three computable mathematical expressions. Despite the extreme simplicity of this abstraction, it can still express an arbitrary degree of detail, because the mathematical function at each stage can be made as simple as desired or as complex as necessary.
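The three-stage abstraction can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not a definitive implementation: the names `make_neuron`, `weighted_inputs`, `summation` and `step` are hypothetical, and the particular functions chosen for each stage (multiplication by weights, plain summation, a hard threshold) are just the simplest options.

```python
def make_neuron(input_fn, integrate_fn, output_fn):
    """Compose the three stages into one neuron: each stage is a function of the previous one."""
    def neuron(signals, weights):
        weighted = input_fn(signals, weights)   # input stage
        activation = integrate_fn(weighted)     # integration stage
        return output_fn(activation)            # output stage
    return neuron

def weighted_inputs(signals, weights):
    """Input stage: each synapse amplifies or attenuates its signal."""
    return [s * w for s, w in zip(signals, weights)]

def summation(weighted):
    """Integration stage: here, a plain sum of the weighted inputs."""
    return sum(weighted)

def step(threshold):
    """Output stage: fire (1) if the integrated value reaches the threshold, else stay silent (0)."""
    return lambda activation: 1 if activation >= threshold else 0

# Two excitatory synapses and one inhibitory synapse (illustrative weights).
neuron = make_neuron(weighted_inputs, summation, step(1.0))
print(neuron([1, 1, 0], [0.6, 0.6, -1.0]))  # inhibitory input silent
print(neuron([1, 1, 1], [0.6, 0.6, -1.0]))  # inhibitory input active
```

Because each stage is passed in as a function, any of the three can be swapped for something more detailed without touching the other two, which is the point of the generic model.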
The many specific neuron models that have been proposed over the decades fall into two categories according to their purpose. The most highly detailed models are based on the conductance properties of the membranes of individual neurons and are generally grounded in actual measurements. Their purpose is to understand and describe the dynamic behavior of single neurons. The other category comprises the formal models, which abstract the workings of individual neurons to mathematical functions that are simple enough to allow simulation of very large networks of neurons. Without going into a lot of detail, let's see how a few important classes of existing models at different levels of description fit into our generic model.
Hodgkin-Huxley model
This seminal model was developed empirically from measurements made on the giant axon of the squid and was published in 1952 (Hodgkin and Huxley, "A quantitative description of membrane current and its application to conduction and excitation in nerve," J. Physiol. 117). It is important as the first model to use the flow of ions across the cell membrane as the basis for equations that accurately relate the input to a nerve cell to its output, and it is most simply represented by a schematic electric circuit diagram. The original formulation involves four coupled variables (the membrane potential and three gating variables), and computing it is expensive. While it is a powerful model for the behavior of individual neurons of well-defined physical structure, it is not suitable for computational modeling of any but the smallest and simplest neural networks.
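For reference, the model is usually written in the following textbook form. The symbols are the conventional ones and do not come from this article: V is the membrane potential, C_m the membrane capacitance, I_ext the injected current, m, h and n the gating variables, and the g and E terms the maximal conductances and reversal potentials of the sodium, potassium and leak currents.

```latex
\begin{aligned}
C_m \frac{dV}{dt} &= I_{\text{ext}}
  - \bar{g}_{\text{Na}}\, m^{3} h \,(V - E_{\text{Na}})
  - \bar{g}_{\text{K}}\, n^{4} (V - E_{\text{K}})
  - \bar{g}_{L}\,(V - E_{L}) \\
\frac{dx}{dt} &= \alpha_x(V)\,(1 - x) - \beta_x(V)\, x,
  \qquad x \in \{m, h, n\}
\end{aligned}
```

The voltage-dependent rate functions α and β were fitted to the measurements. Evaluating them at every time step for every neuron is a large part of what makes the model costly to simulate.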
Integrate and fire models
These models are abstracted to a higher level than the Hodgkin-Huxley neuron. They are based on a simple electrical circuit model that was first proposed at the beginning of the 20th century by the physiologist Louis Lapicque (English translation: "Quantitative investigations of electrical nerve excitation treated as polarization," Biol. Cybern. 2007 Dec. 97(5-6):341-9). The major contribution of this model is to present the neuron as integrating its many inputs over time and firing off a signal when a threshold is reached. These models are simple enough to serve as the basis for computing the behavior of medium to large artificial neural networks.
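As an illustration, here is a minimal sketch of one common member of this family, the leaky integrate-and-fire neuron, stepped forward with simple Euler integration. The function name and all parameter values (time constant, threshold, reset value and so on) are arbitrary assumptions chosen for readability, not values taken from Lapicque or from any particular neuron.

```python
def simulate_lif(current, dt=0.1, tau=10.0, v_rest=0.0,
                 v_threshold=1.0, v_reset=0.0, resistance=1.0):
    """Return spike times (same units as dt) for a sequence of input currents."""
    v = v_rest
    spike_times = []
    for step, i_in in enumerate(current):
        # The membrane potential leaks back toward rest while the input drives it up.
        dv = (-(v - v_rest) + resistance * i_in) / tau
        v += dv * dt
        if v >= v_threshold:
            # Threshold reached: record a spike and reset the potential.
            spike_times.append(step * dt)
            v = v_reset
    return spike_times

strong = simulate_lif([2.0] * 1000)  # input strong enough to reach threshold
weak = simulate_lif([0.5] * 1000)    # input that saturates below threshold
print(len(strong), "spikes with strong input;", len(weak), "with weak input")
```

There is one state variable per neuron and only a handful of arithmetic operations per time step, which is why models of this kind scale to large networks.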
McCulloch-Pitts model
This model was published in 1943 and is the progenitor of all early work in computed neural network simulations (McCulloch and Pitts, "A logical calculus of the ideas immanent in nervous activity," Bulletin of Mathematical Biophysics, 5:115-133). McCulloch, a neuroscientist, and Pitts, a logician, developed this model neuron in an attempt to understand how the brain can produce complex behavior through the cooperation of simple elements (neurons). This model, also called a threshold logic unit, presents the neuron as a binary logical device whose output is all or nothing (1 or 0, true or false, etc.). This formalism began a branch in the path of neural network research that eventually came to be called connectionism.
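A threshold logic unit is simple enough to write out directly. The sketch below assumes the usual reading of the 1943 formalism, in which inputs are binary, excitatory inputs are counted against a threshold, and any active inhibitory input vetoes firing. The function names are hypothetical.

```python
def mcp_unit(excitatory, inhibitory, threshold):
    """All-or-nothing unit: returns 1 (fire) or 0 (silent)."""
    if any(inhibitory):
        return 0  # absolute inhibition: one active inhibitory input blocks firing
    return 1 if sum(excitatory) >= threshold else 0

# Basic logic gates built from single units.
def logic_and(a, b):
    return mcp_unit([a, b], [], threshold=2)

def logic_or(a, b):
    return mcp_unit([a, b], [], threshold=1)

def logic_not(a):
    return mcp_unit([], [a], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", logic_and(a, b), "OR:", logic_or(a, b))
print("NOT 0:", logic_not(0), "NOT 1:", logic_not(1))
```

Since AND, OR and NOT can each be built from one unit, networks of such units can express any logical function, which is what made the model so attractive as a bridge between neurons and logic.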
Minsky and Papert demonstrated in their famous book that simple networks of such threshold units, the single-layer perceptrons, are theoretically unable to solve certain classes of interesting problems (Perceptrons: An Introduction to Computational Geometry, by Marvin Minsky and Seymour Papert. MIT Press, Cambridge, Mass., 1969). That put connectionism on a sidetrack and arguably stunted the growth of the general field of artificial intelligence for decades. It was perhaps also responsible for a shift in connectionist focus to theoretical problem solving and a willingness to sacrifice biological fidelity. Connectionist thinking is characteristically mathematical and doesn't look much to biophysics for insight.
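The textbook illustration of such a limitation is XOR, the two-input case of the parity problem: no single weighted threshold unit can compute it, although a small two-layer arrangement can. The sketch below is an illustration rather than a proof. It searches a coarse, arbitrarily chosen grid of weights and thresholds, and all names in it are hypothetical.

```python
from itertools import product

def threshold_unit(inputs, weights, threshold):
    """Weighted threshold unit: fires when the weighted sum reaches the threshold."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= threshold else 0

def single_unit_solutions(target, grid):
    """List every (w1, w2, threshold) on the grid that reproduces target on all four inputs."""
    cases = list(product([0, 1], repeat=2))
    return [
        (w1, w2, t)
        for w1, w2, t in product(grid, repeat=3)
        if all(threshold_unit(c, (w1, w2), t) == target(*c) for c in cases)
    ]

grid = [x / 2 for x in range(-4, 5)]  # -2.0 to 2.0 in steps of 0.5 (arbitrary)
and_solutions = single_unit_solutions(lambda a, b: a & b, grid)
xor_solutions = single_unit_solutions(lambda a, b: a ^ b, grid)
print(len(and_solutions), "single-unit settings compute AND;", len(xor_solutions), "compute XOR")

def xor_two_layer(a, b):
    """XOR from three units: (a OR b) AND NOT (a AND b)."""
    h_or = threshold_unit((a, b), (1, 1), 1)
    h_and = threshold_unit((a, b), (1, 1), 2)
    return threshold_unit((h_or, h_and), (1, -1), 1)
```

The search finds settings for AND but none for XOR, while `xor_two_layer` gets all four cases right by adding an intermediate layer.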
LEABRA
Leabra is an algorithm for computational cognitive neuroscience. The name is an acronym for the principles that define its computational framework: Local, Error-driven and Associative, Biologically Realistic Algorithm. This is a key project that pursues computational neuroscience at higher levels of cognition while maintaining contact with solid biophysical ground. It tackles two important and longstanding problems. One is achieving an error-driven learning mechanism with biologically plausible computations, and the other is achieving harmony between competitive network structures and distributed representation. Leabra is not designed to encompass the working of the entire brain. Like many computational approaches, it focuses on the regular structures found in the cortex, where perhaps nearly all of the massively parallel processing is done (Computational Explorations in Cognitive Neuroscience, O'Reilly and Munakata, MIT Press, 2000).
Our own investigations concern the entire brain from the beginning. We will always work with a complete organism and its interaction with a real environment in a full survival loop as described above. We will feel free to use any and all modeling frameworks that are able to solve our engineering problems and are compatible with our developing theory. As a default starting point, however, we will use Leabra and add, subtract or modify as failure demands. We will try to incorporate genetic mechanisms to guide development.