
Common mechanisms for learning (past), navigation (present) and dreams (future?)

Sorry for the brief(?) hiatus. I have left my day job to start a venture and so am a bit preoccupied. Hopefully, the mouse trap should benefit from the new arrangements.
Today I would like to highlight a recent MIT study that once again shows that the same brain mechanisms are used for envisaging the future as for reminiscing about the past. The study was performed on rats and found that the rats replayed their daytime navigational memories while they were dreaming. This in itself is not news and has been known for a long time; what the researchers additionally found is that the rats also replayed navigational memories/alternatives in their heads at a faster rate, apparently to think and plan ahead. This use of replayed traces to think ahead is, to me, very important and cements the role of the default network in remembering the past and envisaging the future.

When a rat moves through a maze, certain neurons called “place cells,” which respond to the animal’s physical environment, fire in patterns and sequences unique to different locations. By looking at the patterns of firing cells, researchers can tell which part of the maze the animal is running.

While the rat is awake but standing still in the maze, its neurons fire in the same pattern of activity that occurred while it was running. The mental replay of sequences of the animals’ experience occurs in both forward and reverse time order.
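As an illustration (my own, not from the study), the decoding logic described above can be sketched in a few lines, with hypothetical cell names and place-field assignments:

```python
# Hypothetical sketch: decoding a rat's maze position from place-cell
# firing, assuming each cell has one known place field. (Real decoders
# use probabilistic population methods; this is only the core idea.)

PLACE_FIELDS = {            # cell id -> maze location it fires for
    "cell_A": "start_arm",
    "cell_B": "choice_point",
    "cell_C": "left_arm",
    "cell_D": "reward_site",
}

def decode_trajectory(spike_sequence):
    """Map a sequence of firing place cells to the locations they encode."""
    return [PLACE_FIELDS[cell] for cell in spike_sequence]

# Forward replay: the same order as the actual run.
forward = decode_trajectory(["cell_A", "cell_B", "cell_D"])

# Reverse replay: the awake-but-standing-still case described above.
reverse = decode_trajectory(["cell_D", "cell_B", "cell_A"])
```

The same readout works for both directions, which is why forward and reverse replay can be identified from the firing order alone.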

“This may be the rat equivalent of ‘thinking,'” Wilson said. “This thinking process looks very much like the reactivation of memory that we see during non-REM dream states, consisting of bursts of time-compressed memory sequences lasting a fraction of a second.

“So, thinking and dreaming may share the same memory reactivation mechanisms,” he said.
“This study brings together concepts related to thought, memory and dreams that all potentially arise from a unified mechanism rooted in the hippocampus,” said co-author Fabian Kloosterman, senior postdoctoral associate.

The team’s results show that long experiences, which in reality could have taken tens of seconds or minutes, are replayed in only a fraction of a second. To do this, the brain links together smaller pieces to construct the memory of the long experience.

The researchers speculated that this strategy could help different areas of the brain share information – and deal with multiple memories that may share content – in a flexible and efficient way. “These results suggest that extended replay is composed of chains of shorter subsequences, which may reflect a strategy for the storage and flexible expression of memories of prolonged experience,” Wilson said.

To me this seals the role of the hippocampus as necessary not just for the formation of new memories, but also for novel future-oriented thoughts and imaginings.

Major conscious and unconscious processes in the brain: part 3: Robot minds

This article continues my series on major conscious and unconscious processes in the brain. In my last two posts I have talked about 8 major unconscious processes in the brain, viz. sensory, motor, learning, affective, cognitive (deliberative), modelling, communication and attentive systems. Today, I will not talk about the brain in particular, but will approach the problem from a slightly different domain: that of modelling/implementing an artificial brain/mind.

I am a computer scientist, so I am vaguely aware of the varied approaches used to model/implement the brain. Many of these use computers, though not every approach assumes that the brain is a computer.

Before continuing I would briefly like to digress and link to one of my earlier posts regarding the different traditions of psychological research in personality and how I think they fit an evolutionary stage model. That may serve as background to the type of sweeping analysis and generalisation that I am going to do. To be fair, it is also important to recall the Indian parable of the blind men and the elephant: asked to describe an elephant, each blind man described only the part he could lay his hands on and thus provided a partial and incorrect picture. The one who grabbed the tail described the elephant as snake-like, and so forth.

With that in mind, let us look at the major approaches to modelling/implementing the brain/intelligence/mind. Also remember that I am most interested in unconscious brain processes for now and sincerely believe that all the unconscious processes can, and will, be successfully implemented in machines. I do not believe machines will become sentient (at least any time soon), but that question is for another day.

So, with due thanks to @wildcat2030, I came across this book today and could immediately see how the different major approaches to artificial robot brains are heavily influenced by (and follow) the first five evolutionary stages and the first five unconscious processes in the brain.
The book in question is ‘Robot Brains: Circuits and Systems for Conscious Machines’ by Pentti O. Haikonen, and although he is most interested in conscious machines I will restrict myself to intelligent but unconscious machines/robots.

The first chapter of the book (which has made it to my reading list) is available at the Wiley site in its entirety, and I quote extensively from there:

Presently there are five main approaches to the modelling of cognition that could be used for the development of cognitive machines: the computational approach (artificial intelligence, AI), the artificial neural networks approach, the dynamical systems approach, the quantum approach and the cognitive approach. Neurobiological approaches exist, but these may be better suited for the eventual explanation of the workings of the biological brain.

The computational approach (also known as artificial intelligence, AI) towards thinking machines was initially worded by Turing (1950). A machine would be thinking if the results of the computation were indistinguishable from the results of human thinking. Later on Newell and Simon (1976) presented their Physical Symbol System Hypothesis, which maintained that general intelligent action can be achieved by a physical symbol system and that this system has all the necessary and sufficient means for this purpose. A physical symbol system was here the computer that operates with symbols (binary words) and attached rules that stipulate which symbols are to follow others. Newell and Simon believed that the computer would be able to reproduce human-like general intelligence, a feat that still remains to be seen. However, they realized that this hypothesis was only an empirical generalization and not a theorem that could be formally proven. Very little in the way of empirical proof for this hypothesis exists even today and in the 1970s the situation was not better. Therefore Newell and Simon pretended to see other kinds of proof that were in those days readily available. They proposed that the principal body of evidence for the symbol system hypothesis was negative evidence, namely the absence of specific competing hypotheses; how else could intelligent activity be accomplished by man or machine? However, the absence of evidence is by no means any evidence of absence. This kind of ‘proof by ignorance’ is too often available in large quantities, yet it is not a logically valid argument. Nevertheless, this issue has not yet been formally settled in one way or another. Today’s positive evidence is that it is possible to create world-class chess-playing programs and these can be called ‘artificial intelligence’. The negative evidence is that it appears to be next to impossible to create real general intelligence via preprogrammed commands and computations.

The original computational approach can be criticized for the lack of a cognitive foundation. Some recent approaches have tried to remedy this and consider systems that integrate the processes of perception, reaction, deliberation and reasoning (Franklin, 1995, 2003; Sloman, 2000). There is another argument against the computational view of the brain. It is known that the human brain is slow, yet it is possible to learn to play tennis and other activities that require instant responses. Computations take time. Tennis playing and the like would call for the fastest computers in existence. How could the slow brain manage this if it were to execute computations?

The artificial neural networks approach, also known as connectionism, had its beginnings in the early 1940s when McCulloch and Pitts (1943) proposed that the brain cells, neurons, could be modelled by a simple electronic circuit. This circuit would receive a number of signals, multiply their intensities by the so-called synaptic weight values and sum these modified values together. The circuit would give an output signal if the sum value exceeded a given threshold. It was realized that these artificial neurons could learn and execute basic logic operations if their synaptic weight values were adjusted properly. If these artificial neurons were realized as hardware circuits then no programs would be necessary and biologically plausible artificial replicas of the brain might be possible. Also, neural networks operate in parallel, doing many things simultaneously. Thus the overall operational speed could be fast even if the individual neurons were slow. However, problems with artificial neural learning led to complicated statistical learning algorithms, ones that could best be implemented as computer programs. Many of today’s artificial neural networks are statistical pattern recognition and classification circuits. Therefore they are rather removed from their original biologically inspired idea. Cognition is not mere classification and the human brain is hardly a computer that executes complicated synaptic weight-adjusting algorithms.
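As an aside (my own illustration, not from the book), the McCulloch–Pitts unit described above is easy to sketch in code; the weights and thresholds below are hand-picked to realize basic logic operations, just as the original proposal envisioned:

```python
# Sketch of a McCulloch-Pitts neuron: multiply inputs by synaptic
# weights, sum them, and fire (output 1) if the sum reaches threshold.
# Weight/threshold values are chosen by hand, not learned.

def mp_neuron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def AND(a, b):
    return mp_neuron([a, b], weights=[1, 1], threshold=2)

def OR(a, b):
    return mp_neuron([a, b], weights=[1, 1], threshold=1)

def NOT(a):
    return mp_neuron([a], weights=[-1], threshold=0)
```

Networks of such units can compute any Boolean function, which is what made the model so influential; the statistical weight-adjusting algorithms criticized above came much later.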

The human brain has some 10 to the power of 11 neurons and each neuron may have tens of thousands of synaptic inputs and input weights. Many artificial neural networks learn by tweaking the synaptic weight values against each other when thousands of training examples are presented. Where in the brain would reside the computing process that would execute synaptic weight adjusting algorithms? Where would these algorithms have come from? The evolutionary feasibility of these kinds of algorithms can be seriously doubted. Complicated algorithms do not evolve via trial and error either. Moreover, humans are able to learn with a few examples only, instead of having training sessions with thousands or hundreds of thousands of examples. It is obvious that the mainstream neural networks approach is not a very plausible candidate for machine cognition although the human brain is a neural network.

Dynamical systems were proposed as a model for cognition by Ashby (1952) already in the 1950s and have been developed further by contemporary researchers (for example Thelen and Smith, 1994; Gelder, 1998, 1999; Port, 2000; Wallace, 2005). According to this approach the brain is considered as a complex system with dynamical interactions with its environment. Gelder and Port (1995) define a dynamical system as a set of quantitative variables, which change simultaneously and interdependently over quantitative time in accordance with some set of equations. Obviously the brain is indeed a large system of neuron activity variables that change over time. Accordingly the brain can be modelled as a dynamical system if the neuron activity can be quantified and if a suitable set of, say, differential equations can be formulated. The dynamical hypothesis sees the brain as comparable to analog feedback control systems with continuous parameter values. No inner representations are assumed or even accepted. However, the dynamical systems approach seems to have problems in explaining phenomena like ‘inner speech’. A would-be designer of an artificial brain would find it difficult to see what kind of system dynamics would be necessary for a specific linguistically expressed thought. The dynamical systems approach has been criticized, for instance by Eliasmith (1996, 1997), who argues that the low dimensional systems of differential equations, which must rely on collective parameters, do not model cognition easily and the dynamicists have a difficult time keeping arbitrariness from permeating their models. Eliasmith laments that there seems to be no clear ways of justifying parameter settings, choosing equations, interpreting data or creating system boundaries. Furthermore, the collective parameter models make the interpretation of the dynamic system’s behaviour difficult, as it is not easy to see or determine the meaning of any particular parameter in the model. Obviously these issues would translate into engineering problems for a designer of dynamical systems.
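To make the dynamical-systems view concrete, here is a toy sketch (my own, not from the book): two quantitative variables changing interdependently over time according to a fixed set of equations, in this case a damped oscillator integrated with simple Euler steps. The specific equations are illustrative, not drawn from any brain model.

```python
# Toy dynamical system: state variables x (position-like) and v
# (velocity-like) evolve together; each one's rate of change depends
# on the other, with damping bleeding energy out over time.

def simulate(x=1.0, v=0.0, k=1.0, damping=0.1, dt=0.01, steps=1000):
    trajectory = []
    for _ in range(steps):
        a = -k * x - damping * v   # change depends on both variables
        v += a * dt                # semi-implicit Euler update
        x += v * dt
        trajectory.append(x)
    return trajectory

traj = simulate()
```

Note the criticism quoted above applies even here: the choice of `k`, `damping` and the equations themselves is arbitrary, and nothing in the trajectory tells you what any parameter "means" cognitively.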

The quantum approach maintains that the brain is ultimately governed by quantum processes, which execute nonalgorithmic computations or act as a mediator between the brain and an assumed more-or-less immaterial ‘self’ or even ‘conscious energy field’ (for example Herbert, 1993; Hameroff, 1994; Penrose, 1989; Eccles, 1994). The quantum approach is supposed to solve problems like the apparently nonalgorithmic nature of thought, free will, the coherence of conscious experience, telepathy, telekinesis, the immortality of the soul and others. From an engineering point of view even the most practical propositions of the quantum approach are presently highly impractical in terms of actual implementation. Then there are some proposals that are hardly distinguishable from wishful fabrications of fairy tales. Here the quantum approach is not pursued.

The cognitive approach maintains that conscious machines can be built because one example already exists, namely the human brain. Therefore a cognitive machine should emulate the cognitive processes of the brain and mind, instead of merely trying to reproduce the results of the thinking processes. Accordingly the results of neurosciences and cognitive psychology should be evaluated and implemented in the design if deemed essential. However, this approach does not necessarily involve the simulation or emulation of the biological neuron as such, instead, what is to be produced is the abstracted information processing function of the neuron.

A cognitive machine would be an embodied physical entity that would interact with the environment. Cognitive robots would be obvious applications of machine cognition and there have been some early attempts towards that direction. Holland seeks to provide robots with some kind of consciousness via internal models (Holland and Goodman, 2003; Holland, 2004). Kawamura has been developing a cognitive robot with a sense of self (Kawamura, 2005; Kawamura et al., 2005). There are also others. Grand presents an experimentalist’s approach towards cognitive robots in his book (Grand, 2003).

A cognitive machine would be a complete system with processes like perception, attention, inner speech, imagination, emotions as well as pain and pleasure. Various technical approaches can be envisioned, namely indirect ones with programs, hybrid systems that combine programs and neural networks, and direct ones that are based on dedicated neural cognitive architectures. The operation of these dedicated neural cognitive architectures would combine neural, symbolic and dynamic elements.

However, the neural elements here would not be those of the traditional neural networks; no statistical learning with thousands of examples would be implied, no backpropagation or other weight-adjusting algorithms are used. Instead the networks would be associative in a way that allows the symbolic use of the neural signal arrays (vectors). The ‘symbolic’ here does not refer to the meaning-free symbol manipulation system of AI; instead it refers to the human way of using symbols with meanings. It is assumed that these cognitive machines would eventually be conscious, or at least they would reproduce most of the folk psychology hallmarks of consciousness (Haikonen, 2003a, 2005a). The engineering aspects of the direct cognitive approach are pursued in this book.

Now, to me, these approaches are all unidimensional:

  1. The computational approach is suited for symbol manipulation and information representation and might give good results when used in systems that have mostly ‘sensory’ features, like forming a mental representation of the external world, a chess game, etc. Here something (stimuli from the world) is represented as something else (an internal symbolic representation).
  2. The dynamical systems approach is guided by interactions with the environment and the principles of feedback control systems, and is also prone to ‘arbitrariness’ or ‘randomness’. It is perfectly suited to implement the ‘motor system’ of the brain, as one of its common features is apparent unpredictability (volition) despite being deterministic (chaos theory).
  3. Neural networks, or connectionism, are well suited for implementing the ‘learning system’ of the brain, and we can very well see that the best neural-network-based systems are those that can categorize and classify things, just like the ‘learning system’ of the brain does.
  4. The quantum approach to the brain I haven’t studied enough to comment on, but the action-tendencies of the ‘affective system’ seem all too similar to the superimposed, simultaneous states that exist in a wave function before it is collapsed. Being in an affective state just means having a set of many possible related and relevant actions simultaneously activated, with one of them then somehow decided upon and actualized. I’m sure that if we could ever model emotion in machines it would have to use quantum principles of wave functions, entanglement, etc.
  5. The cognitive approach, again, I haven’t got the hang of yet, but it seems that the proposal is to build some design into the machine that is based on actual brain and mind implementations. Embodiment seems important, and so does emulating the information-processing functions of neurons. I would stick my neck out and predict that whatever this cognitive approach is, it should be best able to model the reasoning, evaluative and decision-making functions of the brain. I am reminded of the computational modelling methods used in cognitive science (whether symbolic or subsymbolic) to functionally decompose a cognitive process, which again aid in decision making/reasoning (see the Wikipedia entry).

Overall, I would say there is room for further improvement in the way we build more intelligent machines. They could be made such that they have two models of the world – one deterministic, another chaotic – and use the two models simultaneously (sixth stage: modelling); then they could communicate with other machines and thus learn language (some simulation methods for language abilities do involve agents communicating with each other using arbitrary tokens, with a language developing later) (seventh stage); and then they could be implemented such that they have a spotlight of attention (eighth stage) whereby some coherent systems are amplified and others suppressed. Of course, all this is easier said than done; we will need at least three more major approaches to modelling and implementing the brain/intelligence before we can model every major unconscious process in the brain. To model consciousness and program sentience is an uphill task from there and would definitely require a leap in our understanding/capabilities.

Do tell me if you find the above reasonable and believe that these major approaches to artificial brain implementation are guided and constrained by the major unconscious processes in the brain, and that we can learn much about the brain from the study of these artificial approaches, and vice versa.

Major conscious and unconscious processes in the brain

Today I plan to touch upon the topic of consciousness (from which many bloggers shy away) and, more broadly, try to delineate what I believe are the important different conscious and unconscious processes in the brain. I will be heavily using my evolutionary stages model for this.

To clarify at the very start, I do not believe in a purely reactive nature of organisms; I believe that apart from reacting to stimuli/the world, they also act on their own and are thus agents. To elaborate, I believe that neuronal groups and circuits may fire on their own and thus lead to behavior/action. I do not claim that this firing is under voluntary/volitional control – it may be random – the important point is that there is spontaneous motion.

  1. Sensory system: To start with, I propose that the first function/process the brain needs to develop is to sense its surroundings. This is to avoid predators/harm in general. This sensory function of the brain/sense organs may be unconscious and need not become conscious – as long as an animal can sense danger, even though it may not be aware of the danger, it can take appropriate action – a simple ‘action’ being changing its color to merge with the background.
  2. Motor system: The second function/process that the brain needs to develop is a system that enables motion/movement. This is primarily to explore its environment for food/nutrients. Prey are not going to walk into your mouth; you have to move around and locate them. Again, this movement need not be volitional/conscious – as long as the animal moves randomly and sporadically to explore new environments, it can ‘see’ new things and eat a few. Again, this ‘seeing’ may be as simple as sensing the chemical gradient in a new environment.
  3. Learning system: The third function/process that the brain needs to develop is a system that enables learning. It is not enough to sense the environmental here-and-now. One needs to learn the contingencies of the world and remember them in both space and time. I am inclined to believe that this is primarily Pavlovian conditioning and associative learning, though I don’t rule out operant learning. Again, this learning need not be conscious – one need not explicitly refer to a memory to utilize it – unconscious learning and memory of events can suffice and can drive interactions. I also believe that the need for this function is primarily driven by the fact that one interacts with similar environments/conspecifics/predators/prey, and it helps to remember which environmental conditions/operant actions lead to what outcomes. This learning could be as simple as stimulus A predicting stimulus B, and/or action C predicting reward D.
  4. Affective/action-tendencies system: The fourth function I propose that the brain needs to develop is a system to control its motor system/behavior by making it more in sync with its internal state. This, I propose, is done by a group of neurons monitoring the activity of other neurons/visceral organs, thus becoming aware (in a non-conscious sense) of the global state of the organism and of the probability that a particular neuronal group will fire in the future; by their outputs, they may be able to enable one group to fire while inhibiting other groups from firing. To clarify by way of example: some neuronal groups may be responsible for movement. Another neuronal group may receive inputs from these, as well as, say, input from the gut saying that no movement has happened for a while and that the organism has also not eaten for a while and thus is in a ‘hungry’ state. This may prompt these neurons to send excitatory outputs to the movement-related neurons, biasing them towards firing and increasing the probability that motion will take place; perhaps the organism, by indulging in exploratory behavior, will be able to satisfy its hunger. Of course, they will inhibit other neuronal groups from firing and will themselves stop firing when appropriate motion takes place/prey is eaten. Again, none of this has to be conscious – the state of the organism (like hunger) can be discerned unconsciously, and the action-tendencies biasing foraging behavior can also be activated unconsciously – as long as the organism prefers certain behaviors over others depending on its internal state, everything works perfectly. I propose that (unconscious) affective (emotional) states and systems have emerged to fulfill exactly this need: being able to differentially activate different action-tendencies suited to the needs of the organism.
I also stick my neck out and claim that the activation of a particular emotion/affective system biases our sensing too. If the organism is hungry, food tastes better (is unconsciously more vivid) and vice versa. Thus affects are not only action-tendencies but also, to an extent, sensing-tendencies.
  5. Decisional/evaluative system: The last function (for now – remember I adhere to eight-stage theories, and we have just seen five brain processes in increasing hierarchy) that the brain needs is a system to decide/evaluate. Learning lets us predict our world as well as the consequences of our actions. Affective systems provide us some control over our behavior and over our environment, but are automatically activated by the state we are in. Something needs to bring these together such that the competition between actions triggered by the state we are in (affective action-tendencies) and actions that may be beneficial given the learning associated with the current stimuli/state of the world is resolved satisfactorily. One has to balance the action-to-reaction ratio and the subjective versus objective interpretation/sensation of the environment. The decisional/evaluative system, I propose, does this by associating values with different external event outcomes and different internal state outcomes and by resolving the trade-off between the two. This again need not be conscious – given a stimulus predicting a predator in the vicinity, and an internal state of hunger, the organism may have attached more value to ‘avoid being eaten’ than to ‘finding prey’ and thus may not move, but camouflage itself. On the other hand, if the organism’s value system is such that it prefers a hero’s death on the battlefield to starvation, it may move (in search of food) – again, this could exist in the simplest of unicellular organisms.
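The predator-versus-hunger trade-off in point 5 can be sketched as a toy value calculation (my own illustration; every number here is invented for the example):

```python
# Toy decisional/evaluative system: each candidate action gets a value
# combining learned external consequences (predator risk) with the
# current internal state (hunger); the highest-value action wins.

def choose_action(predator_nearby, hunger):
    values = {
        # foraging pays off in proportion to hunger, but carries a
        # large learned penalty when a predator cue is present
        "move_and_forage": hunger * 1.0 + (-5.0 if predator_nearby else 0.0),
        # camouflage is valuable only when there is a threat to hide from
        "camouflage": 3.0 if predator_nearby else 0.0,
    }
    return max(values, key=values.get)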

Of course, all of these brain processes could (and in humans indeed do) have their conscious counterparts, like perception, volition, episodic memory, feelings and deliberation/thought. That is a different story for a new blog post!

And of course one can also conceive the above in pure reductionist form as a chain below:

sense –> recognize & learn –> evaluate options and decide –> emote and activate action tendencies –> execute and move.

and then one can also say that movement leads to new sensations, so the above is not a chain but part of a cycle; all that is valid, but I would sincerely request my readers to consider the possibility of spontaneous and self-driven behavior as separate from reactive motor behavior.
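The cycle above, including the spontaneous, non-reactive channel I am arguing for, can be sketched as a toy agent loop (my own illustration, nothing more):

```python
import random

random.seed(0)  # reproducible "spontaneous" firing for the example

# Toy sense -> learn -> decide -> act cycle. The key detail is the
# spontaneous branch: the agent can act without any triggering stimulus.

def step(world_state, memory):
    stimulus = world_state["stimulus"]       # 1. sense
    memory.append(stimulus)                  # 2. recognize & learn (store it)
    act = stimulus == "food_cue"             # 3. evaluate and decide
    if random.random() < 0.1:                # spontaneous, self-driven firing
        act = True
    return "move" if act else "stay"         # 4-5. action tendency -> movement
```

Run in a loop, each "move" changes the world and hence the next sensation, which is exactly the chain-versus-cycle point made above.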

A gene implicated in operant learning finally discovered

Till now, most of the research on learning at the molecular level, or LTP/LTD, has focused on classical conditioning paradigms. To my knowledge, for the first time someone has started looking at whether, on the molecular level, classical conditioning, which works by associations between external stimuli, is encoded and implemented differently from operant learning, which depends on learning the reward contingencies of one's spontaneously generated behavior.

Bjorn Brembs and colleagues have shown that the normal learning pathway implicated in classical conditioning, which involves the rutabaga gene in the fruit fly and works via adenylyl cyclase (AC), is not involved in pure operant learning; rather, pure operant learning is mediated by protein kinase C (PKC) pathways. This is not only a path-breaking discovery, as it clearly demonstrates a double dissociation using genetically mutant flies; it is also a marvelous example of how a beautiful experimental setup was devised to separate and remove the classical conditioning effects from normal operant learning and generate a pure operant learning procedure. You can read more about the procedure on Bjorn Brembs's site; he also maintains a very good blog, so check that out too.

Here is the abstract of the article; the full article is available at Bjorn Brembs's site.

Learning about relationships between stimuli (i.e., classical conditioning) and learning about consequences of one's own behavior (i.e., operant conditioning) constitute the major part of our predictive understanding of the world. Since these forms of learning were recognized as two separate types 80 years ago, a recurrent concern has been the issue of whether one biological process can account for both of them. Today, we know the anatomical structures required for successful learning in several different paradigms, e.g., operant and classical processes can be localized to different brain regions in rodents [9] and an identified neuron in Aplysia shows opposite biophysical changes after operant and classical training, respectively. We also know to some detail the molecular mechanisms underlying some forms of learning and memory consolidation. However, it is not known whether operant and classical learning can be distinguished at the molecular level. Therefore, we investigated whether genetic manipulations could differentiate between operant and classical learning in Drosophila. We found a double dissociation of protein kinase C and adenylyl cyclase on operant and classical learning. Moreover, the two learning systems interacted hierarchically such that classical predictors were learned preferentially over operant predictors.

Do take a look at the paper and the experimental setup, and let's hope that operant learning receives more focus from now on, leading to a paradigmatic shift in molecular neuroscience – with operant conditioning results, in my opinion, more applicable to humans than classical conditioning results.
B. Brembs, W. Plendl (2008). Double Dissociation of PKC and AC Manipulations on Operant and Classical Learning in Drosophila. Current Biology, 18 (15), 1168-1171. DOI: 10.1016/j.cub.2008.07.041

Glutamate and classical conditioning

I had speculated in one of my earlier posts that glutamate, GABA, glycine and aspartate may be involved in classical conditioning/avoidance learning. To quote:

That is it for now; I hope to back up these claims, and extend this to the rest of the 3 traits too, in the near future. Some things I am toying with are either classical conditioning and avoidance learning at these higher levels, or behavior remembering (as opposed to learning) at these higher levels. Also, other neurotransmitter systems like glutamate, glycine, GABA and aspartate may be active at the higher levels. Also, neuropeptides too are broadly classified in five groups, so they too may have some role here. Keep guessing and do contribute to the theory if you can!!

Now, I have discovered an article that links Glutamate to classical conditioning. It is titled Reward-Predictive Cues Enhance Excitatory Synaptic Strength onto Midbrain Dopamine Neurons, and here is the abstract:

Using sensory information for the prediction of future events is essential for survival. Midbrain dopamine neurons are activated by environmental cues that predict rewards, but the cellular mechanisms that underlie this phenomenon remain elusive. We used in vivo voltammetry and in vitro patch-clamp electrophysiology to show that both dopamine release to reward predictive cues and enhanced synaptic strength onto dopamine neurons develop over the course of cue-reward learning. Increased synaptic strength was not observed after stable behavioral responding. Thus, enhanced synaptic strength onto dopamine neurons may act to facilitate the transformation of neutral environmental stimuli to salient reward-predictive cues.

Though the article itself does not talk about glutamate, and neither does this Scicurious article on Neurotopia commenting on the same study (it focuses more on the dopamine connection), I still believe that we have a glutamate connection here. First, let us see how the phenomenon under discussion is indeed nothing but classical conditioning:

The basic idea is that, when you get a reward unexpectedly, you get a big spike of DA to make your brain go “sweet!” After a while, you begin to recognize the cues behind the reward, and so seeing the wrapper to the candy will make your DA spike in anticipation. But it’s only very recently that we’ve been able to see this change taking place, and there were still lots of questions as to what was happening when these changes happen.

So the authors of this study took a bunch of rats. They implanted fast scan cyclic voltammetry probes into their heads. Voltammetry is a technique that allows you to detect changes in DA levels in brain areas (in this case the nucleus accumbens, an area linked with reward) which represent groups of cells firing. So the rats had probes in their heads detecting their DA, and then they were given a stimulus light (a conditioned stimulus), a nosepoke device, and a sugar pellet. There is nothing that a rat likes more than a sugar pellet, and so there was a nice big spike in DA as it got its reward. So the rats figured out pretty quickly that, when the light came on, you stick your nose in the hole, and sugar was on the way. As they learned the conditioned stimulus, their DA spikes in response to reward SHIFTED, moving backward in time, so that they soon got a spike of DA when they saw the light, without a spike when they got the pellet. This means that the animals had learned to associate a conditioned stimulus with reward. Not only that, the DA spike was higher immediately after learning than the spike in rats who just got rewards without learning.

So, if we consider the dopamine spike as an unconditioned response, then what we have is a new CS→CR pairing, i.e., classical conditioning, taking place. Now, the crucial part of the study that showed the learning is mediated by glutamate (emphasis mine):

To find out whether or not excitatory synapses were in fact changing, the authors conducted electrophysiology experiments in rats that were either trained or not trained. Electrophysiology is a technique where you actually put a tiny, tiny electrode into a cell membrane. When that cell is then stimulated, you can actually WATCH it fire. It’s really very cool to see. Of course all sorts of things are responsible for when a cell fires and how, but what they were looking at here were specific glutamate receptors known as AMPA and NMDA. These are two major receptors that receive glutamate currents, which are excitatory and induce cells downstream to fire. What they found was that, in animals that had been trained to a conditioned stimulus, AMPA and NMDA receptors had a much stronger influence on firing than in non-trained animals, which means that the synaptic strength on DA neurons is getting stronger as animals learn. Not only that, but cells from trained rats already exhibited long-term potentiation, a phenomenon associated with formation of things like learning and memory.

But of course, you have to make sure that glutamate is really the neurotransmitter responsible, and not just a symptom of something else changing. So they ran more rats on voltammetry and training, and this time put a glutamate antagonist into the brain. They found that a glutamate antagonist completely blocked not only the DA shift to a conditioned stimulus, but the learning itself.

From the above it is clear that glutamate, and the LTP it produces at midbrain synapses, is crucial for classical conditioning. It seems one more puzzle is solved and another jigsaw piece fits where it should.
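
The backward migration of the dopamine spike from the reward to the cue, described above, is exactly what temporal-difference (TD) learning models predict, and it is easy to see in a toy simulation. The sketch below is a generic TD model, not the one used in the paper; the step counts and learning rate are my own illustrative choices, and the TD prediction error stands in for phasic dopamine.

```python
import numpy as np

# Toy TD simulation of the dopamine-spike shift. Time steps within a
# trial are states; a cue appears at step 5 and a reward at step 15.
# The TD prediction error (delta) is the stand-in for phasic dopamine.
N_STEPS, CUE, REWARD = 20, 5, 15
ALPHA = 0.3                        # learning rate (illustrative)
V = np.zeros(N_STEPS + 1)          # learned value of each time step

def run_trial(V):
    """Run one trial, returning the prediction-error (dopamine) trace."""
    delta = np.zeros(N_STEPS + 1)
    # Cue onset: a jump from the unpredictable pre-cue baseline (value 0).
    delta[CUE] = V[CUE]
    for t in range(CUE, N_STEPS):
        r = 1.0 if t + 1 == REWARD else 0.0
        d = r + V[t + 1] - V[t]    # TD error on entering state t+1
        V[t] += ALPHA * d
        delta[t + 1] = d
    return delta

first = run_trial(V)
for _ in range(500):
    last = run_trial(V)

# Before learning, the error (dopamine) spikes at the reward; after
# training it has migrated back to the cue, just as in the rats.
print(first[REWARD], first[CUE])   # → 1.0 0.0
print(last[REWARD], last[CUE])     # reward spike ~0, cue spike ~1
```

Note that nothing here models glutamate or synaptic strength directly; the point is only that a simple error-driven learning rule reproduces the backward shift of the DA response.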

Cloninger’s temperament and character traits: room for a behaviorist view?

Today I wish to discuss C. Robert Cloninger’s theory of temperament and character traits. It is a psychobiological theory based on genetic and neural substrates and mechanisms, in which he proposes four temperament traits and three character traits, for a total of seven personality traits. First, the abstract, to give you some idea:

In this study, we describe a psychobiological model of the structure and development of personality that accounts for dimensions of both temperament and character. Previous research has confirmed four dimensions of temperament: novelty seeking, harm avoidance, reward dependence, and persistence, which are independently heritable, manifest early in life, and involve preconceptual biases in perceptual memory and habit formation. For the first time, we describe three dimensions of character that mature in adulthood and influence personal and social effectiveness by insight learning about self-concepts. Self-concepts vary according to the extent to which a person identifies the self as (1) an autonomous individual, (2) an integral part of humanity, and (3) an integral part of the universe as a whole. Each aspect of self-concept corresponds to one of three character dimensions called self-directedness, cooperativeness, and self-transcendence, respectively. We also describe the conceptual background and development of a self-report measure of these dimensions, the Temperament and Character Inventory. Data on 300 individuals from the general population support the reliability and structure of these seven personality dimensions. We discuss the implications for studies of information processing, inheritance, development, diagnosis, and treatment

This article provides an excellent in-depth look at the Temperament and Character Inventory (TCI) developed by Cloninger, giving a detailed description of all the traits and their subscales or facets.

I’ll list them briefly below, in order, along with their subscales/facets:

I) Novelty seeking (NS)

  1. Exploratory excitability (NS1)
  2. Impulsiveness (NS2)
  3. Extravagance (NS3)
  4. Disorderliness (NS4)

II) Harm avoidance (HA)

  1. Anticipatory worry (HA1)
  2. Fear of uncertainty (HA2)
  3. Shyness (HA3)
  4. Fatigability (HA4)

III) Reward dependence (RD)

  1. Sentimentality (RD1)
  2. Openness to warm communication (RD2)
  3. Attachment (RD3)
  4. Dependence (RD4)

IV) Persistence (PS)

  1. Eagerness of effort (PS1)
  2. Work hardened (PS2)
  3. Ambitious (PS3)
  4. Perfectionist (PS4)

V) Self-directedness (SD)

  1. Responsibility (SD1)
  2. Purposeful (SD2)
  3. Resourcefulness (SD3)
  4. Self-acceptance (SD4)
  5. Enlightened second nature (SD5)

VI) Cooperativeness (C)

  1. Social acceptance (C1)
  2. Empathy (C2)
  3. Helpfulness (C3)
  4. Compassion (C4)
  5. Pure-hearted conscience (C5)

VII) Self-transcendence (ST)

  1. Self-forgetful (ST1)
  2. Transpersonal identification (ST2)
  3. Spiritual acceptance (ST3)

To me this lacks one more trait, and I’m sure Cloninger will identify and add one more in the future (he added the three character traits relatively late).

Now for the meat of the post. My thesis is that these traits are similar to the Big Eight temperaments that I have discussed in an earlier post and follow the same eightfold developmental/evolutionary pattern. Further, I would claim that each facet of a trait follows the same structure: most traits have four or five facets, typically related to five major ways of reacting and relating to the world around us. It is also my thesis that, just as Cloninger tied the initial three traits to behavioral inhibition, behavioral approach and behavioral maintenance, and to the neurotransmitter systems of serotonin, dopamine and norepinephrine respectively, the same line of argument can be extended to the other traits, with further biogenic-amine CNS neurotransmitter pathways correlated with each trait.

Harm Avoidance:

Individuals high in HA tend to be cautious, careful, fearful, tense, apprehensive, nervous, timid, doubtful, discouraged, insecure, passive, negativistic, or pessimistic even in situations that do not normally worry other people. These individuals tend to be inhibited and shy in most social situations. Their energy level tends to be low and they feel chronically tired or easily fatigued. As a consequence they need more reassurance and encouragement than most people and are usually sensitive to criticism and punishment. The advantages of high Harm Avoidance are the greater care and caution in anticipating possible danger, which leads to careful planning when danger is possible. The disadvantages occur when danger is unlikely but still anticipated; such pessimism or inhibition leads to unnecessary worry.

In contrast, individuals with low scores on this temperament dimension tend to be carefree, relaxed, daring, courageous, composed, and optimistic even in situations that worry most people. These individuals are described as outgoing, bold, and confident in most social situations. Their energy level tends to be high, and they impress others as dynamic, lively, and vigorous persons. The advantages of low Harm Avoidance are confidence in the face of danger and uncertainty,leading to optimistic and energetic efforts with little or no distress. The disadvantages are related to unresponsiveness to danger, which can lead to reckless optimism.

From the above it is clear that this is related to neuroticism. It would also be related to the anxiety witnessed in clinical situations and requiring treatment. It is instructive to note that Cloninger proposes the serotonin CNS system as a substrate for this trait, and that many anti-anxiety drugs actually target serotonin receptors (SSRIs are among the best anti-anxiety drugs). Also, as per the model, this trait is involved in behavioral inhibition. Let me elaborate and propose that what is meant by behavioral inhibition is learning to avoid the predator. In operant conditioning paradigms this would be learning due to positive punishment: learning to inhibit a prepotent behavior because of punishments.

Novelty Seeking:

Individuals high in Novelty Seeking tend to be quick-tempered, excitable, exploratory, curious, enthusiastic, ardent, easily bored, impulsive, and disorderly. The advantages of high Novelty Seeking are enthusiastic and quick engagement with whatever is new and unfamiliar, which leads to exploration of potential rewards. The disadvantages are related to excessive anger and quick disengagement whenever their wishes are frustrated, which leads to inconsistencies in relationships and instability in efforts.

In contrast, individuals low in Novelty Seeking are described as slow-tempered, indifferent, uninquisitive, unenthusiastic, unemotional, reflective, thrifty, reserved, tolerant of monotony, systematic, and orderly.

These are classic impulsiveness-related characteristics and can safely be associated with the dopamine system. This trait is then related to conscientiousness and is driven by rewards and reward-related learning. Excess of this trait may result in psychosis, and many antipsychotic drugs act on the dopamine system. This is the traditional behavioral activation system. In operant conditioning terms we can call this learning under positive reinforcement: new behaviors are learned, or the strength of old behaviors is increased, in the presence of primary reinforcers like food, sex, (even money) etc.

Reward dependence:

Individuals who score high in Reward Dependence tend to be tender-hearted, loving and warm, sensitive, dedicated, dependent, and sociable. They seek social contact and are open to communication with other people. Typically, they find people they like everywhere they go. A major advantage of high Reward Dependence is the sensitivity to social cues, which facilitates warm social relations and understanding of others’ feelings. A major disadvantage of high Reward Dependence involves the ease with which other people can influence the dependent person’s views and feelings, possibly leading to loss of objectivity.

Individuals low on Reward Dependence are often described as practical, tough-minded, cold, and socially insensitive. They are content to be alone and rarely initiate open communication with others. They prefer to keep their distance and typically have difficulties in finding something in common with other people. An advantage of low Reward Dependence is the independence from sentimental considerations.

From the above it is clear that this is related to trait extraversion, or sociability, and influences how adept, and prone, one is at forming alliances and friendships. This trait has been hypothesized to be related to the norepinephrine system and to behavioral maintenance. In operant conditioning terms, I interpret it as maintaining a behavior despite no real (primary) reinforcement, just because of secondary reinforcement (social approval, praise, status, etc.). This is not necessarily maladaptive, and secondary reinforcers are necessary; but too much dependence on them may lead to depression. The initial antidepressants all worked on the norepinephrine system, and the monoamine theory of depression is still around. I believe that depression is multi-factorial, but social striving/approval/negotiation is a prime facet underlying the illness.

Persistence:

Individuals high in Persistence tend to be industrious, hard-working, persistent, and stable despite frustration and fatigue. They typically intensify their effort in response to anticipated reward. They are ready to volunteer when there is something to be done, and are eager to start work on any assigned duty. Persistent persons tend to perceive frustration and fatigue as a personal challenge. They do not give up easily and, in fact, tend to work extra hard when criticized or confronted with mistakes in their work. Highly persistent persons tend to be ambitious overachievers who are willing to make major sacrifices to be a success. A highly persistent individual may tend to be a perfectionist and a workaholic who pushes him/herself far beyond what is necessary to get by. High Persistence is an adaptive behavioral strategy when rewards are intermittent but the contingencies remain stable. However, when the contingencies change rapidly, perseveration becomes maladaptive.

When reward contingencies are stable, individuals low in Persistence are viewed as indolent, inactive, unreliable, unstable and erratic on the basis of both self-reports and interviewer ratings. They rarely intensify their effort even in response to anticipated reward. These persons rarely volunteer for anything they do not have to do, and typically go slow in starting work, even if it is easy to do. They tend to give up easily when faced with frustration, criticism, obstacles, and fatigue. These persons are usually satisfied with their current accomplishments, rarely strive for bigger and better things, and are frequently described as underachievers who could probably accomplish more than they actually do, but do not push themselves harder than is necessary to get by. Low scorers manifest a low level of perseverance and repetitive behaviors even in response to intermittent reward. Low Persistence is an adaptive strategy when reward contingencies change rapidly and may be maladaptive when rewards are infrequent but occur in the long run.

By some stretch of imagination one can relate this to being empathetic (volunteering, etc.) and thus to agreeableness. One way this could be related to parental investment is that those who do not care for their kids have children who give up easily and are frustrated easily; thus the same mechanism may underlie both parental care behavior and persistent behavior in the child. This trait, I propose, may be related to the epinephrine CNS system. It concerns behavioral persistence; in operant conditioning terms, this is persisting with a behavior despite no primary or even secondary reinforcement. Of course extinction will eventually happen in the absence of reward, but factors like the time or number of trials taken to achieve extinction may be at play here. Although the behavior is not reinforced at all, it is still persisted with, and different related variations may even be tried to get the desired reward. Stimulants as a class of drugs may be acting on this pathway, stimulating individuals to engage in behavior despite no reinforcement.

Self-directedness:

Highly self-directed persons are described as mature, strong, self-sufficient, responsible, reliable, goal-oriented, constructive, and well-integrated individuals when they have the opportunity for personal leadership. They have good self-esteem and self-reliance. The most distinctive characteristic of self-directed individuals is that they are effective, able to adapt their behavior in accord with individually chosen, voluntary goals. When a self-directed individual is required to follow the orders of others in authority, they may be viewed as rebellious troublemakers because they challenge the goals and values of those in authority.

In contrast, individuals who are low in Self-Directedness are described as immature, weak, fragile, blaming, destructive, ineffective, irresponsible, unreliable, and poorly integrated when they are not conforming to the direction of a mature leader. They are frequently described by clinicians as immature or having a personality disorder. They seem to be lacking an internal organizational principle, which renders them unable to define, set, and pursue meaningful goals. Instead, they experience numerous minor, short term, frequently mutually exclusive motives, none of which can develop to the point of long lasting personal significance and realization.

To me the above looks very much like the rebelliousness/conformity facet of openness or intellect, the core idea being whether one has achieved ego-integrity and good habits. I propose that histamine or melatonin may be the monoamine CNS system involved here, though phenylethylamine (PEA) also seems a good target, as do tyrosine and other trace amines. Whatever the neurotransmitter system involved, the operant conditioning phenomenon would be learning to engage in behavior despite positive punishment: the ability to go against the grain, convention, or social expectations and be true to oneself. This behavior can also be called learning under negative reinforcement, i.e., engaging in a behavior despite there being troubling things around, in the hope that they will be taken away on successful new behavior. I would also relate this to the behavioral repertoire of the individual: people high on this trait would show greater behavioral variability during extinction trials and come up with novel and insightful problem-solving behaviors.
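
The speculative trait-to-contingency mapping proposed across the sections above can be summarized in a small lookup table. To be clear, this encodes my own conjecture, not anything from Cloninger; the neurotransmitter assignments, especially for Persistence and Self-directedness, are loudly labeled guesses.

```python
# Summary of the hypothesis proposed above: each Cloninger trait paired
# with an operant-conditioning process and a candidate CNS
# neurotransmitter system. This is the author's speculation, not
# Cloninger's model; the last two rows in particular are guesses.
TRAIT_MODEL = {
    "Harm avoidance":    {"operant process": "positive punishment",
                          "neurotransmitter": "serotonin"},
    "Novelty seeking":   {"operant process": "positive reinforcement",
                          "neurotransmitter": "dopamine"},
    "Reward dependence": {"operant process": "secondary reinforcement",
                          "neurotransmitter": "norepinephrine"},
    "Persistence":       {"operant process": "persistence under extinction",
                          "neurotransmitter": "epinephrine (speculative)"},
    "Self-directedness": {"operant process": "negative reinforcement",
                          "neurotransmitter": "histamine/melatonin (speculative)"},
}

def substrate(trait):
    """Render the hypothesized process and substrate for a trait."""
    entry = TRAIT_MODEL[trait]
    return f"{trait}: {entry['operant process']} via {entry['neurotransmitter']}"

print(substrate("Harm avoidance"))
# → Harm avoidance: positive punishment via serotonin
```

The table makes the open slots in the theory obvious: the two remaining character traits (cooperativeness and self-transcendence) have no row yet.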

That is it for now; I hope to back up these claims, and extend this to the rest of the three traits too, in the near future. Some things I am toying with are either classical conditioning and avoidance learning at these higher levels, or behavior remembering (as opposed to learning) at these higher levels. Also, other neurotransmitter systems like glutamate, glycine, GABA and aspartate may be active at the higher levels. Neuropeptides too are broadly classified into five groups, so they may have some role here. Keep guessing and do contribute to the theory if you can!

Memory formed more easily in daytime

As per a new Nature Reviews Neuroscience research highlight, conditioning in zebrafish happened better during subjective daytime (SD) than during subjective nighttime (SN), and this effect was mediated by the release of melatonin during nighttime. The authors conclude that melatonin suppresses memory formation in zebrafish.

Learning and memory are known to be influenced by the time of day, but the nature and mechanism of this modulation has been elusive. Now, a new study shows that melatonin, a hormone released in a circadian fashion, affects memory consolidation in zebrafish.

Melatonin release peaks during the night and falls during the day, and melatonin has been shown to affect neuronal firing in the hippocampus. The authors therefore decided to investigate whether melatonin mediates the effects of the circadian system on memory formation. They found that bathing the zebrafish in 50 μM melatonin prior to SD conditioning significantly suppressed memory formation, whereas administration after conditioning or prior to testing had no effect. Furthermore, administration of a melatonin-receptor antagonist prior to SN conditioning significantly improved memory retention, as did removal of the pineal gland, the site of melatonin release.

Taken together, these results show that memory formation in zebrafish is inhibited during the night relative to the day, and that this modulation is mediated at least in part by circadian melatonin release. This might direct future research into improving mental performance in humans.

While extending the research results from zebrafish to humans may be premature, some simple studies with human subjects could confirm the effect of melatonin on human learning and memory formation.

Inability to learn from mistakes: The dopamine effect

In one of my recent posts on the role of the basal ganglia in reinforcement learning, I highlighted research showing that high dopamine levels in the basal ganglia cause an inability to learn from negative reinforcement, because of the low activity of the proposed NoGo system. Conversely, a low dopamine level is associated with more effective negative reinforcement learning. To quote from the earlier article:

We found a striking effect of the different dopamine medications on this positive versus negative learning bias, consistent with predictions from our computer model of the learning process. While on placebo, participants performed equally well at choose-A and avoid-B test choices. But when their dopamine levels were increased, they were more successful at choosing the most positive symbol A and less successful at avoiding B. Conversely, lowered dopamine levels were associated with the opposite pattern: worse choose-A performance but more-reliable avoid-B choices. Thus the dopamine medications caused participants to learn more or less from positive versus negative outcomes of their decisions.

A similar result has been obtained for those who carry the A1 allele of the dopamine receptor D2 gene. This allele causes a lower dopamine D2 receptor density, though as the paper is behind a subscription firewall, I could not ascertain whether the decreased density occurs in both the Go and NoGo pathways. Anyway, it might be hypothesized that to compensate for the low receptor density, more dopamine needs to be produced. This paradoxically leads to a situation in which there is more baseline dopamine in the basal ganglia Go and NoGo pathways, leading to easy excitation of the Go pathway but less inhibition via the NoGo pathway. As a result, those who have the A1 allele would end up being unable to learn from negative reinforcement, and this is exactly what the authors have found. Here is the abstract of the paper:

The role of dopamine in monitoring negative action outcomes and feedback-based learning was tested in a neuroimaging study in humans grouped according to the dopamine D2 receptor gene polymorphism DRD2-TAQ-IA. In a probabilistic learning task, A1-allele carriers with reduced dopamine D2 receptor densities learned to avoid actions with negative consequences less efficiently. Their posterior medial frontal cortex (pMFC), involved in feedback monitoring, responded less to negative feedback than others’ did. Dynamically changing interactions between pMFC and hippocampus found to underlie feedback-based learning were reduced in A1-allele carriers. This demonstrates that learning from errors requires dopaminergic signaling. Dopamine D2 receptor reduction seems to decrease sensitivity to negative action consequences, which may explain an increased risk of developing addictive behaviors in A1-allele carriers.

I came to know this via the Action Potential blog. Though the AP blog is dismissive of this study, it provided a much more detailed description of the actual experiment.

Staying on the genetics theme, a recent Science article suggests that a particular variant of the dopamine receptor (D2) causes some people to poorly learn via negative reinforcement. The A1 allele, as this variant is known, has previously been linked to increased vulnerability of addiction.

The researchers recruited volunteers, who performed a learning task while lying in an fMRI machine. Individuals with the A1 allele (at least one copy) were equally successful at selecting a targeted “good” symbol reinforced with positive feedback (the display of a “smiley face”) as those individuals completely lacking the A1 allele. However, when the task was changed such that negative reinforcement drove the learning (subjects were asked to avoid the “bad symbol”), those individuals with the A1 allele failed to perform as well as their A1-lacking colleagues.

Examining the fMRI data, those with the A1 allele had less activity in the frontal cortex and hippocampus, two areas normally responsive during tasks involving negative reinforcement and memory. This reduction was thought to be because possessing the A1 allele can cause up to a 30% reduction in D2 receptor density in individuals, presumably affecting the neural circuitry, and likely influencing the activity within the reward signaling pathways.

The AP blog may dismiss this on methodological grounds, and because I have not read the original paper, I would not comment much on that; but if I have understood the study method correctly, it seems to be the same as described in my earlier post on the basal ganglia, and seems a valid study paradigm. I am excited by this research, and it definitely adds to our understanding of the dopamine pathways involved.
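
The choose-A/avoid-B asymmetry described in these studies can be illustrated with a toy model in which positive and negative outcomes are learned at different rates, a deliberately simplified stand-in for the Go/NoGo account; the learning rates and reward probabilities below are my own illustrative choices, not those of the paper.

```python
import random

def train(a_pos, a_neg, trials=4000, seed=1):
    """Toy asymmetric-learning-rate sketch of the Go/NoGo account.
    a_pos is the learning rate for positive outcomes (dopamine bursts /
    Go pathway), a_neg for reward omissions (dips / NoGo pathway).
    All values are illustrative assumptions, not fitted parameters."""
    rng = random.Random(seed)
    p_reward = {"A": 0.8, "B": 0.2}   # A is the "good", B the "bad" symbol
    Q = {"A": 0.5, "B": 0.5}          # 0.5 = neutral starting value
    for _ in range(trials):
        s = rng.choice(["A", "B"])
        r = 1.0 if rng.random() < p_reward[s] else 0.0
        rate = a_pos if r > Q[s] else a_neg
        Q[s] += rate * (r - Q[s])
    return Q

high_da = train(a_pos=0.35, a_neg=0.05)  # high dopamine: Go dominates
low_da = train(a_pos=0.05, a_neg=0.35)   # low dopamine: NoGo dominates
# The high-dopamine learner rates A well above neutral (good choose-A)
# but also rates the bad symbol B above neutral, so it fails to learn
# "avoid B"; the low-dopamine learner shows the opposite bias, echoing
# the medication and A1-allele results discussed above.
```

Nothing anatomical is modeled here; the sole point is that a single asymmetry in learning from good versus bad outcomes reproduces the opposite choose-A and avoid-B biases.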

Baboon Metaphysics: Tabula Rasa and Group IQ

I recently came across this free excerpt from Baboon Metaphysics: The Evolution of a Social Mind. From the excerpt, the book seems very promising.

First, let me tell you how the book got its name: from a quote by Darwin, made while he was contemplating the debate between empiricists (we gain knowledge from experience: the tabula rasa) and rationalists (we have innate schemas, intuition and logic independent of experience) as to how we acquire knowledge, and how evolutionary theory might provide the answers.

With growing excitement, Darwin began to see that his theory might allow him to reconstruct the evolution of the human mind and thereby resolve the great debate between rationalism and empiricism. The modern human mind must acquire information, organize it, and generate behavior in ways that have been shaped by our evolutionary past. Our metaphysics must be the product of evolution. And just as the key to reconstructing the evolution of a whale’s fin or a bird’s beak comes from comparative research on similar traits in closely related species, the key to reconstructing the evolution of the human mind must come from comparative research on the minds of our closest animal relatives. “He who understands baboon would do more towards metaphysics than Locke.”

The authors then go on to confront behaviorist thought with experimental results showing that many animals come pre-programmed into this world.

Song sparrows (Melospiza melodia) and swamp sparrows (Melospiza georgiana) are two closely related North American birds with very different songs. Males in both species learn their songs as fledglings, by listening to the songs of other males. But this does not mean that the mind of a nestling sparrow is a blank slate, ready to learn virtually anything that is written upon it by experience. In fact, as classic research by Peter Marler and his colleagues has shown, quite the opposite is true. If a nestling male song sparrow and a nestling male swamp sparrow are raised side-by-side in a laboratory where they hear tape-recordings of both species’ songs, each bird will grow up to sing only the song of its own species.

The constraints that channel singing in one direction rather than another cannot be explained by differences in experience, because each bird has heard both songs. Nor can the results be due to differences in singing ability, because both species are perfectly capable of producing each other’s notes. Instead, differences in song learning must be the result of differences in the birds’ brains: something in the brain of a nestling sparrow prompts it to learn its own species’ song rather than another’s. The brains of different species are therefore not alike. And the mind of a nestling sparrow does not come into the world a tabula rasa—it arrives, instead, with genetically determined, inborn biases that actively organize how it perceives the world, giving much greater weight to some stimuli than to others. One can persuade a song sparrow to sing swamp sparrow notes, but only by embedding these notes into a song sparrow’s song. It is almost impossible to persuade a swamp sparrow to sing any notes other than its own. Philosophically speaking, sparrows are Kantian rationalists, actively organizing their behavior on the basis of innate, preexisting schemes.

They then go on to discuss studies by Tolman and his students that dealt a blow to behaviorism by introducing knowledge as an intermediary between stimulus and response.

In 1928, Otto L. Tinklepaugh, a graduate student of Tolman’s, began a study of learning in monkeys. His subjects were several macaques who were tested in a room in the psychology department at the University of California at Berkeley (sometimes the tests were held outdoors, on the building’s roof, which the monkeys much preferred). In one of Tinklepaugh’s most famous experiments, a monkey sat in a chair and watched as a piece of food—either lettuce or banana—was hidden under one of two cups that had been placed on the floor, six feet apart and several feet away. The other cup remained empty. Once the food had been placed under the cup, the monkey was removed from the room for several minutes. Upon his return, he was released from the chair and allowed to choose one of the cups. All of Tinklepaugh’s subjects chose the cup hiding the food, though they performed the task with much more enthusiasm when the cup concealed banana.

To illustrate the difference between behaviorist and cognitive theories of learning, pause for a moment to consider the monkey as he waits outside the experimental room after seeing, for example, lettuce placed under the left-hand cup. What has he learned? Most of us would be inclined to say that he has learned that there is lettuce under the left-hand cup. But this was not the behaviorists’ explanation. For behaviorists, the reward was not part of the content of learning. Instead, it served simply to reinforce or strengthen the link between a stimulus (the sight of the cup) and a response (looking under). The monkey, behaviorists would say, has learned nothing about the hidden food—whether it is lettuce or banana. His knowledge has no content. Instead, the monkey has learned only the stimulus-response associations, “When you’re in the room, approach the cup you last looked at” and “When you see the cup, lift it up.” Most biologists and laypeople, by contrast, would adopt a more cognitive interpretation: the monkey has learned that the right-hand cup is empty but there is lettuce under the left-hand cup.

To test between these explanations, Tinklepaugh first conducted trials in which the monkey saw lettuce hidden and found lettuce on his return. Here is his summary of the monkey’s behavior:

Subject rushes to proper cup and picks it up. Seizes lettuce. Rushes away with lettuce in mouth, paying no attention to other cup or to setting. Time, 3–4 seconds.

Tinklepaugh next conducted trials in which the monkey saw banana hidden under the cup. Now, however, Tinklepaugh replaced the banana with lettuce while the monkey was out of the room. His observations:

Subject rushes to proper cup and picks it up. Extends hand toward lettuce. Stops. Looks around on floor. Looks in, under, around cup. Glances at other cup. Looks back at screen. Looks under and around self. Looks and shrieks at any observer present. Walks away, leaving lettuce untouched on floor. Time, 10–33 seconds.

It is impossible to escape the impression that the duped monkey had acquired knowledge, and that as he reached for the cup he had an expectation or belief about what he would find underneath. His shriek reflected his outrage at this egregious betrayal of expectation.

Later, the authors move on to their central premise: that baboons offer a good model for studying the evolution of the (human) mind.

Moreover, the conservation status of baboons confers neither glamour nor prestige on those who study them. Far from being endangered, baboons are one of Africa’s most successful species. They flourish throughout the continent, occupying every ecological niche except the Sahara and tropical rain forests. They are quick to exploit campsites and farms and are widely regarded as aggressive, destructive, crop-raiding hooligans. Finally, baboons are not particularly good-looking—many other monkeys are far more photogenic. Indeed, through the ages baboons have evoked as much (if not more) repulsion than admiration.

Baboons are interesting, however, from a social perspective. Their groups number up to 100 individuals and are therefore considerably larger than most chimpanzee communities. Each animal maintains a complex network of social relationships with relatives and nonrelatives—relationships that are simultaneously cooperative and competitive. Navigating through this network would seem to require sophisticated social knowledge and skills. Moreover, the challenges that baboons confront are not just social but also ecological. Food must be found and defended, predators evaded and sometimes attacked. Studies of baboons in the wild, therefore, allow us to examine how an individual’s behavior affects her survival and reproduction. They also allow us to study social cognition in the absence of human training, in the social and ecological contexts in which it evolved.

This same theme, of baboons having a greater social/group IQ, is also touched upon by fellow ScientificBlogger Howard Bloom in a series of fascinating articles at the site where I also blog. Specifically, Bloom describes how baboons are smarter than chimpanzees in being able to adapt to any environment (a more plausible definition of intelligence than the usual anthropocentric one we are accustomed to).

The ultimate test of intelligence is adaptability—how swiftly you can solve a complex problem, whether that problem is couched in words, in images, in crises, or in everyday life. The arena where intelligence is most important is not the testing room, it’s the real world. When you measure adaptability by the ability to turn disasters into opportunities and wastelands into paradises, bacteria score astonishingly high. But how do big-brained chimpanzees and small-brained baboons do? Or, to put it differently, how adaptable, clever, mentally agile, and able to solve real-world problems have chimpanzees and baboons proven to be?

He illustrates this with a real-world field study that showed the high adaptability of baboons.

Baboons have been called “the rats of Africa.” No matter how badly you desecrate their environment, they find a way to take advantage of your outrage. One group, the Pumphouse Gang, was under study for years by primatologist Shirley Strum. When Strum began her baboon-watching, the Pumphouse Gang lived off the land in Kenya and ate a healthy, all-natural diet. They ate blossoms and fruits when those were in season. When there were no sweets and flowery treats, the baboons dug up roots and bulbs.

Then came disaster—the meddling of man. Farmers took over parts of the baboons’ territory, plowed it, built houses, and put up electrified fences around their crops. Worse, the Kenyan military erected a base, put up homes for the officers’ wives and kids, and trashed even more of the baboons’ territory by setting aside former baboon-land for a giant garbage heap. If this had happened to a patch of forest inhabited by chimps, the chimpanzee tribes would have been devastated. But not the baboons.

At first, the Pumphouse Gang maintained its old lifestyle and continued grubbing in the earth for its food. Then came a new generation of adolescents. Each generation of adolescent baboons produces a few curious, unconventional rebels. Normally a baboon troop splits up in small groups and goes off early in the day to find food. But one of the adolescent non-conformists of the Pumphouse Gang insisted on wandering by himself. His roaming took him to the military garbage dump. The baboon grasped a principle that chimps don’t seem to get. One man’s garbage is another primate’s gold. One man’s slush is another animal’s snow cone.

The baboon rebel found a way through the military garbage heap’s barbed wire fence, set foot in the trash heap, and tasted the throwaways. Pay dirt. He’d hit a concentrated source of nutrition. When they came back to their home base at the end of the day, the natural-living baboons, the ones who had stuck to their traditional food-gathering strategies, to their daily grind digging up tubers, came home dusty and bedraggled, worn out by their work. But the adolescent who invented garbage raiding came back energetic, rested, strong, and glorious. As the weeks and months went by, he seemed to grow in health and vigor. Other young adolescent males became curious. Some followed the non-conformist on his daily stroll into the unknown. And, lo, they too discovered the garbage dump and found it good.

Eventually, the males who made the garbage dump their new food source began to sleep in their own group, separated from the conservative old timers. As they grew in physical strength and robustness, these Young Turks challenged the old males to fights. The youngsters’ food was superior and so was their physical power. They had a tendency to win their battles. Females attracted by this power wandered outside the ancestral troop and spent increasing amounts of time with the rebel males—who continued to increase their supply of high-quality food by inventing ways to open the door latches of the houses of the officers’ wives, taught themselves how to open kitchen cupboards and pantries, and also invented ways to make their way through the electrified fences of farmers and gather armloads of corn. The health of the males and females in the garbage-picking group was so much better than that of the old troop that a female impregnated in the gang of garbage-pickers and farm-raiders was able to have a new infant every eighteen months. The females in the old, conservative, natural-diet group were stuck with a new infant only every 24 months. The innovators were not only humiliating the conservatives in pitched battles, they were outbreeding them.

I find the above anecdote very appealing. It seems we have a lot to learn from social species, and baboons may just be the ones we should look at more closely.

Basal Ganglia: action selection, error prediction and reinforcement learning

The December edition of the Dana Foundation’s online brain journal, Cerebrum, has a very informative and interesting piece on the role of the basal ganglia in response selection, error prediction, and reinforcement learning.

The article contains a primer on basic basal ganglia functions and pathways.

The basal ganglia are a collection of interconnected areas deep below the cerebral cortex. They receive information from the frontal cortex about behavior that is being planned for a particular situation. In turn, the basal ganglia affect activity in the frontal cortex through a series of neural projections that ultimately go back up to the same cortical areas from which they received the initial input. This circuit enables the basal ganglia to transform and amplify the pattern of neural firing in the frontal cortex that is associated with adaptive, or appropriate, behaviors, while suppressing those that are less adaptive. The neurotransmitter dopamine plays a critical role in the basal ganglia in determining, as a result of experience, which plans are adaptive and which are not.

Evidence from several lines of research supports this understanding of the role of basal ganglia and dopamine as major players in learning and selecting adaptive behaviors. In rats, the more a behavior is ingrained, the more its neural representations in the basal ganglia are strengthened and honed. Rats depleted of basal ganglia dopamine show profound deficits in acquiring new behaviors that lead to a reward. Experiments pioneered by Wolfram Schultz, M.D., Ph.D., at the University of Cambridge have shown that dopamine neurons fire in bursts when a monkey receives an unexpected juice reward. Conversely, when an expected reward is not delivered, these dopamine cells actually cease firing altogether, that is, their firing rates “dip” below what is normal. These dopamine bursts and dips are thought to drive changes in the strength of synaptic connections—the neural mechanism for learning—in the basal ganglia so that actions are reinforced (in the case of dopamine bursts) or punished (in the case of dopamine dips).
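The burst/dip pattern that Schultz observed is the signature of a reward prediction error. As a rough sketch (my own illustration, not from the article, with made-up function and parameter names), a Rescorla-Wagner style update captures the idea: dopamine firing tracks the gap between the reward received and the reward expected, and that gap drives learning until the reward is fully predicted.

```python
# Minimal reward-prediction-error sketch. The error is positive (a "burst")
# when the reward exceeds expectation, negative (a "dip") when it falls short.

def update_expectation(expected, reward, learning_rate=0.1):
    """Move the expectation toward the observed reward by a
    fraction of the prediction error (Rescorla-Wagner style)."""
    prediction_error = reward - expected  # burst if > 0, dip if < 0
    return expected + learning_rate * prediction_error

expected = 0.0
# An unexpected juice reward initially produces a large positive error...
for _ in range(50):
    expected = update_expectation(expected, reward=1.0)
# ...but once the reward is fully predicted, the error shrinks toward zero.
print(round(expected, 2))  # close to 1.0: the reward is now expected
```

Note how this one quantity reproduces both halves of the finding: an unexpected reward gives a burst, and omitting a now-expected reward would give a dip (error of roughly -1).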

In particular, it discusses the role of dopaminergic receptors in the “Go” and “NoGo” pathways, which are involved in positive and negative reinforcement learning respectively.

Building on a large body of earlier theoretical work, my colleagues and I developed a series of computational models that explore the role of the basal ganglia when people select motor and cognitive actions. We have been focusing on how dopamine signals in the basal ganglia, which occur as a result of positive and negative outcomes of decisions (that is, rewards and punishments), drive learning. This learning is made possible by two main types of dopamine receptors, D1 and D2, which are associated with two separate neural pathways through the basal ganglia. When the “Go” pathway is active, it facilitates an action directed by the frontal cortex, such as touching your pinkies together. But when the opposing “NoGo” pathway is more active, the action is suppressed. These Go and NoGo pathways compete with each other when the brain selects among multiple possible actions, so that an adaptive action can be facilitated while at the same time competing actions are suppressed. This functionality can allow you to touch your pinkies together, not perform another potential action (such as scratching an itch on your neck), or to concentrate on a math problem instead of daydreaming.

But how does the Go/NoGo system know which action is most adaptive? One answer, we think (and as you might have guessed), is dopamine. During unexpected rewards, dopamine bursts drive increased activity and changes in synaptic plasticity (learning) in the Go pathway. When a given action is rewarded in a particular environmental context, the associated Go neurons learn to become more active the next time that same context is encountered. This process depends on the D1 dopamine receptor, which is highly concentrated in the Go pathway. Conversely, when desired rewards are not received, the resulting dips in dopamine support increases in synaptic plasticity in the NoGo pathway (a process that depends on dopamine D2 receptors concentrated in that pathway). Consequently, these nonrewarding actions will be more likely to be suppressed in the future.
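The D1/D2 division of labor can be sketched as a toy opponent model (my own illustration, not the authors’ published model; the class and attribute names are made up): each action carries a Go weight strengthened by dopamine bursts and a NoGo weight strengthened by dips, and the difference between the two determines whether the action is facilitated or suppressed.

```python
# Toy opponent-pathway model: bursts (positive outcomes) train the
# D1/Go weight, dips (negative outcomes) train the D2/NoGo weight.

class GoNoGoAction:
    def __init__(self, alpha_go=0.2, alpha_nogo=0.2):
        self.go = 0.0        # D1 / Go pathway weight (facilitation)
        self.nogo = 0.0      # D2 / NoGo pathway weight (suppression)
        self.alpha_go = alpha_go
        self.alpha_nogo = alpha_nogo

    def net(self):
        # Net tendency to emit the action: facilitation minus suppression.
        return self.go - self.nogo

    def learn(self, outcome):
        """outcome: +1 for reward (dopamine burst), -1 for no reward (dip)."""
        if outcome > 0:
            self.go += self.alpha_go * outcome        # burst strengthens Go
        else:
            self.nogo += self.alpha_nogo * (-outcome)  # dip strengthens NoGo

rewarded = GoNoGoAction()
punished = GoNoGoAction()
for _ in range(10):
    rewarded.learn(+1)   # consistently rewarded action
    punished.learn(-1)   # consistently unrewarded action
print(rewarded.net() > 0, punished.net() < 0)  # True True
```

Selecting the action with the highest net value then facilitates the rewarded option while actively suppressing, rather than merely ignoring, the competing ones.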

It then goes on to consider the different types of learners: positive learners, who have a more active Go system, and negative learners, who have a more active NoGo system.

This theoretical framework, which integrates anatomical, physiological, and psychological data into a single coherent model, can go a long way in explaining changes in learning, memory, and decision making as a function of changes in basal ganglia dopamine. In particular, this model makes a key, previously untested, prediction that greater amounts of dopamine (via D1 receptors) support learning from positive feedback, whereas decreases in dopamine (via D2 receptors) support learning from negative feedback.

They then experimentally manipulated dopamine levels and verified their predictions. The experiment involved a simple game in which two symbols, say A and B, were consistently presented together as a pair (alongside other pairs, such as ‘CD’), with subjects required to choose one of them. After each choice, the subject was given feedback as to whether the choice had been rewarded or punished. This feedback was only probabilistically related to the choice: choosing ‘A’ yielded positive feedback 80% of the time, while choosing ‘B’ yielded negative feedback 80% of the time. Thus, though subjects would implicitly learn to choose A and avoid B, this rule would not be learned explicitly. Now comes the interesting part: the choose-A strategy reflects positive (reward-driven) learning, while the avoid-B strategy reflects negative (punishment-driven) learning. When, in a test phase, A and B were paired with new symbols, say E and F respectively, subjects should have pursued the choose-A and avoid-B strategies with equal inclination. Yet administering dopamine-affecting drugs had dramatic effects.

We found a striking effect of the different dopamine medications on this positive versus negative learning bias, consistent with predictions from our computer model of the learning process. While on placebo, participants performed equally well at choose-A and avoid-B test choices. But when their dopamine levels were increased, they were more successful at choosing the most positive symbol A and less successful at avoiding B. Conversely, lowered dopamine levels were associated with the opposite pattern: worse choose-A performance but more-reliable avoid-B choices. Thus the dopamine medications caused participants to learn more or less from positive versus negative outcomes of their decisions.
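This medication effect can be reproduced in a rough simulation (my own sketch, not the authors’ code; the function name, learning rates, and seed are arbitrary choices): give the learner separate learning rates for positive and negative prediction errors, and the dopamine manipulation becomes an asymmetry between them.

```python
import random

# A is rewarded 80% of the time, B only 20%. "High dopamine" is modeled
# as learning more from positive than negative outcomes; "low dopamine"
# is the reverse bias.

def train(alpha_pos, alpha_neg, trials=500, seed=0):
    rng = random.Random(seed)
    value = {"A": 0.0, "B": 0.0}
    for _ in range(trials):
        for stim, p_reward in (("A", 0.8), ("B", 0.2)):
            outcome = 1.0 if rng.random() < p_reward else -1.0
            error = outcome - value[stim]
            # Asymmetric learning rates stand in for the drug manipulation.
            alpha = alpha_pos if error > 0 else alpha_neg
            value[stim] += alpha * error
    return value

high_da = train(alpha_pos=0.3, alpha_neg=0.05)  # positive learner ("on drug")
low_da = train(alpha_pos=0.05, alpha_neg=0.3)   # negative learner ("off drug")

# Positive learners represent A's goodness more strongly; negative
# learners represent B's badness more strongly - mirroring the
# choose-A vs avoid-B dissociation reported in the study.
print(high_da["A"] > low_da["A"], low_da["B"] < high_da["B"])
```

In the test phase, a strongly positive value for A supports choosing it over novel symbols, while a strongly negative value for B supports avoiding it, so the learning-rate asymmetry alone yields the observed choose-A/avoid-B dissociation.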

They then go on to apply these results to Parkinson’s patients. Parkinson’s patients have deficits in basal ganglia dopamine levels; the standard medication is L-dopa, a dopamine precursor that acts by increasing dopamine in the basal ganglia. The authors hypothesized that people with untreated Parkinson’s would be negative learners (less dopamine, so weaker bursts but intact dips), while those on medication would be positive learners (more dopamine, so stronger bursts but blunted dips).

To test this idea, we presented people with Parkinson’s disease with the same choose-A/avoid-B learning task once while they were on their regular dose of dopamine medication and another time while off it. Consistent with what we predicted, we found that, indeed, patients who were off the medication were relatively impaired at learning to choose the most positive stimulus A, but showed intact or even enhanced learning of avoid-B. Dopamine medication reversed this bias, improving choose-A performance but impairing avoid-B. This discovery supports the idea that medication prevents dopamine dips during negative feedback and impairs learning based on negative feedback.

This notion might explain why some medicated Parkinson’s patients develop pathological gambling behaviors, which could result from enhanced learning from gains together with an inability to learn from losses.

I have touched on the above (gambling in those on dopaminergic medication) earlier as well, in relation to psychosis and schizophrenia, where dopamine excess is suspected. In those cases, a consistently high dopamine level may predispose toward positive behavioral and cognitive learning. The latter may underlie the manic loop, whereby only positively rewarded cognitions become salient, leading to a rosy picture of the universe, while negatively reinforced cognitions are not properly registered, learned, or remembered.

They then go on to discuss other implications, such as ADHD, wherein the total noise in dopamine neurons may be higher, leading to lowered positive and negative learning alike (my conjecture, not the authors’), and addiction.

Overall a very fascinating article indeed.