I have recently blogged a bit about action-selection and operant learning, emphasizing that the action one chooses, out of the many possible, is the one that maximizes the utility function over the set of possible actions. A quick read of the last few posts may help you appreciate where I am coming from.
To recap, whenever an organism makes a decision to indulge in an act (an operant behavior), there are many possible actions from which it has to choose the most appropriate one. Each action leads to a possibly different outcome, and the organism may value the outcomes differentially. This valuation may be objective (how much the organism actually ‘likes’ the outcome once it happens) or subjective (how keenly the organism ‘wants’ the outcome to happen, independent of whether the outcome is pleasurable or not). Also, it is never guaranteed that the action will produce the desired or expected outcome; there is always some probability that the act will fail to result in it. On a macro level, the organism may also lack the energy required to indulge in the act or to carry it through to completion. Mathematically, with each action one can associate a utility U = E x V, where U is the utility of the act; E is the expectancy that one would be able to carry out the act and, if so, that the act would result in the desired outcome; and V is the value (both subjective and objective) that one has assigned to the outcome. The problem of action-selection is then simply to compare the utilities of the n different acts available and to choose the action with maximum utility.
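As a rough illustration of the U = E x V rule just described, here is a minimal Python sketch. The `Action` class, the function names, the example actions, and all the numbers are hypothetical, chosen only to show the arithmetic; nothing here is taken from the literature.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    expectancy: float  # E: chance the act can be carried out AND yields the outcome (0 to 1)
    value: float       # V: value assigned to the outcome (arbitrary units, assumed)


def utility(action: Action) -> float:
    """U = E x V for a single candidate action."""
    return action.expectancy * action.value


def select_action(actions: list[Action]) -> Action:
    """Pick the candidate with the highest utility."""
    return max(actions, key=utility)


# Hypothetical candidates, numbers invented for illustration.
candidates = [
    Action("forage nearby", expectancy=0.9, value=2.0),
    Action("hunt far away", expectancy=0.3, value=8.0),
    Action("rest", expectancy=1.0, value=1.0),
]

best = select_action(candidates)
print(best.name, round(utility(best), 2))
```

Note that the risky act wins here only because its outcome is valued highly enough to offset its low expectancy; lower its value and the safer act takes over.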
Today I had an epiphany: doesn’t the same logic apply to allocating attention to the various stimuli that bombard us? Assuming a spotlight view of attention, and assuming that attentional resources are limited, one is constantly faced with the problem of deciding which stimuli in the world are salient and need to be attended to. The leap I am making is that attention-allocation, just like choosing to act volitionally, is an operant, pro-active process rather than a reactive one. It may be unconscious, but it still involves volition and ‘choosing’. Remember that even acts can be reactive, so there is room for reactive attention; what I am proposing is that the majority of attention is pro-active: actively choosing between stimuli and focusing on one to try to better predict the world. We are basically prediction machines that want to predict beforehand the state of the world that is most relevant to us, and this we do by classical or Pavlovian conditioning. We try to associate stimuli (CS) with stimuli (UCS) or responses (UCR), and thus try to ascertain what the state of the world at time T would be, given that the stimulus (CS) has happened. Apart from being prediction machines, we are also agents that try to maximize rewards and minimize punishments by acting on this knowledge and interacting with the world. There are thousands of actions we can indulge in, but we choose wisely; there are thousands of stimuli in the external world, but we wisely attend to the salient ones.
Let me elaborate on the analogy. While selecting an action we maximize reward and minimize punishment; basically, we choose the action that maximizes the utility function. While choosing which stimuli to attend to, we maximize our foreknowledge of the world and minimize surprises; basically, we choose the stimulus that maximizes a predictability function. We can even write an equivalent mathematical formula: Predictability P = E x R, where P is the increase in predictability due to attending to stimulus 1; E is the probability that stimulus 1 correctly leads to a prediction of stimulus 2; and R is the relevance of stimulus 2 (information) to us. Thus the stimulus one would attend to is the one that leads to the maximum gain in predictability. Also, just as the general energy level of the organism biases whether, and how much, it acts, there is a general arousal level of the organism that biases whether, and how much, it attends to stimuli.
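The attention side of the analogy can be sketched the same way. In the snippet below, P = E x R comes straight from the formula above, but the way arousal enters is purely my assumption: I treat it as a multiplicative gain plus a cut-off below which nothing gets attended. The function names, the `threshold` parameter, the stimuli, and the numbers are all hypothetical.

```python
def predictability_gain(expectancy: float, relevance: float) -> float:
    """P = E x R: expected gain in predictability from attending to a stimulus."""
    return expectancy * relevance


def allocate_attention(stimuli, arousal=1.0, threshold=0.5):
    """Return the name of the stimulus to attend to, or None if nothing clears the bar.

    stimuli: dict mapping a stimulus name to an (E, R) pair.
    arousal: ASSUMED to scale every gain equally (a modelling choice, not a finding).
    threshold: ASSUMED minimum scaled gain needed to attend at all.
    """
    if not stimuli:
        return None
    scored = {
        name: arousal * predictability_gain(e, r)
        for name, (e, r) in stimuli.items()
    }
    best = max(scored, key=scored.get)
    return best if scored[best] >= threshold else None


# Hypothetical stimuli: (E, R) values invented for illustration.
stimuli = {
    "rustle in grass": (0.6, 5.0),   # moderately reliable cue, highly relevant
    "bird song": (0.9, 1.0),         # reliable cue, barely relevant
    "distant thunder": (0.8, 2.0),
}

print(allocate_attention(stimuli))               # alert organism
print(allocate_attention(stimuli, arousal=0.1))  # drowsy organism
```

With these made-up numbers the alert organism attends to the rustle, while the drowsy one attends to nothing, which is the "whether, and how much" bias in its crudest form.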
So, what new insights do we gain from this formulation? The first insight comes from elaborating the analogy further. We know that the basal ganglia in particular, and dopamine in general, are involved in action-selection. Dopamine is also heavily involved in operant learning. We can predict that dopamine systems, and the same underlying mechanisms, may also be used for attention-allocation, and that dopamine may be heavily involved in classical learning as well. Moreover, the basic computations and circuitry involved in allocating attention should be similar to those involved in action-selection. The two disciplines can learn from each other and use methods developed in one field to understand and elaborate phenomena in the other. For example, we know that dopamine, while coding for reward-error/incentive salience, also codes for novelty and is heavily involved in novelty detection. Is novelty detection driven by the need to avoid surprises, especially while allocating attention to a novel stimulus?
What are some of the predictions we can make from this model? Just as there is abundant literature on U = E x V in decision-making and action-selection, we should be able to show the independent and interacting effects of Expectancy and Relevance on the attention-grabbing properties of a stimulus. The relevance of different stimuli can be manipulated by pairing them with a UCS/UCR that has different degrees of relevance. The expectancy can be manipulated through the strength of conditioning: more trials would mean a stronger association between the CS and the UCS. The level of arousal may also bias the ability to attend to stimuli. I am sure that attention research has much to learn from research on decision-making and action-selection, and that the reverse is also true. It may even be that attention-allocation is already conceptualized in these terms; if so, I plead ignorance of this sub-field and would love to get a few pointers so that I can refine my thinking and framework.
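To make the proposed manipulation concrete, here is a toy simulation of a 2 x 2 design (few vs. many conditioning trials, low vs. high relevance). To turn trial count into an expectancy I borrow a simple saturating update in the style of the Rescorla-Wagner rule; that is my substitution for illustration, not something the argument above depends on. The learning rate, trial counts, and relevance values are invented.

```python
def association_strength(n_trials: int, learning_rate: float = 0.2) -> float:
    """CS-UCS association after n pairings, using a Rescorla-Wagner-style update.

    Each trial moves the strength a fixed fraction of the way toward its
    ceiling of 1.0, so it grows with trials and saturates.
    """
    strength = 0.0
    for _ in range(n_trials):
        strength += learning_rate * (1.0 - strength)
    return strength


def predicted_attention(n_trials: int, relevance: float) -> float:
    """P = E x R, with E stood in for by the learned association strength (assumed)."""
    return association_strength(n_trials) * relevance


# Hypothetical 2 x 2 design: conditioning trials x relevance of the paired UCS.
design = {
    (trials, relevance): predicted_attention(trials, relevance)
    for trials in (2, 10)
    for relevance in (1.0, 4.0)
}

for (trials, relevance), p in sorted(design.items()):
    print(f"trials={trials:2d} relevance={relevance:.1f} predicted P={p:.2f}")
```

Because the model is multiplicative, it predicts an interaction and not just two main effects: the boost from higher relevance should be larger for the strongly conditioned stimulus than for the weakly conditioned one.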
Also consider that there is already some literature implicating dopamine in attention; the fact that dopamine dysfunction in schizophrenia, ADHD, etc. has cognitive and attentional implications is an indication in itself. Likewise, the contextual salience of drug-related cues in addicted individuals may be a powerful effect of dopamine-based classical conditioning and attention allocation hijacking the normal dopamine pathways.
Lastly, I got set on this direction while reading an article on the chaining of actions to get desired outcomes, and on how two different brain systems may be involved in deciding which action to select: a cognitive, prefrontal high road based on model-based reinforcement learning, and an unconscious, dorsolateral striatal low road based on model-free reinforcement learning. I believe that the same conundrum would present itself when one turns to the attention-allocation problem, where stimuli are chained together and predict each other in succession; I would predict that there would be two roads involved here too! But that is a matter for a future post. For now, I would love some honest feedback on what value, if any, this new conceptualization adds to what we already know about attention allocation.