Archive for December, 2006
The evolutionary trajectory of color vision
In a recent comment, a reader of this blog raised some concerns about an evolutionary trajectory of color vision that I had proposed alongside the evolution of color terms in human languages, as a possible explanation for the linguistic trend. At that time, I had proposed the evolutionary scenario without duly investigating existing theories of color vision evolution, as my post had more of a linguistic and developmental focus and the evolutionary conjecture was just that: a conjecture which, if found true, would lend more credence to my linguistic trend. Thanks to Andreas, I reviewed the literature on color vision evolution and was surprised to find some support for my theorization.
Before I discuss color vision evolution, I strongly recommend reading two posts, one on the evolution of color vision and one on the evolution of retinal structures (more of an avian focus here), to get some basic familiarity with the retinal structures involved in color vision and how they might have evolved.
To recap,
An animal has color vision if it has the capability of discriminating lights (scattered light as well as light sources) on the basis of the lights’ spectral content, even when those lights are of equal subjective brightness.
The front end requirement for such a system is that the animal must have at least two different spectral classes of receptor, where each class is defined by the sensitivity of the receptor to light as a function of wavelength.

The above succinctly defines what we usually mean by color vision. You can either have dichromatic color vision, where you have two differently tuned receptors to detect different light wavelengths and the different signal combinations from these receptors yield different hues; or you can have trichromatic / tetrachromatic vision, where three/four independent color signals are combined to yield an entire hue range. Anyone familiar with the RGB color system used in computers will note that it is based on the assumption of 3 pure colors, which can be mixed in different amounts to yield most of the color hues we see on the monitor. Pigeons, and birds in general, have tetrachromatic color vision.
Now for some basic visual circuitry:
The retinal structures involved in vision in mammals are photoreceptors (classified as cones and rods) and horizontal, bipolar, amacrine and ganglion cells.
However, for all vertebrates (mammals as well as reptiles and birds), and for invertebrates as well, the receptor mechanism is conserved and basically the same, so we will discuss that first:
The first step in the transduction of light energy to a neural signal is the light-induced isomerization (change of shape) of a chromophore, specifically a vitamin A derivative. Each chromophore is bound to a membrane protein called an opsin. The main function of the opsin is to change shape after light absorption triggers the isomerization of the chromophore: the opsin is an enzyme that is activated by the chromophore’s isomerization. However, because of the linkage between the opsin and the chromophore, the opsin also serves to tune the wavelength dependence of the light induced isomerization reaction in the chromophore. That is, the chromophore’s sensitivity to light at a given wavelength is established in part by the opsin–different opsins (i.e. opsins with different amino acid sequences) bound to identical chromophores will have different absorption probabilities at each wavelength. The result is that photoreceptors which express the gene for only one type of opsin will form a different class than photoreceptors that express a gene coding for a different opsin. Although there are other mechanisms that animals could use to differentiate photoreceptor classes (most notably some animals use more than one chromophore, and many vertebrates have colored oil droplets that screen individual receptors) it seems that the expression of only one of their possible opsin coding genes in each receptor is the mechanism that all animals use.

The above clarifies that, in mammals, we associate color vision with cones: specialized photoreceptors that contain a single pigment and are responsive to a single wavelength range. In reptiles, we also have double cones, wherein two photopigments/receptors are part of the same cell, and there are other mechanisms, like oil droplets, that are also involved in color vision (but thankfully not in mammals). Rods are also a type of receptor, tuned to a particular wavelength range, but we normally do not associate rods with color vision, because they are usually used for night vision and their signals are not combined to create the color hue; yet a limited form of monochromatic color vision is possible by having a combination of one rod and one cone receptor type.
Next we need to differentiate between the rhabdomeric eyes of invertebrates (based on r-opsins) and the ciliary eyes of vertebrates (based on c-opsins). Pharyngula does an excellent job here.
Eyes can be further categorized as rhabdomeric or ciliary by the nature of the cellular elements that make up the photoreceptors, by the kind of opsin molecule used to transduce the light signal, and by the signaling pathway used to convert a conformation change of the opsin molecule into a change in the electrical potential across the cell membrane.
As many accounts of color vision evolution focus on the phylogenetic trees of opsin gene evolution to make their case, it is important to distinguish between the levels of analysis. All the known opsin genes can be classified into seven sub-families: two of these, the r-opsin and c-opsin families, are pertinent to, and expressed in, the photoreceptors found in invertebrates and vertebrates respectively.

Thus, if one wants to focus on mammalian color vision evolution, one needs to focus mostly on c-opsins. Many studies have been conducted on these, and the phylogenetic data indicate that the vertebrate opsins too form a neat tree, with five sub-families relevant for (color) vision and 3 other sub-families having non-visual functions.
Thus, in vertebrates we have five types of opsins: one rhodopsin type, expressed in rods, and four chromatic types (detecting Red, Blue, Green and U/V colors), expressed in cones.
One should pause here and note that the human S(short) or blue receptor actually belongs to the U/V (S) family; while the human L (red) and M (green) receptors both belong to the Red (L) family.
These 5 opsin families (Red, Green, Blue, U/V and Rhodopsin) have been variously characterized as (L, ML, MS, S and Rh) or as (LWS, RH2, SWS2, SWS1 and RH1).
With this background information, we can now go straight to the heart of the problem: the evolutionary trajectory of these different receptors / opsins and how color vision evolved in humans. I’ll limit the discussion here first to mammals and then to primates, as my original thesis, that color term evolution follows color vision evolution, requires the analysis to happen only in the time frame in which linguistic abilities make sense. Assuming some proto-language in apes and primates, it is reasonable to expect that whatever sequence of color terms we see in languages would reflect the successive levels of color vision as experienced by primates, and would be independent of how color was perceived in invertebrates. (I’m sure no one contends that the color terms of human languages should capture the early chromatic experiences of invertebrates.)
Although I do not buy the bottleneck theory of mammal evolution lock, stock and barrel, I believe we can take it as a reasonable starting point. It posits that mammals were reduced to being nocturnal burrowing species during the age of the dinosaurs and thus were reduced to having just the rods, and lost the earlier cones, double cones and oil droplets that reptiles still have. In any case, in mammals, rods seem to be older and more conserved than cones. (Pat on the back: one claim originally made, defended to satisfaction!!)
Amongst vertebrates, the rod opsin seems to be the most conserved; cone opsins have arisen principally by duplication and subsequent mutation of the rod opsin gene.
Also,
Which of the two primary classes, rods or cones, is the ancestral photoreceptor? Given the tremendous variation seen in photoreceptors across vertebrate and invertebrate species, this is not an easy question to answer based on simple phylogenetic assumptions. In addition, it is often difficult to clearly distinguish certain rod and cone types from each other, or classify them into one or the other category. Rods appear to be relatively more conserved in vertebrates in terms of pigments and structure than cones, and therefore could be considered the more ancestral form. However, rods in some respects are more morphologically complex than cones, having developed extreme sensitivity (capable of detecting as little as one photon of light).
Now, coming to the evolution (or re-evolution) of the cones, or the chromatic system, in mammals, it is instructive to pause here and note that having three cone types does not necessarily mean that two species will have the same qualia of color hues.
If we restrict ourselves to animals which have the same number of receptor classes, might we expect that their color vision systems are equivalent? The answer is a resounding no. Let’s compare the color vision systems of two animals that both have three photopic (e.g. active under bright illumination) photoreceptor classes. One is the human, the other is the honey bee (specifically the worker–I don’t know how the other castes are endowed). Does anybody here think that what a bee sees when it looks at a rainbow has the same appearance as what we see? We’ll ignore optical polarization (which the bee is sensitive to and we’re not) and focus on what we can infer about “color” based on, among other things, our knowledge of the bee’s receptor classes. To begin with, at the inside of the rainbow where the violet-appearing light fades off to invisibility for us, the bee will still see more rainbow. On the outside, where we see red, the bee would see nothing for although bees have an ability to see what for us is UV, we have the ability to see what bees might call infrared.
Also, it is instructive to note here how a higher level of chromatic vision (dichromatic, for instance) may arise from a lower level of chromatic vision (monochromatic, in this example). Although, along with the photoreceptors, we will need additional supporting neural wiring in both the retina and the brain for opponent-processing mediated color perception to take place, we will restrict the discussion to the emergence of a new photoreceptor.
A new photoreceptor may come into existence through duplication and subsequent polymorphism of an existing receptor (opsin) gene. The new receptor would have a slightly different frequency sensitivity than the original receptor and, by selectively expressing these two genes in different receptors, we can have two types of receptors. By processing and combining the two types of signals, one can now get dichromatic vision from the original monochromatic vision.
Much confusion in primate color vision evolution stems from taking other mammals, like dogs, and their blue-yellow world as the baseline from which to start. It should be emphasized that even though dogs may currently have two receptors, tuned to detect blue and yellow, we cannot conclude from that anything about humans or ancient ancestral mammals. In the human ancestral lineage, the dichromatic phase may have involved Red-Green perception. This is evident from the bee-human trichromatic example given above.
A very good paper summarizing the latest research on primate color vision evolution concludes that there are five types of primate color vision systems, beginning with a monochromatic (L opsin only) system in nocturnal primates and ending with an S + M + L (multiple copies) trichromatic system in humans.
It is interesting to note here that the human Green opsin evolved by duplication and subsequent mutation of the Red opsin gene present on the X chromosome. From the hierarchy of primate color systems, it is reasonable to conclude that initially, when we were nocturnal primates, we had a dysfunctional S-opsin gene and a functional L gene, conferring on us the ability to perceive the red qualia to some extent.
In diurnal prosimians, the S gene becomes functional, and they have two qualia: those of red and blue.
In the New World monkeys, the L gene is polymorphic. It is on the X chromosome and, as explained in the paper, if there are two alleles of the L gene that encode pigments tuned to slightly different frequencies, then females, having two X chromosomes, can carry both alleles; the males, meanwhile, have only one X chromosome, so they can have only one of the alleles at a time. Through X chromosome inactivation, each cell of a female New World monkey will express only one of the alleles; but different cells may express different alleles, and thus the females may have 3 types of receptors (one S type and two L types), endowing them with trichromatic vision. The males, meanwhile, will have dichromatic vision but, as the gene is polymorphic, we will see differences in their dichromatic perceptions. This is exactly what is observed.
The Old World monkeys have the full apparatus for trichromatic vision, with one S and two L genes. The second L (or rather M, as it detects green) gene was formed by duplication and subsequent mutation of the L gene that detected red. Thus, they had the qualia of Red, Blue and Green.
Lastly, humans are more or less the same as Old World monkeys, but their L gene shows polymorphisms. This has the effect of making some females tetrachromatic (these polymorphisms can yield an extra cone type only in females, as only they have two copies of the X chromosome), and it seems that, by a fortuitous duplication, we might get a fourth cone type in all humans. Till then, these polymorphisms will explain some of the color perception differences that we may exhibit.
Suffice it to say, that the evolution of color terms should follow the same trajectory- with Black and White (rod based) color terms preceding Red, Blue, Green and Yellow color terms.
A final note of caution: receptor types alone do not guarantee that the qualia experienced would change. In an experiment in which mice were endowed with human pigments, they still could not learn to distinguish Red, presumably because the later opponent-processing wiring required for generating that qualia was not present or could not develop.
That’s all for now. Hope you found this post eye-opening!! Do let me know via comments of any incompatible or recent evidence and arguments.
Abstract vs Concrete: the two genders? (the categorization debate)
In my previous posts I have focussed on distinctions in cognitive styles based on figure-ground, linear-parallel, routine-novel and literal-metaphorical emphasis.
There is another important dimension on which cognitive styles differ and I think this difference is of a different dimension and mechanism than the figure-ground difference that involves broader and looser associations (more context) vs narrow and intense associations (more focus). One can characterize the figure-ground differences as being detail and part-oriented vs big picture orientation and more broadly as analytical vs synthesizing style.
The other important difference pertains to whether associations, and hence knowledge, are mediated by abstract entities or whether associations, knowledge and behavior are grounded in concrete entities/experiences. One could summarize this as follows: whether the cognitive style is characterized by abstraction or by a particularization bias. One could even go a step further and pit an algorithmic learning mechanism against one based on heuristics and pragmatics.
It is my contention that the bias towards abstraction would be greater for Males and the left hemisphere and the bias towards Particularization would be greater for Females and the right hemisphere.
Before I elaborate on my thesis, the readers of this blog need to get familiar with the literature on categorization and the different categorization/concept formation/ knowledge formation theories.
An excellent resource is a four article series from Mixing Memory. I’ll briefly summarize each post below, but you are strongly advised to read the original posts.
Background: Most of the categorization efforts are focussed on classifying and categorizing objects, as opposed to relations or activities, and on the representation of such categories (concepts) in the brain. Objects are supposed to be made up of a number of features. An object may have a feature to varying degrees (it’s not necessarily a binary has/doesn’t-have type of association; one feature may be tall, and the feature strength may vary depending on the actual height).
The first post is regarding the classical view of concepts as being definitional or rule-bound in nature. This view proposes that a category is defined by a combination of features and that these features are of a binary nature (one either has a feature or does not have it). Only those objects that have all the features of the category belong to that category. The concept (representation of the category) can be stored as a conjunction rule. Thus, the concept of bachelor may be defined as having the features Male, single, human and adult. To determine the classification of a novel object, say, Sandeep Gautam, one would subject that object to the bachelor category rule and calculate the truth value. If all the conditions are satisfied (i.e. Sandeep Gautam has all the features that define the category bachelor), then we may classify the new object as belonging to that category.
Thus,
Bachelor(x)= truth value of (male(x))AND(adult(x))AND(single(x))AND(human(x))
Thus a concept is nothing but a definitional rule.
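As a minimal sketch of the classical view, the bachelor rule above can be written as a plain conjunction of binary features. The `Entity` class, the `is_bachelor` function and the two sample objects are hypothetical names and values chosen only for illustration; they are not taken from the Mixing Memory posts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """An object described by binary (has / does not have) features."""
    name: str
    male: bool
    adult: bool
    single: bool
    human: bool


def is_bachelor(x: Entity) -> bool:
    """Classical-view concept: a conjunction rule over binary features."""
    return x.male and x.adult and x.single and x.human


# Hypothetical objects with made-up feature values.
candidate = Entity("novel object", male=True, adult=True, single=True, human=True)
married_man = Entity("married man", male=True, adult=True, single=False, human=True)

print(is_bachelor(candidate))
print(is_bachelor(married_man))
```

Under this view membership is all-or-nothing: an object missing any one defining feature falls outside the category, however similar it is otherwise.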
The second and third posts are regarding the similarity-based approaches to categorization. These may also be called the clustering approaches. One visualizes the objects as spread in a multi-dimensional feature space, with each dimension representing the various degrees to which a feature is present. The objects in this n-dim space that are close to each other, and are clustered together, are considered to form one category, as they would have similar feature values. In these views, the distance between objects in this n-dim feature space represents their degree of similarity. Thus, the closer the objects are, the more likely it is that they are similar and the more likely it is that we can label them as belonging to one category.
To take an example, consider a 3-dim space with one dimension (x) signifying height, the other (y) signifying color, and the third (z) signifying attractiveness. Suppose we rate many Males along these dimensions and plot them in this 3-d space. Then we may find that some males have high values of height (Tall), color (Dark) and attractiveness (Handsome), cluster in the upper-right region of the 3-d space, and thus define a category of Males that can be characterized as the TDH/cool hunk category (a category that is most common in Mills & Boon novels). Other males may meanwhile cluster around a category that is labeled squats.
There are some more complexities involved, like assigning weights to a feature in relation to a category, and thus skewing the similarity-distance relationship by making it dependent on the weight (or importance) of the feature to the category under consideration. In simpler terms, not all dimensions are equal, and the distance at which two objects are classified as similar (belonging to a cluster) may differ based on the dimension under consideration.
There are two variations of the similarity-based or clustering approaches. Both have a similar classification and categorization mechanism, but they differ in the representation of the category (concept). The category, it is to be recalled, is in both cases determined by the various objects that have clustered together. Thus, a category is a collection or set of such similar objects. The differences arise in the representation of that set.
One can represent a set of data by its central tendencies. Some such central tendencies, like the Mean, represent an average value of the set, and are an abstraction in the sense that no particular member may have that particular value. Others, like the Mode or the Median, do signify a single member of that set, which is either the most frequent one or the middle one in an ordered list. When the discussion of central tendencies is extended to pairs or triplets of values, or to n-tuples (signifying an n-dim feature space), the concept of mode or median becomes more problematic, and a measure based on them may also become abstract and no longer remain concrete.
The other thing one needs is an idea of the distribution of the set’s values. With the Mean, we also have an associated Variance, again an abstract parameter, which signifies how much the set values are spread around the Mean. In the case of the Median, one can resort to percentile values (10th percentile etc.) and thus have concrete members representing the spread of the data set.
It is my contention that the prototype theories rely on abstraction and averaging of data to represent the data set (categories), while the Exemplar theories rely on particularization and representativeness of some member values to represent the entire data set.
Thus, supposing that in the above TDH Male classification task, we had 100 males belonging to the TDH category, then a prototype theory would store the average values of height, color and attractiveness for the entire 100 TDH category members as representing the TDH male category.
On the other hand, an exemplar theory would store the particular values for the height, color and attractiveness ratings of 3 or 4 Males belonging to the TDH category as representing the TDH category. These 3 or 4 members of the set, would be chosen on their representativeness of the data set (Median values, outliers capturing variance etc).
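The contrast between the two representations can be sketched for a single feature. This is only an illustration under stated assumptions: the height ratings are made up, and the helper names `prototype_of` and `exemplars_of` are hypothetical. The exemplar summary here keeps the lowest, median and highest members, which is just one possible way of choosing representative members.

```python
from statistics import mean, median_low


def prototype_of(values):
    """Prototype-style summary: the average, which may match no actual member."""
    return mean(values)


def exemplars_of(values):
    """Exemplar-style summary: a few actual members (lowest, median, highest)."""
    ordered = sorted(values)
    return [ordered[0], median_low(ordered), ordered[-1]]


# Hypothetical height ratings (in cm) for five members of the TDH category.
tdh_heights = [172, 180, 185, 191, 199]

prototype = prototype_of(tdh_heights)
exemplars = exemplars_of(tdh_heights)

print(prototype)   # an abstract value; no member has exactly this height
print(exemplars)   # concrete members of the category
```

The prototype needs one stored number per feature, while the exemplar summary stores several real members and so retains some information about the spread.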
Thus, the second post of Mixing Memory discusses the Prototype theories of categorization, which posit that we store average values of a category set to represent that category.
Thus,
Similarity will be determined by a feature match in which the feature weights figure into the similarity calculation, with more salient or frequent features contributing more to similarity. The similarity calculation might be described by an equation like the following:
$$S_j = \sum_i w_i \, v(i, j)$$
In this equation, $S_j$ represents the similarity of exemplar j to a prototype, $w_i$ represents the weight of feature i, and $v(i, j)$ represents the degree to which exemplar j exhibits feature i. Exemplars that reach a required level of similarity with the prototype will be classified as members of the category, and those that fail to reach that level will not.
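A minimal sketch of this weighted feature match, assuming feature values and weights on a 0 to 1 scale. The function names, the weights, the threshold of 0.7 and the two rated males are all hypothetical values picked for illustration.

```python
def prototype_similarity(weights, values):
    """Weighted sum over features: sum of w_i * v(i, j) for one exemplar j."""
    if len(weights) != len(values):
        raise ValueError("weights and values must cover the same features")
    return sum(w * v for w, v in zip(weights, values))


def is_member(weights, values, threshold):
    """An exemplar joins the category if its similarity reaches the threshold."""
    return prototype_similarity(weights, values) >= threshold


# Hypothetical weights for (height, color, attractiveness) in the TDH category.
tdh_weights = [0.5, 0.3, 0.2]

# Hypothetical degrees to which two males exhibit each feature.
tall_dark_handsome = [0.9, 0.8, 1.0]
someone_else = [0.2, 0.5, 0.3]

print(is_member(tdh_weights, tall_dark_handsome, threshold=0.7))
print(is_member(tdh_weights, someone_else, threshold=0.7))
```

Because the weights differ, a shortfall on a heavily weighted feature (height here) costs more similarity than the same shortfall on a lightly weighted one.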
The third post discusses the Exemplar theory of categorization, which posits that we store all, or in milder and more practical versions, some members as exemplars that represent the category. Thus, a category is defined by a set of typical exemplars (say, every tenth percentile).
To categorize a new object, one would compare the similarity of that object with all the exemplars belonging to a category, and if this reaches a threshold, the new object is classified as belonging to that category. If two categories are involved, one would compare with exemplars from both categories and, depending on threshold values, either classify the object in both categories or, in a forced single-choice task, classify it in the category which yields the better similarity scores.
Thus,
We encounter an exemplar, and to categorize it, we compare it to all (or some subset) of the stored exemplars for categories that meet some initial similarity requirement. The comparison is generally considered to be between features, which are usually represented in a multidimensional space defined by various “psychological” dimensions (on which the values of particular features vary). Some features are more salient, or relevant, than others, and are thus given more attention and weight during the comparison. Thus, we can use an equation like the following to determine the similarity of an exemplar:
$$\mathrm{dist}(s, m) = \sum_i a_i \, \lvert y_i^{\mathrm{stim}} - y_{mi}^{\mathrm{ex}} \rvert$$
Here, the distance in the space between an instance, s, and an exemplar in memory, m, is equal to the sum of the values of the feature of m on all of dimensions (represented individually by i) subtracted from the feature value of the stimulus on the same dimensions. The sum is weighted by a, which represents the saliency of the particular features.
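The distance above is an attention-weighted sum of absolute differences, and it can be sketched as follows. The decision rule used here (pick the category whose stored exemplars are closest on average) is my own simplifying assumption for a forced-choice task, not something stated in the quoted post; all names and numbers are hypothetical.

```python
def weighted_distance(stimulus, exemplar, attention):
    """Sum over dimensions of a_i * |y_i(stimulus) - y_i(exemplar)|."""
    return sum(
        a * abs(s - m) for a, s, m in zip(attention, stimulus, exemplar)
    )


def closest_category(stimulus, exemplars_by_category, attention):
    """Forced choice: the category with the smallest mean distance wins."""
    def mean_distance(exemplars):
        total = sum(weighted_distance(stimulus, m, attention) for m in exemplars)
        return total / len(exemplars)

    return min(
        exemplars_by_category,
        key=lambda category: mean_distance(exemplars_by_category[category]),
    )


# Hypothetical two-dimensional feature space (height, attractiveness).
attention = [1.0, 0.5]
stored = {
    "TDH": [[0.9, 0.7], [0.8, 0.9]],
    "other": [[0.2, 0.3], [0.3, 0.1]],
}
new_object = [0.8, 0.6]

print(closest_category(new_object, stored, attention))
```

Unlike the prototype calculation, every stored exemplar takes part in the comparison, which is where the larger storage and processing cost of exemplar models comes from.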
There is another interesting clustering approach that becomes available to us, if we use an exemplar model. This is the proximity-based approach. In this, we determine all the exemplars (of different categories) that are lying in a similarity radius (proximity) around the object in consideration. Then we determine the category to which these exemplars belong. The category to which the maximum number of these proximate exemplars belong, is the category to which this new object is classified.
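The proximity-based approach just described can be sketched as a majority vote among the exemplars that fall within a fixed radius. This sketch assumes plain Euclidean distance (via `math.dist`, available from Python 3.8) as the similarity measure; the function name, the exemplars and the radius are hypothetical.

```python
import math
from collections import Counter


def classify_by_proximity(stimulus, labelled_exemplars, radius):
    """Majority vote among exemplars lying within `radius` of the stimulus.

    Returns None when no exemplar is close enough. If two categories tie,
    the one whose exemplar was encountered first is returned.
    """
    votes = Counter(
        label
        for point, label in labelled_exemplars
        if math.dist(stimulus, point) <= radius
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical exemplars in a (height, attractiveness) space.
exemplars = [
    ((0.9, 0.8), "TDH"),
    ((0.8, 0.9), "TDH"),
    ((0.7, 0.6), "other"),
    ((0.1, 0.2), "other"),
]

print(classify_by_proximity((0.8, 0.8), exemplars, radius=0.3))
print(classify_by_proximity((0.8, 0.8), exemplars, radius=0.05))
```

With the larger radius three exemplars vote and the majority category wins; with the smaller radius nothing is near enough, so the object is left unclassified.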
The fourth post on Mixing Memory deals with a ‘theory’ theory approach to categorization, and I will not discuss it in detail right now.
I’d like to mention briefly in passing that there are other relevant theories, like schemata, scripts, frames and situated simulation theories of concept formation, that take into account prior knowledge and context to form concepts.
However, for now, I’d like to return to the prototype and exemplar theories and draw attention to the fact that the prototype theories are more abstracted, rule-like and economical in nature, but are also subject to pragmatic deficiencies, based on their inability to take variance, outliers and exceptions into account; while the exemplar theories, being more concrete, memory-based and pragmatic in nature (being able to account for atypical members), suffer from the problems of requiring large storage and unnecessary redundancy. One may even extrapolate these differences as paralleling the difference between procedural or implicit memory on the one hand and explicit or episodic memory on the other.


There is a lot of literature on prototypes and exemplars, and research supporting both. One such line of research concerns the visual perception of faces, whereby it is posited that we find average faces attractive, as the average face is closer to a prototype of a face and thus the similarity calculations needed to classify an average face are minimal. This ease of processing we may subjectively feel as the attractiveness of the face. Of course, male and female prototype faces would be different, with both perceived as attractive.


Alternately, we may be storing examples of faces, some attractive, some unattractive and one can theorize that we may find even the unattractive faces very fast to recognize/categorize.
With this in mind, I would like to draw attention to a recent study that examined past-tense over-regularization in males and females and showed that not only do females make more over-regularization errors, but also that these errors are distributed around similar-sounding verbs.
Let me explain what over-regularization of past-tense means. While the children are developing, they pick up language and start forming the concepts like that of a verb and that of a past tense verb. They sort of develop a folk theory of how past tense verbs are formed- the theory is that the past tense is formed by appending an ‘ed’ to a verb. Thus, when they encounter a new verb, that they have to use in past tense (and which say is irregular) , then they will tend to append ‘ed’ to the verb to make the past tense. Thus, instead of learning that ‘hold’, in past tense becomes ‘held’, they tend to make the past tense as ‘holded’.
Prototype theories suggest, that they have a prototypical concept of a past tense verb as having two features- one that it is a verb (signifies action) and second that it has ‘ed’ in the end.
Exemplar theories on the other hand, might predict, that the past tense verb category is a set of exemplars, with the exemplars representing one type of similar sounding verbs (based on rhyme, last coda same etc). Thus, the past tense verb category would contain some actual past tense verbs like { ‘linked’ representing sinked, blinked, honked, yanked etc; ‘folded’ representing molded, scolded etc}.
Thus, this past tense verb concept, which is based on regular verbs, is also applied while determining the past tense of an irregular verb. On encountering ‘hold’, an irregular verb that one wants to use in the past tense, one may use ‘holded’, as ‘holded’ is a verb, ends in ‘ed’ and is also very similar to ‘folded’. While comparing ‘hold’ with a prototype, one may not have the additional effect of rhyming similarity with exemplars that is present in the exemplar case; and thus girls, who are supposed to use an exemplar system predominantly, would be more susceptible to over-regularization effects than boys. Also, this over-regularization would be skewed, with more over-regularization for similar-rhyming regular verbs in girls. As opposed to this, boys, who are using the prototype system predominantly, would not show the skew-towards-rhyming-verbs effect. This is precisely what has been observed in that study.
Developing Intelligence has also commented on the same, though he seems unconvinced by the symbolic rules-words or procedural-declarative accounts of language as opposed to the traditional connectionist models. The account given by the authors is entirely in terms of procedural (grammatical rule based) versus declarative (lexicon and pairs of past and present tense verbs based) mechanisms, and I have taken the liberty of reframing that in terms of Prototype versus Exemplar theories, because it is my contention that procedural learning, in its early stages, is prototypical and abstractive in nature, while lexicon-based learning is exemplar-based and particularizing in nature.
This has already become a sufficiently long post, so I will not take much more space now. I will return to this discussion, covering research on prototypes vs exemplars in other fields of psychology, especially with reference to gender- and hemisphericity-based differences. I’ll finally extend the discussion to the categorization of relations, and that should move us into a whole new field, one that is closely related to social psychology and which I believe has been largely ignored in cognitive accounts of learning, thinking etc.