Humans show exceptional skill in communicating with fellow conspecifics using verbal and gestural symbols. Such competence is arguably one of mankind's greatest assets as well as a key to most of our species' achievements. It brings us together with our peers and enriches us through an exchange of experiences, thoughts and value systems. It endows us with the means to acquire skills for situations that we may never have personally encountered before. A staggering range of regionally distinct languages and dialects, grouped in larger language families, not only serves as a system of communication but also tags us for membership within a specific group. In addition to bringing us together, language has thereby been at the heart of an ancient source of division. Although few things in biology ever group cleanly into one of the nature vs. nurture extremes, this particular division seems to be purely cultural. Regardless of the specific racial, ethnic or regional group that we happen to have joined through birth, we are all able to acquire competence in any human language. In particular, we seem to acquire our native language without formal education, long before many of our intellectual capabilities have matured, and simply by immersion in the particular language environment of parents and relatives during the first few years of our life. The scientific study of the nature and structure of languages is called linguistics.
B. F. Skinner (1957) suggested that infants learn language through a process described as operant conditioning, namely, via the monitoring and management of reward contingencies. Skinner's position is that behavior can be explained by a four-term contingency analysis comprising motivating operations, discriminative stimuli, responses and reinforcing stimuli. In children and infants this process is expanded by what he called "shaping", "prompting" and other stimulus modeling, imitation and reinforcement procedures. Language acquisition is then a process that takes thousands of instances of such training, and this appears largely to be what takes place. Critics who do not understand the inductive power of this approach largely assert that this view now "appears quite simplistic." However, this argument from complexity is not dissimilar to arguments against Darwin's theory of natural selection. How could the amazing complexity of the animal kingdom come about through such a simple means? Surely, we would need something more complex to explain the variety of animals? Darwin's theory of selection is now accepted as such a means, despite its "apparent simplicity". Similarly, the mechanism of operant conditioning can be seen as sufficient to account for very complex forms of behavior in a wide variety of circumstances without appealing to unproven, non-data-driven, speculative theoretical approaches like those of Noam Chomsky and others.
Language learning is clearly the most complex task any of us will ever undertake. Yet the process appears to be considerably less painful than acquiring an understanding of calculus or organic chemistry. Noam Chomsky (1975) has long argued that this paradox is best explained by the view that humans, and in particular children, have innate abilities that support the acquisition of a language. It is clear that we seem to be naturally good at it, especially before we reach puberty. Moreover, we appear to have a natural need to fill our world with language; in the absence of formal language tutoring, a form of language structure develops anyway. Some say a specialized language faculty seems to aid in this process, one that includes innately specified constraints on what forms are possible. These innate, language-specific information processing mechanisms may be encapsulated in a language module of the brain. However, these innate faculties are inferred, hypothesized explanations with no foundation in fact. No biological location has been found for them, no genetic location, no brain structure. It is all inference.
All human languages, even spontaneous ones, show many common principles of language acquisition as well as rules of grammar. The concept of universal grammar proposes that this is due to a set of innate rules, which guide how we acquire language and how we construct valid sentences in it. It thus attempts to explain language in general, and not simply describe the construction of any one specific language per se.
There are many alternative theories of the consistency of human languages besides the speculative theory of "universal grammar". One theory is that human environments possess common structures and human language simply responds to the commonality of the world. Universal grammar is a speculative, unproven hypothesis that is still awaiting confirmation and has no evidence to support it other than "rational argument".
At a very young age, we acquire our native language by listening to, guessing at the meaning of, and imitating the symbols used by tutors around us. Moreover, during these early years we learn to walk and talk without any explicit need for understanding how we are doing what we are doing. In this process we seem to be helped by a set of inherent learning strategies: the ability for optimized pattern perception of common, ambient symbols.
Infants are exceptionally broad in their abilities to perceive sound qualities. In fact, as infants we can distinguish many more language sounds than we can as adults. During the first year of life, infant brains are actively engaged in optimizing acoustic perception for the language sounds that surround them. Such early acquisition of information about native language depends on perceptually mapping both the critical aspects of language, and statistical properties of speech.
It is now clear that infants perceive the various phonetic units, track the frequency of different formants, and extract the boundaries of words from running speech. Patricia Kuhl (2000) suggests that language acquisition is based on a combination of factors that together provide a powerful discovery procedure for language. Evidence suggests that initial perception parses speech in a universal way in all human infants. Infants have inherent perceptual biases that segment phonetic units without providing innate descriptions of them. They are able to parse and discriminate a wide range of basic phonetic units. Adults, in contrast, are only able to discriminate phonetic units that occur in their first language, and fail to distinguish those that are not used there. Japanese adults, for example, fail to discriminate the phonetic boundary between r and l, a boundary that does not exist in Japanese. Such discrimination is based on general auditory processing mechanisms, rather than on innate phonetic feature detectors for speech. Language learning requires mapping these probabilistic patterns into language strategies. As infants detect frequency patterns in language input, they identify higher-order units. Infants thus discover the critical parameters and phonetic dimensions of the sounds used in their native language. Sensory processing becomes optimized by experience for enhanced perception of the specific language around them. Vocal learning unifies language perception and production, as it depends on a comparison of one's own vocalizations to those of others. Imitation forms the integral bond between the perception and production of language, and together they become optimized for the first language. If a second language is learned later on, it will carry the accent typical of the speech motor patterns of the primary language, even following long-term instruction.
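The idea that word boundaries can be extracted from the statistical properties of running speech can be made concrete with a small sketch. The example below is an illustrative toy model only, not a description of the procedure proposed by Kuhl: it assumes that speech arrives as a stream of discrete syllables and that a boundary is likely wherever one syllable poorly predicts the next (a low transitional probability). The syllables, the three made-up words, the function names and the threshold are all hypothetical.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) from a syllable stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold):
    """Place a word boundary wherever the transitional probability is below threshold."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


# Hypothetical toy "language" of three made-up words, heard without pauses.
LEXICON = {"A": ["ba", "di"], "B": ["ku", "po", "la"], "C": ["ti", "mo"]}
ORDER = "ABCACBABCBACCAB"
stream = [syl for word in ORDER for syl in LEXICON[word]]

# The threshold is an arbitrary assumption chosen for this toy stream.
segmented = segment(stream, threshold=0.75)
print(segmented)
```

Within a word each syllable always predicts the next one, whereas across a word boundary several continuations are possible, so the dips in predictability mark the boundaries. Real speech is far noisier, and a fixed threshold would not be expected to work there.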
Infant-directed speaking styles, which share features such as increased pitch and exaggerated stress, enhance language learning by assisting infants in discriminating phonetic units, as well as by capturing attention.
Broca's Area underlies the ability to produce speech, but it is not critical for understanding language. Patients with damage to this area fail to form words properly, and their speech is halting and slurred. Wernicke's Area is essential for the ability to understand language. Patients with damage to this area can speak clearly, but the words make no sense (i.e., word salad). The Arcuate Fasciculus connects these two areas. Damage to this connection causes conduction aphasia, in which language is understood but words cannot be repeated and the patient's own speech does not make sense. Capabilities for speech are not distributed evenly across the two halves of the brain. Speech is only disrupted when amobarbital is used to selectively anesthetize the half of the brain that contains these speech centers. Imaging techniques (e.g. fMRI) have shown that early bilingual individuals utilize an overlapping set of neurons in the language areas for their two languages. In contrast, individuals who have acquired a second language later in life will likely rely on separate neuronal areas in these speech centers. Late bilingual speakers are also less likely to show strong lateralization of speech function. This suggests that when two language systems are learned together early on, they can share the same brain centers without causing catastrophic interference. In adult learning, the best sites of brain real estate have already been taken up by the first language; thus, any new language learning must co-opt 'new' territory adjacent to it or on the other half of the brain.
Birds communicate information about danger, food, sex, group movements and many other matters via acoustic signals. A subset of these signals have been termed song, as they frequently feature extended, tonal, melodic characteristics. The Zebra Finch's song, for instance, includes several introductory notes followed by a string of syllables within an extended melodious pattern. Sonograms (i.e., plots of sound frequency against time, with intensity shown by the darkness of the trace) are commonly used as a primary tool for studying and comparing bird songs.
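A sonogram can be approximated computationally by measuring, for successive short time windows, how much energy a sound carries at each frequency. The following is a minimal sketch under stated assumptions rather than a description of any particular laboratory tool: it uses a plain short-time Fourier transform, and the function names, sampling rate, window size and two-note test signal are all hypothetical.

```python
import cmath
import math


def sonogram(samples, frame_size, hop):
    """Return one magnitude spectrum per time frame (a short-time Fourier transform)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        spectrum = []
        for k in range(frame_size // 2):  # frequency bins below the Nyquist limit
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            spectrum.append(abs(total))
        frames.append(spectrum)
    return frames


def dominant_frequencies(samples, sample_rate, frame_size, hop):
    """Frequency in Hz of the strongest bin in each time frame."""
    return [
        spectrum.index(max(spectrum)) * sample_rate / frame_size
        for spectrum in sonogram(samples, frame_size, hop)
    ]


# Hypothetical two-note "song": 1000 Hz followed by 2000 Hz, sampled at 8000 Hz.
RATE = 8000
FRAME = 64
notes = [1000] * 256 + [2000] * 256
signal = [math.sin(2 * math.pi * f * n / RATE) for n, f in enumerate(notes)]

track = dominant_frequencies(signal, RATE, FRAME, hop=FRAME)
print(track)
```

Each row returned by `sonogram` corresponds to one vertical slice of the plot, and following the strongest frequency from slice to slice traces the melody of the toy song. Recordings of real birdsong would normally call for overlapping, tapered windows and a fast Fourier transform, which are omitted here for brevity.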
Respiratory muscles force streams of air from large air sacs through the bronchi. Membranes in the syrinx vibrate as air expressed from bronchi passes over them. Syrinx muscles for left and right sound producing structures can act independently, and many birds are able to sing harmonies with themselves. Song appears to play a role in advertising for sex and species recognition as song complexity frequently coincides with the presence of ornate plumage. It also stimulates and synchronizes courtship behavior, stimulates reproductive readiness in females, and contributes to pair bond maintenance. Local song dialects exist in many species.
Successful song in most adult male songbirds depends on memorizing the song of a conspecific tutor during an earlier, sensitive phase in life. The appropriate song repertoire is acquired in a series of distinct stages. Young birds, during an early Sensory Phase, listen to a conspecific tutor and thereby obtain information about the characteristics of their own species' song. Only a very specific subset of surrounding songs is actually accepted as suitable, suggesting the presence of an in-built song template. Following this sensory phase, young birds actively begin to vocalize themselves. Their Subsong is an atonal, noisy, meaningless repetition of sounds, which lacks recognizable syllables. Akin to human Babbling, it lets birds practice coordinated movements of the respiratory system, sound producing organs, and related structures (e.g., tongue). During the Sensory-motor Phase, young birds spontaneously produce Plastic Song, consisting of vocalizations with distinct syllables and recognizable elements. Such "work in progress" will include elements from the song of tutors and elaborate them into a variety of syllables and phrases that even exceeds what will eventually be used in the adult song. The ability to hear its own vocalizations is critical for a bird's normal development. In transition to the Mature Phase, birds adopt a Crystallized Song with syllables and a syntax structure that are characteristic of the species. Once established, these song patterns remain fixed in many species, are no longer disrupted by deafening, and are presented intact during each subsequent breeding season. In contrast, open-ended learners (e.g. starlings and canaries) retain the capacity to adjust or alter their song throughout life.
Song production is under the control of multiple hormonal systems from embryonic gonads. Injections of testosterone induce adult males to sing, even out of season, while similar injections in females have no such effect. The presence of estrogen during male development appears to be essential. When estrogen is blocked experimentally in developing males, testosterone injections fail to elicit song. However, when estrogen had been delivered to developing females, injection of testosterone elicited song in them (Konishi).
Several neural centers with a role in song have been identified. The Higher Vocal Center (HVc) is a group of neurons in the forebrain that is larger in (singing) males than in (non-singing) females. Damage to it blocks song production in adults. The robust nucleus of the archistriatum (RA) is larger in males than in females, and its neurons increase in size and dendritic arborization during song learning. Damage to this area blocks song production in adults. The lateral magnocellular nucleus of the anterior neostriatum (LMAN) is not sexually dimorphic and shows no seasonal change in neuron size or number. Its ablation in young birds interferes with song acquisition, but its ablation in adults brings about few deficits as long as song had already been learned prior to damage. Area X of the paraolfactory lobe is sexually dimorphic, and new neurons are added to it during song learning. Damage to it interferes with song acquisition in young birds but not in adults.
Zebra Finch Song and Tinbergen's Four Aims
As in every behavioral system, a series of independent questions can be addressed for song behavior in Zebra finches (Taeniopygia guttata).
Proximate Causation: Zebra finch song production requires the flow of air through semi-independent vibrators in the syrinx and vocal tract. The presence of song and the size of the song repertoire are reflected in sexual dimorphism of the controlling brain areas and nuclei. A learning pathway exists separate from a motor pathway. Singing, which is largely restricted to males, is under the control of androgens.
Ultimate Causation: Song in Zebra finches is a learned vocalization used during courtship and defense of a territory. Advertising the individual's presence, it serves to elicit mating opportunities from females and to stimulate the partner's reproductive behavior and physiology. Moreover, it functions to claim a territory and to repel competitors from it.
Phylogeny: Virtually all 9000 species of birds have the ability to vocalize, including crows, turkeys, owls or nightingales. A large subset of them, including the zebra finch, are characterized by complex vocal organs, distinctive brain circuitry for song, and acquisition of species-characteristic vocalizations through learning. Taxonomically these are all restricted to a single order - the Passeriformes.
Ontogeny: The emergence of adult zebra finch song illustrates the interactions of genetic and environmental factors in behavioral development. After periods in which the young bird listens to the songs of tutors, starts its own partial vocalizations, and rehearses and adapts its own song, the species-specific adult version slowly emerges. Song circuits exhibit extensive plasticity even in adults, with ongoing neurogenesis and seasonal changes in neuronal morphology.
Male white-crowned sparrows (Zonotrichia leucophrys nuttalli) sing a single song that shows considerable geographic variation in the form of stable dialects. Bilingual and blended strategies exist at the boundaries. The distinctiveness of the song depends on patterns of natal dispersal and the timing of learning. Subject to reinforcement by the song of neighbors, the system is highly dependent on auditory feedback. The work of Peter Marler, Doug Nelson and others over more than 30 years illustrates how genetic and environmental factors interact during the development of a complex communication system. Males generally establish territories during late plastic song with a repertoire that consists of ~4 different songs. Improvisations yield individual-specific songs which closely match the song of the nearest rival.
Cross-fostering experiments illustrate the role of auditory templates in song learning. Young birds reared in the presence of taped song will learn and present that song, even if the tape came from another species. A song sparrow raised with a swamp sparrow tape will experience little difficulty in learning the swamp sparrow song. Birds in Isolation experiments are raised without access to intact adult song (i.e., no template) and will subsequently show deficiencies in their own song upon maturation. The song does nonetheless contain valid elements of intact adult song. Moreover, Deafening experiments, which deafen birds at hatching, result in song that still contains some valid elements but is an even cruder version than that of isolated birds. When Song preference experiments present young birds with a wide range of conspecific and heterospecific songs, the birds recognize and preferentially learn the song of their own species. Birds raised in Mixed syllables experiments, in the presence of a mixture of swamp sparrow and song sparrow syllables, will accurately produce these syllables in their song but lack the normal adult syntax.
Castration Experiments have shed light on the roles of hormones in song learning. Swamp sparrows that are castrated early in development have low testosterone levels compared to their male siblings. They acquire song but progress only as far as the plastic phase. Treating such birds with injections of testosterone (Enhanced Testosterone Experiment) immediately crystallizes the song. Interfering with testosterone function in adult birds (Reduced Testosterone Experiment) degrades previously crystallized song back to plastic song.
Brown-headed cowbirds (Molothrus ater) are gregarious birds that follow cattle herds. As brood parasites they are raised by parents of a different species, so no consistent, conspecific tutor is available. So, how do they learn their own conspecific song? One of the strongest stimuli is the bird's own crystallized song, and feedback from females is important (i.e., action-based learning).
Parallels between Bird Song and Human Speech Learning
To learn their language, humans and white-crowned sparrows follow similar steps: recognition, practice, and clarity. The best time for a human to learn their language is from the toddler years to twelve years of age (Macdonald), and the best time for a white-crowned sparrow to learn its song is between 10 and 50 days after hatching. Both genes and environment play roles in determining the way each species communicates (Alcock 25).
The first step for humans and white-crowned sparrows in learning their language can be considered recognition. White-crowned sparrows and humans listen to a tutor before they begin to communicate in their language. The tutor teaches the language and the particular dialect of the area. Just like humans, the sparrows have a dialect that depends on where they live. A white-crowned sparrow has certain genes that allow it to learn only its own species’ songs. In experiments in which a white-crowned sparrow was kept in isolation and played tapes of song sparrows, it did not imitate the other species’ song, but sang an odd song unlike either species’ song (Alcock 24-25). Humans also cannot communicate with each other using a different species’ language. A study was done in Avignon, France that observed children who were brought up by wolves. It was observed that the children spoke no language at all. Before studies of this kind were done, it was thought that the children might learn to communicate with the wolves, just as in the Tarzan story. This is not true: the children could speak neither the human language nor the language of the wolves (Macdonald). It is a combination of environment and genes that leads white-crowned sparrows and humans to learn their proper language. Genes tell each species to learn only the language of its own species, and environment plays a role in determining which dialect each will use.
Once the white-crowned sparrow or infant has recognized its species’ song, it can begin to practice the language. Infants first communicate by making sounds. The sounds can range from grunts to sounds that mimic their surrounding environments. For example, before babies can say words, they might say moo-moo when looking at a cow. After sounds come words. By the time the infant reaches one year, they should be making sentences out of words (Macdonald). The white-crowned sparrow also does not immediately master its song. The sparrow will first sing a short subsong derived from the tutor’s full song. The sparrows keep practicing their subsong just as infants practice their words. After the sparrow has mastered its subsong, it can start to form a full song. In both species it is necessary for individuals to hear themselves in order to vocalize their language correctly (Alcock 26).
In conclusion, both white-crowned sparrows and humans first listen to and recognize their particular species’ language. A white-crowned sparrow has a tutor to teach the song. Infants usually have parents who teach them their verbal language. Then they must practice the language, starting off slowly with sounds and building up. Both species will continue to improve and clarify their language throughout their lives.
- Alcock J. 2001. Animal Behavior. 7th ed. Sunderland: Sinauer.
- Chomsky N. 1975. Reflections on Language. New York: Pantheon Books.
- Konishi M. 1989. Birdsong for Neurobiologists. Neuron 3: 541-549.
- Kuhl P. 2000. A new view of language acquisition. PNAS 97(22): 11850–11857.
- Macdonald A. 2003. The Beginnings of a Spoken Language. New Orleans. 1 Sept.
- Skinner BF. 1957. Verbal Behavior. New York: Appleton-Century-Crofts.