Saturday, April 25, 2009

Vocal Responsiveness

In human conversation, individuals appear to take turns in language production, which is characterized by acoustically distinct sounds (Arcadi 2000:206). In contrast, although simple vocal exchanges involving distinct calls have been documented in some primate species, these calls appear to be acoustically similar (ibid.). Mitani and Brandt (1994:250) concluded that there is no evidence that chimpanzee vocal behaviour is different from that of any other primate. However, others (e.g. Burling 1993; Ujhelyi 1998) have stated that chimpanzee vocal behaviour is more ‘sophisticated’ than that of other primates, and therefore may provide insights into the origins of language. Moreover, acoustic analysis has revealed that some primate calls which sound the same to human observers are in fact distinct vocalizations that are employed in different contexts and elicit different behavioural responses (Arcadi 2000:218; Zuberbühler et al. 1997:601; Cheney and Seyfarth 1982:748).

Seyfarth and Cheney (1997) found that the ‘grunts’ of female baboons served as a reconciliatory signal because they reduced the anxiety of lower-ranking females after aggression. In vervet monkeys, Cheney and Seyfarth (1982) identified four different ‘grunts’ tied to social situations: one given upon encountering a dominant conspecific, one upon encountering a subordinate conspecific, one directed to a conspecific moving into an open area, and one directed to vervets who are not members of the group. Seyfarth and Cheney (1997) also observed that vervet ‘wrr’ calls were used to indicate that a neighbouring group had been seen, and that ‘chutter’ calls indicated that an intergroup encounter had become aggressive.

In 1996, Arcadi spent 53 days observing a small group of chimpanzees that had never been provisioned (10 adult males, 8 central adult females and 9 peripheral adult females) in Kibale National Park, Uganda. Arcadi (2000:213) found that wild chimpanzees vocalize at low rates, tend not to respond to calls that they hear, and, when they do respond, tend to give calls that are similar to the ones they have just heard. Arcadi also noted that calling rates were higher when other calls were audible, and that temporally clumped calling within and between subgroups typically involved either chorusing or counter-calling with calls of the same type as those just heard (Arcadi 2000:217).

Social Learning

Learned vocalizations may function as an indicator of group membership. Chimpanzees and modern humans live in social groups characterized by within-group cooperation and between-group competition. Such social systems put a premium on reliable indicators of group membership, vocal or otherwise (Tomasello and Call 1997). An alternative possibility is that vocal learning is just one example of a domain-general mimetic ability in modern humans (Fitch 2000).

Janik and Slater (2000:8) define three different types of social learning: contextual, production and vocal. Contextual learning affects the behavioural context, both usage and comprehension, of a signal. Production learning refers to instances where the signals themselves are modified in form as a result of experience with those of other individuals. Vocal learning is defined as production learning in the vocal domain. It can affect one or more of three systems, which involve different levels of control over sound production (ibid.): respiratory, phonatory and filter. According to Janik and Slater (2000), contextual learning and respiratory production learning both preceded the evolution of phonatory and filter production learning.

At the same time, the most important primate-specific tool for expressing emotions is the facial gesture. “Producing modifiable facial gestures has an immediate communicative function” (Ujhelyi 1998:180). As a secondary result, acoustic variations arise if an animal changes its facial gesture during vocalization. Primates have mobile, nonattached upper lips, which enable them to produce different facial expressions. These expressions include lip configurations that form barriers to the passage of air (Burling 1993:30) and may therefore result in different acoustic outcomes. Although a human-like vocal tract is absent, the face serves as the main tool for producing articulatory variants (ibid.).

Since most vertebrates can distinguish the vocalizations of different individuals (parent/offspring, conspecifics, strangers), formants, the ‘shaping’ of sound due to physiological constraints, play an important role in individual identification (Fitch 2000:263; Burling 1993). Nonhuman primates also have formants in their calls, which vary with context (Janik and Slater 2000:9). Formants might also provide an indication of the body size of the vocalizer. Fitch (2000:263) has proposed that there is a correlation between vocal tract length and body size in humans and monkeys. However, formant cues differ from vocal pitch, which is not correlated with body size in humans (Lieberman and McCarthy 1999:488).

One could presume that early sound making would have been accompanied by rudimentary vocal-facial displays such as those produced by modern nonhuman primates. Later changes, building upon inherent preadaptations, expanded from the strictly prosodic domain into the articulatory domain, and were subsequently followed by the discovery that these sound-making movements could be combined in various ways to produce a range of distinctive phonetic forms (words).

Duets

A duet is defined as a long call or song in which both sexes of a monogamous pair produce loud sounds in an interactive manner, performing a mutually cooperative and coordinated display (Arcadi 2000; Ujhelyi 1998). In some primate species, mated pairs sing ‘duets’ and neighbours ‘counter-sing’ in an alternating but apparently timed manner. For example, chimpanzee long calls are thought to maintain connections with group members, gibbon duet songs are performed at given times of the day, and indri duets are produced only during the breeding season (ibid.). In gibbon duets, the contributions of males and females to the song display show a rather rigid and uniform pattern (Arcadi 2000). Although song transfer may occur in some instances (for example, a widowed female may adopt and perform the male song and so produce a pseudo-duet), the strong sexual differences in song structure are likely to be genetically programmed (Ujhelyi 1998).

Despite the strong correlation between duet performance and monogamy, coordinated call displays do exist in the great apes. Common chimpanzee males often call together, while in bonobos male-female pair duets occur (Ujhelyi 1996, 1998). In both chimpanzee species, duetting or chorusing can be heard all day in relation to different activities (Ujhelyi 1998:184). It has been proposed that duetting is a definitive feature of stable monogamous and territorial primate species, and that the function of duetting may be the maintenance and reinforcement of the pair bond (Ujhelyi 1998:185). The differences in partner preference are attributed to differences in group structure between the two chimpanzee species. Ujhelyi (1998) observed that bonobos show a high degree of synchronization between the vocalizations of different individuals. Ujhelyi (1996, 1998) further postulates that the capacity for duet performance might be a remnant of an earlier monogamous stage, altered to fit the current way of life.

Mitani and Brandt (1994:250) observed that chimpanzee males attempt to match the acoustic characteristics of each other’s vocalizations when calling together. Single males appear to alter the acoustic structure of their calls when chorusing with different partners. This tendency results in large variability in call types on the one hand, but homogenization of the group’s call repertoire on the other (ibid.). According to Marshall et al. (1999:826), the call repertoire acquired by a single male may contain a large number of variants, mostly acquired via social learning, while the call repertoire itself is not exclusive to a specific individual.

Long Calls

Territorial song marks the territory of a group and serves to maintain spacing between members of neighbouring groups. Because the signal is acoustic, it cannot mark territory directly; instead, the presence, identity and location of the territory owner are broadcast (Ujhelyi 1998:184). Ujhelyi (1998:185) considers this to be a representational function. According to Ujhelyi (1998:180), this territorial behaviour establishes lexical syntactic capacity. If the labeling channel is an acoustic one, and the primary sounds are limited and genetically fixed, then differences in signs “can only be achieved by compositions of the invariant elementary sounds and by varying their arrangement.” In contrast, species capable of producing within-call acoustic variants live in large groups with complex social interactions (ibid.).

All primate species that produce long, variable calls, excluding the African apes, share a common feature in their social behaviour, namely monogamous territoriality (Arcadi 2000:216; Ujhelyi 1996:74). In contrast to other territorial mammals that rely upon olfactory marking, most nonhuman primate species mark and defend individual territory by acoustic signs (ibid.). These distinctive loud calls are given by males as territorial displays, and they can elicit similar calls from one or more conspecifics (Ujhelyi 1996, 1998; Tomasello and Call 1997). These long calls or “songs” are displayed without any overt external stimulus and are somewhat musical in nature. It has been proposed that the songs of different species represent different degrees of complexity (Burling 1993).

Long calls are built from smaller, stable, clearly distinguishable units, and exhibit individual variation over time (Burling 1993; Ujhelyi 1996, 1998). The units of these long calls are ‘traditional’ communicative signals (e.g. alarm or contact calls) which are combined in different arrangements in different compound calls. The number of elementary units in long calls differs across species, and acoustically different songs can be created by changing the number, type and position of elements (Ujhelyi 1998:179). According to Mitani and Marler (1989:43), the gibbon song may be divided into distinct vocal elements. Based upon seven variables (duration, maximum frequency, minimum frequency, frequency range, start frequency, end frequency and number of frequency inflections), 13 basic note types can be distinguished (ibid.). The songs are then built up from these notes, and can be varied by changing the type, number and position of elements, segments or notes. Hence, the song repertoire of an individual male may be rather large.
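To give a rough sense of why a repertoire built from 13 note types "may be rather large", the short Python sketch below counts ordered note sequences. It is only an illustrative upper bound under assumptions that are not in the sources: it supposes that any note type may occupy any position and may repeat, and the maximum song length of 3 notes is an arbitrary, hypothetical value. Real gibbon songs are certainly more constrained than this.

```python
# Illustrative upper bound only: assumes any of the note types can fill any
# position, with repetition allowed. This is NOT a model from the cited papers.

NOTE_TYPES = 13  # basic note types distinguished by Mitani and Marler (1989)


def count_sequences(note_types: int, length: int) -> int:
    """Number of ordered note sequences containing exactly `length` notes."""
    return note_types ** length


def count_up_to(note_types: int, max_length: int) -> int:
    """Number of ordered note sequences of 1 to `max_length` notes."""
    return sum(count_sequences(note_types, n) for n in range(1, max_length + 1))


if __name__ == "__main__":
    max_len = 3  # hypothetical maximum length, chosen only for illustration
    print(count_sequences(NOTE_TYPES, max_len))  # 2197
    print(count_up_to(NOTE_TYPES, max_len))      # 2379
```

Even with sequences of at most three notes, the unconstrained count already runs into the thousands, which illustrates how changing the type, number and position of a small set of fixed elements can yield a large set of distinct songs.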

Both common chimpanzees and bonobos have long calls, which can be divided into several acoustically distinct segments, similar to gibbon songs (Mitani and Marler 1989; Clark and Wrangham 1994). Chimpanzee vocalizations are highly graded, with many variants used in a wide range of contexts (Arcadi 2000:205). It has been noted that these vocal sequences can be long and involve many call types (ibid.; Ujhelyi 1998). Additionally, extended vocal exchanges between individuals out of visual contact are common (Arcadi 2000:206). The long, compound calls in question are the pant hoot in chimpanzees and the high hoot in bonobos (ibid.). Although chimpanzees do not change the order of the four fundamental units of the long call, they insert individually selected vocal elements into different positions of the call (Burling 1993). Chimpanzee males often give the long call together, during which they attempt to match the acoustic characteristics of each other’s vocalizations. This matching tendency shows that call variants can be learned (Arcadi 2000:206; Ujhelyi 1998:185-186).

Ujhelyi (1998:179) has proposed that this type of call variant production “may represent phonological syntax since the altered parts of the call do not possess their own meaning independent of the call”. In other words, this type of call production may represent an intermediate stage between animal communication and language. Living in a large social group enhanced the ability to create syntactically different calls. A call repertoire emerged which contained a large number of call variants at the group level, available to each group member via social learning. Ujhelyi (1996, 1998) believes that this type of animal call is different from ordinary animal communication since it apparently demonstrates some features of human language.

Zuberbühler et al. (1997:601) found that the long-distance calls of Diana monkeys function both as perception advertisement and as within-group semantic signals that denote different types of predators. Subjects were a group of 20-25 individuals (1 male, 5-7 adult females, subadults and infants) in the Taï National Park, Côte d’Ivoire. It was observed that Diana monkeys show age/sex dimorphism in the vocal repertoire (Zuberbühler et al. 1997:591). Adult females, subadults and juveniles accounted for most of the vocal activity in the group, and were responsible for the following vocalizations (ibid.): contact calls, trills, alert calls (leopard, eagle) and agonistic calls used in both intra- and intergroup interactions. Male Diana monkeys tended to restrict their vocal communication to long-distance calling, to which females responded with their own, acoustically different, alarm calls (ibid.). According to Zuberbühler et al. (1997:601), the long-distance calls of nonhuman primates show acoustic specialization. Calls are structurally stereotyped and are given repeatedly (ibid.).

It seems that it is just this territorial behaviour which first established the linguistic capacity. If the labeling channel is an acoustic one and the primary sounds are genetically fixed, then sign differences can be achieved only by varying the arrangement of the elementary sounds. Consequently, those individuals who are capable of linking, repeating, and combining these elements gain a selective advantage. It can be shown that some of the notes of gibbon song occur independently of song, in another context, as a reaction to encounters (Mitani and Marler 1989). These simple elements function in ordinary communicative situations. The combination of available elements resulted in a variable set of songs which became suitable for territorial marking.

According to Zuberbühler et al. (1997:601), in rain forest habitats, where visibility is generally poor, the acoustic domain may provide the most efficient means by which a prey animal can communicate to a predator. In the case of Diana monkeys, calls are given only to hunters which surprise their prey (leopards, eagles) and not to hunters that pursue their prey (chimpanzees, humans) (Zuberbühler et al. 1997:602). In addition, calls given in these contexts are regularly combined with approaching the predator, under both experimental and natural conditions (ibid.).