Speech Science Resource Pages

Autosegmental Phonology

Jonathan Harrington


Reading

  1. Halle & Clements (1983, pp. 11-15).
  2. Clark & Yallop (1995, pp. 344-347).

Autosegmental phonology

In the Generative Phonology that followed the publication of Chomsky & Halle's (1968) The Sound Pattern of English (SPE), morphemes are given underlying representations (URs) that consist of underlying units, each defined by a distinctive feature matrix. In the SPE model, phonological rules apply to URs (such as the prefix /məŋ/ in Indonesian that we considered earlier) and convert them into one or more surface phonetic forms (such as [məm]). We have also seen that there are often multiple rules, which implies that a UR may have to go through several transformations before the appropriate phonetic form is reached.

The most important development since the SPE model of phonology has been the recognition that URs may be multidimensional, with segments arranged on separate, autonomous levels or tiers (hence the name autosegment). Phonological rules can apply independently to the segments on these autonomous tiers, although the segments always remain linked to each other.

The development of multidimensional phonological representations was largely motivated by the realisation that the SPE framework, in which URs consisted of linear (one-dimensional) strings of segments, could not adequately explain certain properties of tone languages, nor various aspects of prosodic phonology such as lexical stress in English (see the famous paper by Liberman & Prince, 1977, on stress and linguistic rhythm). Later in this topic we consider some data from Margi (a tone language) that can be used to argue for URs that have segments and tones arranged on independent tiers.

Consider first a fairly simple example from English of the autonomous representation of syllables and phoneme-sized segments. As you may know, there are at least two different ways of saying the first few sounds of a word like 'parameter'. The first is the citation form recorded in most dictionaries, in which the word has four syllables and the first three segments are [pəɹ], i.e. with a medial schwa. The second, and in fact more common, way is to omit the medial schwa and to produce a syllabic [ɹ̩] segment, i.e. [pɹ̩æmətə]. Notice that this initial cluster is quite different phonetically from the initial segments in monosyllabic 'pry', in which the /r/ is realised as a voiceless approximant: in 'parameter', we still have four syllables even though the schwa has been deleted.

Let's see why the two variant pronunciations [pəɹ] and [pɹ̩] are a problem in an SPE type model. Assuming that we have the form with the schwa in the underlying representation, we have to write a rule which is going to delete the schwa and yet somehow keep the [+syllabic] feature (from the schwa) which then gets carried over onto the [ɹ] segment. Specifically, remember that the schwa's distinctive feature matrix must include (amongst other things) the features [-consonantal, +syllabic]. What we would like to be able to do is to delete every feature except the [+syllabic] one, and then attach the [+syllabic] feature to the feature matrix of [ɹ] so that it becomes [+syllabic] (remember that initially [ɹ] will be defined as [+consonantal, -syllabic]). But the problem in SPE is that, when you delete segments, you also have to delete their entire feature matrices (you can't retain bits of a feature matrix). Therefore when the schwa is deleted so is its entire feature matrix, including the feature [+syllabic] that we want to keep.

An intuitively more satisfying explanation is to represent the syllabic feature and the segments on autonomous levels and then to have a rule that deletes the schwa without affecting the syllabic level. That is, in our UR, we would start out with a representation like the one immediately below. This has two separate levels which are linked by association lines (the vertical lines).

We would then have a rule that deleted the schwa at the segmental level while leaving the units at the syllabic level unaffected (and therefore leaving a dangling, or unattached, association line):

A rule would then associate the unattached [syll] segment rightwards to the nearest sonorant segment, i.e. to [ɹ]:

We have thereby deleted the schwa, retained the same number of syllables, and also marked the [ɹ] segment as syllabic (by its association line to [syll]).
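The three steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list-and-dictionary encoding of the two levels, the function names `delete_segment` and `associate_rightwards`, and the small set `SONORANT_CONSONANTS` are all hypothetical and are not part of any phonological formalism or library.

```python
# A minimal sketch of two autonomous levels joined by association lines.
# All names and the sonorant inventory below are assumptions for illustration.

SONORANT_CONSONANTS = {"ɹ", "l", "m", "n"}  # assumed inventory for this sketch


def delete_segment(segments, links, idx):
    """Delete segments[idx] on the segmental level only.

    links maps each [syll] unit to the index of the segment it is attached to.
    A unit attached to the deleted segment is not deleted: it becomes
    unattached, and we remember the position of the gap it hangs over.
    """
    new_segments = segments[:idx] + segments[idx + 1:]
    new_links, floating = {}, {}
    for unit, seg in links.items():
        if seg == idx:
            floating[unit] = idx  # index of the segment now to the right of the gap
        else:
            new_links[unit] = seg - 1 if seg > idx else seg
    return new_segments, new_links, floating


def associate_rightwards(segments, links, floating, targets=SONORANT_CONSONANTS):
    """Attach each unattached unit to the nearest target segment to its right."""
    links = dict(links)
    for unit, gap in floating.items():
        for i in range(gap, len(segments)):
            if segments[i] in targets:
                links[unit] = i
                break
    return links


# UR for 'parameter': four [syll] units (0-3), each attached to a vowel.
segments = ["p", "ə", "ɹ", "æ", "m", "ə", "t", "ə"]
links = {0: 1, 1: 3, 2: 5, 3: 7}

segments2, links2, floating = delete_segment(segments, links, 1)  # delete the schwa
links3 = associate_rightwards(segments2, links2, floating)

print("".join(segments2))
print(sorted((unit, segments2[seg]) for unit, seg in links3.items()))
```

The point of the encoding is that `delete_segment` never touches the number of [syll] units, so the syllable count survives the deletion, which is exactly what a single feature matrix per segment cannot express.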

An important implication of the autosegmental treatment, which represents a radical departure from the Sound Pattern of English, is that not all features are necessarily properties of segments. For example, in SPE, lexical stress in English was a property of a vowel segment: in 'pattern', the /æ/ vowel was marked as [1 stress] (primary stress). Thus [1 stress] was a property of this vowel, in the same way that the feature [+low] formed part of this vowel's feature matrix. By analogy, the same applied to tone languages: a vowel's feature matrix might be marked as [level-high] for syllables produced with a high-level tone. That is why, if a vowel was deleted, its stress and tonal features were also necessarily deleted, since these were simply considered to be inherent properties of the vowel's feature matrix. What the autosegmental treatment does is to take some of the features out of the segment and to put them onto their own level or tier, so that rules that apply at the segmental level need not necessarily apply to them as well.

The value of this autosegmental analysis was clear in Goldsmith's (1976) analysis of tone in African languages. He argued for an autosegmental representation on the grounds that there are phonological rules that apply independently to the tonal and segmental levels. In one of the well-known examples, he showed that there are rules that delete a segment but which can leave a tone that is associated to a segment unaffected. The data is shown below. First, however, we will need to learn some new notation to interpret the tones, as follows. High and low level tones are marked with an acute and grave accent respectively, thus:

  é   high level
  è   low level

These are level tones. A rising tone on the same vowel is equivalent to a low tone immediately followed by a high tone. This is reflected in the notation by combining the low and high tone representations as follows:

  ě   rising

By analogy, a falling tone (High followed by Low) is marked as:

  ê   falling

and in the few occasions when we need to mark a mid-level tone, we will use

  ē   mid-level
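The notation can be summarised as a mapping from tone sequences to diacritics. The sketch below assumes the conventional IPA tone diacritics (acute, grave, caron, circumflex, macron) and uses Unicode combining characters; the table `TONE_MARKS` and the function `mark_tone` are hypothetical names introduced only for this illustration.

```python
import unicodedata

# Assumed mapping from a tone sequence on one vowel to a combining diacritic.
TONE_MARKS = {
    ("H",): "\u0301",      # combining acute: high level
    ("L",): "\u0300",      # combining grave: low level
    ("L", "H"): "\u030C",  # combining caron: rising (low then high)
    ("H", "L"): "\u0302",  # combining circumflex: falling (high then low)
    ("M",): "\u0304",      # combining macron: mid level
}


def mark_tone(vowel, tones):
    """Return the vowel carrying the diacritic for the given tone sequence."""
    mark = TONE_MARKS[tuple(tones)]
    # NFC composes the vowel and the diacritic into one character where possible.
    return unicodedata.normalize("NFC", vowel + mark)


for tones in (["H"], ["L"], ["L", "H"], ["H", "L"], ["M"]):
    print("".join(tones), mark_tone("e", tones))
```

Treating a rising tone as the sequence L then H, rather than as a primitive, is what later allows a rising tone to be derived by attaching a leftover H to a vowel that already carries L.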

Goldsmith's data was from Margi (a language spoken in Nigeria), in which he analysed phonologically the change in the tonal shape of an [ari] suffix when segments were deleted. The addition of an [ari] suffix turns an indefinite into a definite. As these examples show, when the indefinite ends in a consonant, the stem is unchanged in the definite and the suffix is always [árì]:

  indefinite    definite
  sál           sálárì          'man'
  kátsákár      kátsákárárì     'sword'
  àɡám          àɡámárì         'ram'
  kùm           kùmárì          'meat'

When the indefinite ends in a vowel, there are various changes both to the vowel of the suffix and to its tone. Consider first the case in which the final vowel of the indefinite has a high tone:

  indefinite    definite
                kjárì           'compound'
  kú            kwárì           'goat'
  táɡú          táɡwárì         'horse'
  ʃèré          ʃèrérì          'court'
  tóró          tórórì          'threepence'
  ncàlá         ncàlárì         'calabash'

The above cases can in fact be handled in a linear SPE model. We can have a rule which turns a high vowel into a glide when it precedes a vowel (thus [i] and [u] become [j] and [w] respectively), and a rule by which, elsewhere, the vowel of the stem replaces the vowel of the suffix (thus [toro+ari] becomes [torori], [ʃere+ari] becomes [ʃereri], etc.).
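As a rough illustration of how these two linear rules operate on the segments alone, here is a small sketch that ignores tone entirely. The function name `attach_definite`, the five-vowel set and the string encoding are assumptions made for the example, not a claim about the full Margi sound system.

```python
# A sketch of the segmental rules only (tones are deliberately left out).
GLIDES = {"i": "j", "u": "w"}  # high vowel -> corresponding glide
VOWELS = set("aeiou")          # assumed vowel set for this sketch


def attach_definite(stem, suffix="ari"):
    """Add the definite suffix to a toneless stem."""
    last = stem[-1]
    if last in GLIDES:
        # A high vowel before the suffix vowel becomes a glide.
        return stem[:-1] + GLIDES[last] + suffix
    if last in VOWELS:
        # Elsewhere the stem vowel replaces the suffix vowel.
        return stem + suffix[1:]
    # Consonant-final stems take the suffix unchanged.
    return stem + suffix


for stem in ("sal", "ku", "toro", "ʃere"):
    print(stem, "->", attach_definite(stem))
```

Because each form is a single string of segments, there is nowhere in this representation to keep the tone of a suffix vowel once that vowel has gone, which is the limitation that the low-tone data expose.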

Now consider the following cases in which the indefinite ends in a vowel with a low tone. We see exactly the same segmental changes as before (change a high vowel to a glide, replace the vowel of the suffix with the last vowel of the indefinite elsewhere), but additionally, the tone of the penultimate vowel changes from high to rising:

  indefinite    definite
  fà            fǎrì            'farm'
                tjǎrì           'mourning'
  hù            hwǎrì           'grave'
  cédè          céděrì          'money'

These cases are more difficult to handle in an SPE framework in which the tone would be assumed to be an intrinsic part of the vowel's feature matrix. This is because a closer analysis shows that the vowel of the suffix is deleted although its tone is not. For example, we can analyse [céděrì] as:

  cédè + árì

Now delete the vowel of the suffix, as before, but do not delete its tone:

  cédè +  ́ rì

We are left with an unattached high tone (the ́ tone which is left after the [a] vowel has been deleted). We now need a rule which says that unattached tones must be attached to the nearest vowel. Consider in the light of this that the final vowel of [cédè] is low (L) and the unattached tone is high (H). The combination of the two gives a low tone followed by a high tone (L then H) which is the same as a rising tone (or equivalently: cédè + ́ becomes cédě). Phonologists would deal with these rule changes in an autosegmental representation in which tones and segments appear on separate levels.

For example, the UR for the definite form of 'farm', [fǎrì], would be:

A rule applies to the segmental level to delete the vowel of the suffix when it is preceded by a vowel:

We now have an unattached H tone which is associated to the nearest vowel:

In other words, we have an LH (=rising) tone attached to the first vowel i.e. we correctly derive [fǎrì].
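The derivation of [fǎrì] can be traced step by step in a short sketch. The encoding is an assumption made for illustration: the segmental and tonal tiers are plain lists, association lines are a dictionary from tone index to segment index, and the helper names `delete_segment`, `dock_floating` and `tones_by_segment` are hypothetical. "Nearest vowel" is interpreted here as the vowel closest to the gap left by the deleted segment.

```python
VOWELS = set("aeiou")  # assumed vowel set for this sketch


def delete_segment(segments, links, idx):
    """Delete a segment on the segmental tier; its tone is left floating."""
    new_segments = segments[:idx] + segments[idx + 1:]
    new_links, floating = {}, {}
    for tone, seg in links.items():
        if seg == idx:
            floating[tone] = idx  # index of the segment now to the right of the gap
        else:
            new_links[tone] = seg - 1 if seg > idx else seg
    return new_segments, new_links, floating


def dock_floating(segments, links, floating):
    """Associate each floating tone to the vowel nearest to its gap."""
    links = dict(links)
    vowels = [i for i, s in enumerate(segments) if s in VOWELS]
    for tone, gap in floating.items():
        # Distance from the gap: left neighbours count from gap - 1 downwards.
        links[tone] = min(vowels, key=lambda i: gap - i if i < gap else i - gap + 1)
    return links


def tones_by_segment(segments, tones, links):
    """List each segment with the tones attached to it, in tonal-tier order."""
    attached = {}
    for tone in sorted(links):
        attached.setdefault(links[tone], []).append(tones[tone])
    return [(seg, "".join(attached.get(i, []))) for i, seg in enumerate(segments)]


# UR: f a + a r i on the segmental tier, L H L on the tonal tier.
segments = ["f", "a", "a", "r", "i"]
tones = ["L", "H", "L"]
links = {0: 1, 1: 2, 2: 4}

new_segments, new_links, floating = delete_segment(segments, links, 2)
final_links = dock_floating(new_segments, new_links, floating)
print(tones_by_segment(new_segments, tones, final_links))
```

The first vowel ends up with the sequence LH, i.e. a rising tone, and the final vowel keeps its L, which corresponds to [fǎrì].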

The cases in which the vowel of the indefinite is turned into a glide ('mourning', 'grave') can be dealt with in a similar autosegmental way, but in this case we have to have a rule which says that tones can only associate to vowels. Specifically, the UR for the definite 'grave' would be:

We apply a rule to the segmental level which says that high vowels turn into glides when they precede a vowel:

But we now have a tone (L) associated with a consonant [w], which is disallowed (because tones can only associate to vowels). We therefore need a rule at the tonal level that will link the L tone to the nearest vowel (i.e. to the following [a]):

= [hwǎrì]
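The glide case can be sketched in the same spirit. Again the tier encoding and the names `glide_formation`, `relink_to_vowels` and `tones_by_segment` are assumptions introduced for this example only; the sketch simply applies the segmental rule and then repairs any tone that has ended up attached to a consonant.

```python
GLIDES = {"i": "j", "u": "w"}  # high vowel -> corresponding glide
VOWELS = set("aeiou")          # assumed vowel set for this sketch


def glide_formation(segments):
    """Turn a high vowel into a glide when it immediately precedes a vowel."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] in GLIDES and out[i + 1] in VOWELS:
            out[i] = GLIDES[out[i]]
    return out


def relink_to_vowels(segments, links):
    """Move any tone attached to a non-vowel onto the nearest vowel."""
    vowels = [i for i, s in enumerate(segments) if s in VOWELS]
    new_links = {}
    for tone, seg in links.items():
        if segments[seg] in VOWELS:
            new_links[tone] = seg
        else:
            new_links[tone] = min(vowels, key=lambda i: abs(i - seg))
    return new_links


def tones_by_segment(segments, tones, links):
    """List each segment with the tones attached to it, in tonal-tier order."""
    attached = {}
    for tone in sorted(links):
        attached.setdefault(links[tone], []).append(tones[tone])
    return [(seg, "".join(attached.get(i, []))) for i, seg in enumerate(segments)]


# UR: h u + a r i on the segmental tier, L H L on the tonal tier.
segments = ["h", "u", "a", "r", "i"]
tones = ["L", "H", "L"]
links = {0: 1, 1: 2, 2: 4}

glided = glide_formation(segments)
final_links = relink_to_vowels(glided, links)
print("".join(glided))
print(tones_by_segment(glided, tones, final_links))
```

No tone is deleted at any point: the L that began on [u] is simply re-associated, so the following [a] carries L and then H, i.e. the rising tone of [hwǎrì].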