Cross-Modal Effects in Repetition Priming:
A Comparison of Lip-Read, Graphic and Heard Stimuli
by
Barbara Dodd, Michael Oerlemans and Ray Robinson.
Speech and Language Research Centre
Macquarie University
North Ryde, N.S.W.
Australia.
Abstract. A series of experiments investigated the processing of lip-read information, as compared to that of heard and read stimuli, using the repetition priming paradigm. Experiment 1 showed that lip-read priming facilitated the semantic categorisation of lip-read words to the same extent as that found for auditory and graphic stimuli. Experiments 2, 3 and 4 measured the effects of cross-modal priming. Lip-reading primed both auditory and graphic processing, and was primed by both. While auditory priming did not speed the processing of graphic stimuli, graphic priming facilitated the semantic categorisation of heard words. A tentative explanation of the findings is offered: lip-reading provides incomplete information about words, and thus there is a need to access stored linguistic knowledge to 'fill in' missing features, allowing identification of the stimulus.
Until recently it was assumed that human short term memory was organised along modality specific lines. That is, heard stimuli are processed separately and differently from seen stimuli. The bases for this assumption were the results of experiments showing no cross modal interference in short term memory tasks. For example, when a list of digits has to be recalled in serial order, there is enhanced accuracy of recall for the last items of the list if it has been heard, as compared to that found for read lists (Morton and Holloway, 1970). Another paradigm that has provided evidence of modality specific processing is repetition priming. Subjects are asked to perform a verbal task (e.g. lexical decision). Some of the stimuli occur twice. Reaction times show that subjects can perform the task more quickly the second time they process words, but only if the word is presented to the same modality on both occasions (Clarke and Morton, 1983). Wood (1974) found that when two successive stimuli from two senses have to be compared, the stimuli are matched in the modality code of the second stimulus. For example, when a seen letter E was followed by a heard C, interference was more likely to occur than when a heard E was followed by a seen letter C. In the first case there is phonological similarity; in the second there is no visual (graphic) similarity. The results were interpreted as an indication that stimuli perceived by different modalities are recoded into the modality specific code of the second stimulus, for comparison.
One common feature of these experiments is that the visual stimuli were presented graphically. While reading and picture identification are important functions of the visual system, they are not representative of all visual information. Graphic and heard stimuli differ not only in that one is perceived visually and the other auditorially, but also in that graphic information is static and can be perceived as a whole, whereas auditory information is dynamic, i.e. consists of a set of serially ordered features that change over time. One type of visual stimulus that shares this latter characteristic, and is also verbal, is lip-read information.
Normally hearing subjects can lip-read approximately 25% of a silently presented word list correctly (Dodd, 1977). When a closed set is presented (e.g. numbers, colours, names) subjects make few, if any, errors. It is therefore possible to use lip-read stimuli in experimental paradigms such as serial ordered recall, priming, and attribution of modality. Recent experimental research has shown that cross modal interference effects do exist if the visual stimuli are lip-read rather than text read. There is enhanced end of list recall for lip-read lists, and cross modal suffix effects (Campbell and Dodd, 1980; Gardiner, Gathercole and Gregg, 1981). Auditory presentation primes lip-read performance, and vice versa (Campbell, in press); and subjects have more difficulty remembering if they have lip-read or heard a word, than they do remembering if they have heard or read a word (Dodd and Campbell, 1984).
Results like these have led to the conclusion that lip-read and heard speech share a degree of common processing (Summerfield, in press; Dodd and Campbell, 1984; Campbell, Dodd and Brasher, 1983). In normal, face to face, verbal communication, information from two sensory modalities is available. Seeing speech, i.e. lip-reading, affects what is heard. When seen and heard speech differ, our perception of the stimuli is not strictly in the domain of either stimulus, but is an integration of the two (McGurk and MacDonald, 1976). For example, when subjects hear /ba/, but see /ga/, they perceive /da/. McGurk and MacDonald (1976) concluded that the response "revealed the interactive relationship between seeing and hearing" (p. 253). This implies common processing of verbal auditory and lip-read stimuli. The questions arising from this conclusion concern the levels of processing at which lip-read and heard speech are integrated, and the coding relationships between lip-read, heard and graphically presented words.
Experiment 1
While it has been demonstrated that within modality repetition priming effects can be obtained for auditory, and visual (graphic) stimuli, there has been no investigation of whether prior lip-reading experience can prime lip-read performance. Before the cross modal priming effects between lip-read, auditory and graphic stimuli can be assessed, it is necessary to establish whether lip-read information is subject to within stimulus-type priming effects, and if so, the extent of the effect compared to that found for auditory and graphic stimuli.
METHOD
Subjects.
Twenty unpaid volunteers, 6 female and 14 male, acted as subjects. They were students or staff in a University department. All had Australian-English as their native language, and were aged between 18 and 45 years. None had any detected hearing loss, or uncorrected visual impairment.
Procedure.
Subjects participated in one experimental session, lasting less than half an hour, in a sound proof room. They sat facing a V.D.U., and were given a small hand held panel on which there were two buttons. One button was labelled "A" for animal, the other button, "P" for plant. Subjects were told that they would see, hear or lip-read words, and that they were to decide whether each stimulus word was an animal or a plant, and to press the appropriate button as quickly as possible. The need for accuracy was stressed.
Each subject performed three conditions: auditory priming task, auditory test; graphic (written words) priming task, graphic test; and lip-read priming task, lip-read test. The task was identical in both priming and test phases of the experiment, i.e. categorisation of words as either animal or plant. The test phase followed the priming phase after an interval of approximately two minutes. There were 10 items in the priming list, and 20 (10 primed and 10 unprimed) in the test list. The order of presentation of the three conditions was randomised across subjects.
Stimuli.
There were three lists of twenty words, matched for frequency of occurrence (Thorndike and Lorge, 1944) and syllable length. There were ten animal and ten plant words in each list. Words for the lip-reading condition were carefully chosen in terms of their "lip-readability". This was achieved by limiting the words to a closed set (Australian native animals and common plants). The final "to be lip-read" word list was determined by asking 20 students to lip-read a long list of words, and selecting only those words that were lip-read accurately by at least 16 of the students in a noisy, distracting environment (the enrolment hall). Only one experimental subject showed a significant lack of accuracy when lip-reading the stimuli, and he was replaced. The other two word lists, for graphic and auditory conditions, were alternated. That is, while each subject had the same stimuli in the lip-read condition, half the subjects heard one of the other lists, which the other subjects read. Each list of twenty words was further divided into two lists of ten containing five animals and five plants: List A and List B. Half the subjects received List A in the priming task, and half List B.
In both priming and test phases the stimuli were presented at a rate of one every 5 seconds. In the lip-reading condition, subjects watched the V.D.U. showing a life size head in colour which remained on the screen for the entire stimulus presentation, but in the interstimulus interval the presenter did not look at the camera. The presentation was silent, since no audio track had been recorded. In the graphic condition (S100 microcomputer driven) subjects saw words in upper case (height: 15mm), in bold, enclosed in a box in the centre of the screen. The word was visible for 0.5 of a second, approximately the length of time taken to say each word. In the auditory condition subjects heard the stimuli through headphones.
Measurement.
The dependent variable was reaction time. When subjects pressed a button to categorise the stimuli as animals or plants, they stopped a timer that had been activated as each stimulus was presented. In the auditory and lip-read conditions this was done by placing a tone (not heard by the subjects) on a second channel of the stimulus tape that coincided with the onset of the stimulus. In the graphic condition, presentation of the stimulus item activated the timer. At the end of each test phase the printer provided a sheet stating subject name, order of condition presentation, prime list (A or B), condition tested, and each stimulus word of the test list, with subjects' categorisation choice, and reaction time.
RESULTS.
Each subject's data were analysed to provide the mean reaction time for the ten primed and ten unprimed words in each condition (Figure 1). A two factor Analysis of Variance (condition: lip-read, auditory, graphic; priming: primed or unprimed) was used to analyse the data. The conditions term was significant (F=178.1, df 2,34, p<.001). Post hoc tests indicated that reaction times were shorter in the graphic condition than in the auditory condition; reaction times in the lip-read condition were longest. The priming term was also highly significant (F=37.5, df 1,17, p<.001). Priming resulted in consistently faster reaction times. The interaction term was not significant, indicating that the extent of the priming effect was the same for all three conditions. Accuracy was high in all conditions. The mean number of categorisation errors was 3.7 for the lip-read condition, 0.65 for the graphic condition, and 1.1 for the auditory condition.
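The first step of this analysis, reducing each subject's trials to a mean reaction time per cell, can be sketched as follows. This is a minimal illustration only: the trial values are invented, and the function names `cell_means` and `priming_effect` are hypothetical, not part of the original procedure.

```python
from statistics import mean

# Hypothetical trial records for one subject:
# (condition, primed?, reaction time in msec).
# These values are invented for illustration; they are not experimental data.
trials = [
    ("lip-read", True, 1510), ("lip-read", False, 1720),
    ("lip-read", True, 1490), ("lip-read", False, 1680),
    ("auditory", True, 980), ("auditory", False, 1040),
    ("graphic", True, 640), ("graphic", False, 700),
]


def cell_means(records):
    """Return {(condition, primed): mean RT} for one subject's trials."""
    cells = {}
    for condition, primed, rt in records:
        cells.setdefault((condition, primed), []).append(rt)
    return {key: mean(rts) for key, rts in cells.items()}


def priming_effect(means, condition):
    """Unprimed minus primed mean RT; a positive value indicates facilitation."""
    return means[(condition, False)] - means[(condition, True)]


means = cell_means(trials)
```

The per-subject cell means produced in this way are the scores that would be entered into the analysis of variance.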
FIGURE 1
DISCUSSION.
Experiment 1 showed that lip-read stimuli can be successfully used in the repetition priming paradigm. Although reaction times in the lip-reading condition were significantly longer than those for the graphic and the auditory condition, the extent of priming advantage did not differ across conditions. One contributing factor to the short reaction times for graphic stimuli is that they can be perceived as a whole at the moment of display, whereas identification of both auditory and lip-read stimuli awaits the completion of the stimulus presentation. The finding that lip-read stimuli take longer to process than auditory stimuli may result from lip-read stimuli providing incomplete information. That is, subjects need to "fill in" missing features, e.g. voicing, from stored representations of words. This additional processing would result in longer reaction times.
Experiment 1 provides evidence that it is possible to use lip-read stimuli in cross modal repetition priming experimental designs. A pilot experiment had indicated that lip-read and auditory verbal stimuli elicited cross modal priming effects (Campbell, in press). This preliminary finding was not surprising since interference effects between these two types of stimuli are easily obtained in other short term memory paradigms. Other researchers have demonstrated that auditory priming does not speed the processing of graphic stimuli (e.g. Clarke and Morton, 1983). Experiment 2 tested the effect of auditory priming on the semantic categorisation of lip-read and read words.
Experiment 2.
METHOD.
Subjects.
The subjects were 20 unpaid volunteers recruited from the undergraduate population of Macquarie University. There were 14 females and 6 males. All of the subjects had normal hearing and (corrected) vision. They were aged between 18 and 45 years. None of the subjects had participated in Experiment 1.
Procedures and materials.
The procedure was similar to that used in Experiment 1. Subjects were presented with a list of words which had to be categorised as plant or animal in a priming phase and a test phase. The test phase consisted in one case of the list being presented graphically on the computer visual display, and in the other case, lip-read on a television monitor (see Experiment 1 for details). In both cases the priming task was auditory. The order of presentation of the graphic test and lip-read test was alternated. The two word lists were alternated across conditions. In the priming task, half the subjects were presented with 10 of the words (5 animal and 5 plant), other subjects were primed with the remaining ten words.
RESULTS
Each subject's data were analysed to provide a mean reaction time for the ten primed and ten unprimed words in each condition (Figure 2). A two factor analysis of variance was conducted on these scores (lip-read versus graphic test; primed versus unprimed). The results showed that the conditions term was significant (F=263.2, df 1,19, p<.001). Post hoc tests showed that graphic reaction times were significantly shorter than lip-read reaction times. The effect of priming was also significant (F=5.77, df 1,19, p=.025). Priming again resulted in faster reaction times. However, the interaction term was significant (F=5.317, df 1,19, p=.03), indicating that the two conditions of presentation (lip-read and graphic) were affected differentially by the auditory priming encounter. Post hoc tests showed that an auditory priming task reduced reaction time for a lip-read stimulus but not for a graphic stimulus in the test encounter. The mean number of categorisation errors was 3.6 for the lip-read condition, and 0.65 for the graphic condition.
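For a design in which every subject contributes a mean to each of the four cells (two test modes by primed versus unprimed), the priming and interaction F ratios with df 1, n-1 can be obtained as squared one-sample t statistics on within-subject contrasts. The sketch below assumes that fully within-subject layout and the conventional subject-by-factor error term; the paper does not state how its analysis was computed, and the dictionary keys and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_f(scores):
    """F(1, n-1) for the hypothesis that the mean of `scores` is zero (t squared)."""
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, (1, n - 1)


def priming_and_interaction(subjects):
    """Return ((F, df) for priming, (F, df) for the interaction).

    `subjects` holds one dict per subject with four cell means in msec.
    The keys are hypothetical names chosen for this sketch:
    lip_unprimed, lip_primed, graphic_unprimed, graphic_primed.
    """
    # Priming main effect: unprimed minus primed, averaged over the two test modes.
    priming = [
        ((s["lip_unprimed"] - s["lip_primed"])
         + (s["graphic_unprimed"] - s["graphic_primed"])) / 2
        for s in subjects
    ]
    # Interaction: the lip-read priming effect minus the graphic priming effect.
    interaction = [
        (s["lip_unprimed"] - s["lip_primed"])
        - (s["graphic_unprimed"] - s["graphic_primed"])
        for s in subjects
    ]
    return one_sample_f(priming), one_sample_f(interaction)
```

With 20 subjects this yields df 1,19 for both terms, matching the degrees of freedom reported here; a large interaction F indicates that the priming effect differs between the two test modes.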
FIGURE 2
DISCUSSION.
Auditory priming can significantly speed the categorisation of subsequently lip-read words (mean reduction in reaction time of 216 msec). This finding is consistent with the results from other paradigms, e.g. serial ordered recall, showing that lip-read and auditory stimuli appear to share a common processing stage. Despite lip-read information being perceived visually, its phonological or dynamic characteristics result in lip-read stimuli being processed as if they had been heard. Auditory priming did not significantly speed the categorisation of graphically presented words (mean reduction in reaction time of 10 msec). This finding is in agreement with those of Monsell (1985) and others, showing that the processing of graphic stimuli is not influenced by prior auditory experience.
The third experiment investigated the effect of a graphic priming task on the semantic categorisation of lip-read and auditory words. The findings of Experiment 2, and previous reviews of research (e.g. Allport and Funnell, 1981), suggest that graphic priming should not affect the speed of processing heard words. No previously published reports have measured the effect of graphic priming on the reaction times for lip-read word semantic categorisation. However, a pilot experiment (Campbell, in press) indicated that graphic priming improved the accuracy of lip-read word recognition.
Experiment 3
METHOD
Subjects.
The subjects were 20 undergraduates enrolled at Macquarie University; 13 females and 7 males. No subject took part in more than one of the experiments. All subjects were aged between 18 and 45 and had normal hearing, and (corrected) vision.
Procedures and materials.
The procedure was similar to the other experiments although in this case, a graphic stimulus was used in the priming encounter and the test encounter was comprised of a lip-read and an auditory condition. The order of presentation of the test and priming lists was randomised across subjects.
RESULTS.
The data were analysed with a two factor analysis of variance. The conditions term was significant (F=6.1, df 1,19, p<.025) indicating that reaction times in the lip-read task were significantly longer than those in the auditory task (see Figure 2). The effect of priming was also significant (F=25.6, df 1,19, p<.001). The interaction term was not significant (F=2.4, df 1,19, p=.138), indicating that the extent of the priming effect did not differ for the auditory and lip-read tests. The mean number of categorisation errors was 3.6 for the lip-read task, and 1.3 for the auditory task.
DISCUSSION
The results indicate that a graphically presented priming task increases the speed of semantic categorisation of both auditory (mean reduction in reaction time of 69 msec) and lip-read (mean reduction of 179 msec) stimuli. The finding that reading words can prime their processing when they are heard is at odds with many previous findings (e.g. Clarke and Morton, 1983) and reviews of the literature (e.g. Allport and Funnell, 1981). However, it is not the first reported case of cross modal priming. Monsell (1985) reported that graphic priming speeded the processing of auditory stimuli in a lexical decision task. He commented that although cross modal effects may be numerically small compared to within modality effects, they should not be dismissed as null-effects. Experiments 2 and 3 indicated that the priming effect was asymmetrical; while a graphic stimulus primes auditory categorisation, the reverse is not true. This pattern replicates Monsell's (1985) findings. Experiment 4 investigated whether a lip-read input, which was categorised faster when primed by auditory and graphic stimuli, could in turn prime semantic categorisation of heard and read words.
Experiment 4.
METHOD.
Subjects.
Twenty undergraduates, 8 male and 12 female, volunteered for the experiment. They were all aged between 18 and 45 and possessed normal hearing and (corrected) vision.
Procedures and materials.
The procedure followed that of the other experiments. In this experiment the priming stimuli were lip-read words and the test stimuli were auditorially and graphically presented words.
RESULTS.
An analysis of variance was performed on the mean reaction time scores of subjects. The conditions term was significant (F=188.3, df 1,19, p<.001) indicating that reaction times were shorter for the graphic test than for the auditory test (see Figure 2). The effect of priming was also significant (F=5.981, df 1,19, p<.025). The interaction term was not significant (F<1). That is, lip-read priming reduced reaction times equally in an auditory categorisation task and a graphic categorisation task. The mean number of categorisation errors was 0.75 for the graphic task, and 1.25 for the auditory task.
DISCUSSION.
Prior lip-read experience of words facilitated their semantic categorisation when they were heard and read. While the interactive relationship between lip-read and heard speech was predicted from the results of other paradigms, the priming of read words by lip-reading was surprising. Pilot experiments had indicated that while graphic priming could improve the identification of lip-read words, prior lip-reading experience did not speed the processing of read words (Campbell, in press). Since lip-reading is a more difficult, and less familiar task than hearing or reading words, it is likely that any prior information about what a lip-read stimulus might be would be used, irrespective of its modality of input. It is more difficult to explain why lip-read experience should enhance the processing of read words. A comparison of the four experiments might clarify the pattern of findings.
Comparison of experiments 1, 2, 3 and 4
Table 1 sets out the mean difference scores (unprimed - primed), and the percent difference expressed as a proportion of total reaction time for each condition. A two factor analysis of variance (prime mode, and test mode) using percent difference scores, where each condition was treated as an independent group, was used to compare the four experiments. The priming term was not significant (F<1), i.e. the type of priming (auditory, graphic and lip-read) did not influence the extent of the priming effect. Obviously the large analysis treating each condition's scores independently swamped the finding that auditory priming does not affect the semantic categorisation of read words (see Experiment 2). This is hardly surprising as all other conditions show a significant priming effect. The test term was significant (F=3.7, df 2,65, p<.025), i.e. mode of test stimuli presentation affected the extent of the priming effect. Inspection of Table 1 shows that the priming effect was strongest for lip-read test stimuli. The interaction term did not reach significance.
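The two scores used in this comparison can be illustrated with a short sketch. The text does not define "total reaction time" precisely, so taking the unprimed mean reaction time as the denominator is an assumption made here for illustration only; the function name `difference_scores` is likewise hypothetical.

```python
def difference_scores(unprimed_ms, primed_ms):
    """Return (difference in msec, difference as a percentage of the unprimed RT).

    Assumption: the percentage is taken relative to the unprimed mean RT.
    The paper says only "total reaction time", so another baseline may have been used.
    """
    difference = unprimed_ms - primed_ms
    percent = 100.0 * difference / unprimed_ms
    return difference, percent
```

Expressing the effect as a percentage puts conditions with very different baseline reaction times, such as lip-read and graphic tests, on a comparable scale.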
GENERAL DISCUSSION.
The experiments reported investigated the processing of lip-read words in comparison to that of heard and read words using the repetition priming paradigm. Previous research, using auditory and graphic stimuli, indicates that within modality priming effects are greater than between modality effects (see Allport and Funnell, 1981, for review). This is not true for lip-read stimuli. While lip-read priming significantly reduced reaction times in a lip-read test, both auditory and graphic priming also speeded the processing of lip-read words. The comparison across experiments (see Table 1) shows that graphic and auditory priming were at least as effective as same mode priming in the lip-read test conditions.
Lip-reading is a more difficult task than hearing or reading. Lip-read information is partial, and therefore requires the perceiver to "fill in" missing information. Research into the lip-reading skills of normally hearing, and hearing impaired subjects, has shown that the more general information available in a lip-read stimulus (syntactic, semantic, visual cues providing situational context), the greater the accuracy of report (Gregory, Plant and Dodd, in preparation). That is, lip-reading involves more "top-down" processing. The fact that reaction times are longer for lip-read stimuli than for heard and read stimuli may be explained by subjects' need to access stored information that aids identification of a perceptually unclear stimulus. Any relevant information, whatever its modality of perception, or form of expression (e.g. graphic), would be likely to be accessed.
Although the finding that auditory priming dramatically speeds the processing of lip-read words is at odds with the modality specific rule for repetition priming effects, it fits with research indicating that lip-read and heard speech share a common stage of processing in short term memory (e.g. Campbell and Dodd, 1980). In normal face to face communication normally hearing people perceive speech bimodally. That the visual (lip-read) component cannot be ignored is clearly demonstrated by the fusion illusion. A heard /ba/ dubbed onto the lip movements for /ga/ results in the perception of /da/ (McGurk and MacDonald, 1976). Despite being perceived by different modalities, lip-read and heard speech share two important characteristics: both are phonological, and consist of changing state (dynamic) information. Their relationship is, therefore, closer than that between auditory and graphic, or lip-read and graphic representations of words.
Written words are static, and can be perceived as a whole at the moment of display. While reading is usually assumed to involve mediation via phonological encoding, such coding is not essential. Studies of brain injured patients show that there are two routes for accessing meaning from print. One follows a phonological route, the other accesses meaning directly without phonological recoding (Coltheart, 1980). Subjects with normal brain function may use either route according to the demands of the task.
In the repetition priming paradigm the differential effects found for auditory and graphic (visual) stimuli may be due to the processing route used rather than the modality of stimulus perception. In a task where speed of response is the known measure, and the decision is one of semantic categorisation, a direct, nonphonological route would be more efficient. This rather speculative hypothesis may partly explain why reaction times are faster in the graphic test conditions, and why auditory priming does not affect the graphic test. Speed of response would be increased by directly accessing the semantic system, avoiding the time consuming phonological coding of words. However, this hypothesis is weakened by the finding that prior lip-read experience of words speeds response in a graphic test.
Why should prior lip-read experience prime graphic semantic categorisation when auditory priming does not? One explanation, in the modality specific tradition, is that both lip-read and read words are perceived visually. However, acceptance of this explanation would generate the need to explain why lip-read and graphic information appear to be processed separately in short term recall tasks (see introduction), and why lip-reading primes audition (Experiment 4), and audition primes lip-reading (Experiment 2). An alternative explanation is that during a lip-read priming task, subjects access stored information from a variety of sources, including graphic representations, in order to identify a partial stimulus.
The experiments reported are an initial exploration of the repetition priming effect using lip-read stimuli. In some respects the findings are predictable. The processing of lip-read stimuli and that of heard stimuli are closely related in a task that taps lexical access. Previous research, showing the unique relationship between these two types of stimuli in serial ordered recall (e.g. Campbell and Dodd, 1980), and attribution of modality (Dodd and Campbell, 1984) paradigms, indicated that cross modal priming effects between lip-read and heard speech were highly likely. The priming interaction between lip-read and read stimuli was unexpected, and difficult to explain. Current experiments are investigating two hypotheses: that normal and degraded auditory and graphic stimuli will show a different pattern of cross modal priming, and that forcing subjects to code graphic stimuli phonologically will give rise to cross modal effects between audition and vision.
Acknowledgments.
We thank the N. H. & M. R. C. for financial assistance.
REFERENCES.
Allport, D.A. and Funnell, E. (1981). Components of the mental lexicon. Philosophical Transactions of the Royal Society of London, B295, 397-410.
Campbell, R. (in press). Lip-reading and immediate memory processes. In B. Dodd and R. Campbell (Eds.) Hearing by Eye. London: Erlbaum.
Campbell, R. and Dodd, B. (1980). Hearing by eye. Quarterly Journal of Experimental Psychology, 32, 85-99.
Campbell, R., Dodd, B. and Brasher, J. (1983). The sources of visual recency: movement and language in serial recall. Quarterly Journal of Experimental Psychology, 35A, 571-587.
Clarke, R. and Morton, J. (1983). Cross-modality facilitation in tachistoscopic word recognition. Quarterly Journal of Experimental Psychology, 35A, 79-96.
Coltheart, M. (1980). Reading, phonological recoding and deep dyslexia. In M. Coltheart, K. Patterson and J.C. Marshall (Eds.), Deep Dyslexia. London: Routledge and Kegan Paul.
Dodd, B. (1977). The role of vision in the perception of speech. Perception, 6, 31-40.
Dodd, B. and Campbell, R. (1984). Non-modality specific speech coding: the processing of lip-read information. Australian Journal of Psychology, 36, 171-179.
Gardiner, J.M., Gathercole, S. and Gregg, V.H. (1981). Lip-reading and auditory memory. Paper to Annual Meeting of the Psychonomic Society, Philadelphia.
Gregory, M., Plant, G. and Dodd, B. (in preparation). The lip-reading skills of adults with acquired hearing loss.
McGurk, H. and MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746-748.
Monsell, S. (1985). Repetition and the lexicon. In A.W. Ellis (Ed.), Progress in the Psychology of Language, vol. 2. London: Erlbaum.
Morton, J. and Holloway, C.M. (1970). Absence of a cross modal 'suffix effect' in short term memory. Quarterly Journal of Experimental Psychology, 22, 167-176.
Summerfield, A.Q. (in press). Some preliminaries to a comprehensive account of audio-visual speech perception. In B. Dodd and R. Campbell (Eds.), Hearing by Eye. London: Erlbaum.
Thorndike, E.L. and Lorge, I. (1944). The Teacher's Word Book of 30,000 Words. New York: Columbia University.
Wood, L.E. (1974). Visual and auditory coding in a memory matching task. Journal of Experimental Psychology, 102, 106-113.