Sandhya, V., Vinay, & Manchaiah, V.
American Journal of Audiology, In Press.
Publication year: 2021

Purpose: Multimodal sensory integration in audiovisual speech perception is a naturally occurring phenomenon. Modality-specific responses (auditory left, auditory right and visual) to dichotic incongruent audiovisual speech stimuli help in understanding how audiovisual speech is processed through each input modality. The distribution of activity in the frontal motor areas involved in speech production has been shown to correlate with how subjects perceive the same syllable differently or perceive different syllables. The present study investigated the distribution of modality-specific responses to dichotic incongruent audiovisual speech stimuli by simultaneously presenting consonant-vowel (CV) syllables with different places of articulation to the participants’ left ear, right ear and visual modality.

Design: A dichotic experimental design was adopted. Six stop CV syllables (/pa/, /ta/, /ka/, /ba/, /da/ and /ga/) were assembled to create dichotic incongruent audiovisual speech material. Participants were 40 native speakers of Norwegian (20 females, mean age = 22.6 years, SD = 2.43 years; 20 males, mean age = 23.7 years, SD = 2.08 years).

Results: Under dichotic listening conditions, velar CV syllables yielded the highest scores in the respective ears, which might be explained by the stimulus dominance of velar consonants shown in previous studies. However, when the dichotic auditory stimuli were accompanied by an incongruent video segment, the visually distinct video segment possibly drew the attention of some participants, thereby reducing the overall recognition of the dominant syllable. Furthermore, the findings suggest that females may have shorter response times to incongruent audiovisual stimuli than males.

Conclusion: Identification of the left audio, right audio and visual segments in dichotic incongruent audiovisual stimuli depends on the place of articulation, stimulus dominance and voice onset time of the CV syllables.