P35
Session 1 (Thursday 12 January 2023, 15:30-17:30)
The effect of voice training on speech perception and listening effort
Background: In multiple-talker situations, normal-hearing listeners can effectively use fundamental frequency (F0) and vocal-tract length (VTL) voice cues to segregate target speech from masker speech. Listening in such situations may be especially challenging and effortful when the speech signal is degraded, as for cochlear implant users or in vocoder listening, because sensitivity to F0+VTL has been shown to be reduced. Previous studies show that voice exposure through implicit or explicit voice training can improve speech intelligibility in normal-hearing listeners. Our aim was to investigate whether voice training improves sensitivity to F0+VTL voice cues and to examine, by means of pupillometry, whether listening effort is affected. Measuring listening effort provides information on the cognitive load expended during voice cue discrimination following voice training. We addressed both questions for non-vocoded and vocoded speech.
Methods: Normal-hearing adult participants listened to an audiobook and answered content-related questions for 30 minutes as implicit short-term voice training. Subsequently, voice cue sensitivity (via just-noticeable differences, JNDs, for F0+VTL) and listening effort (via pupillometry) were measured for both non-vocoded and vocoder-degraded speech. The JNDs were measured with an adaptive three-alternative forced-choice (3AFC) odd-one-out task, with consonant-vowel (CV) triplets presented as stimuli. Acoustically and linguistically, the CV triplets were either fixed (same) or variable (different) across the three items. Pupillometry data were quantified as peak pupil dilation and analyzed with Generalized Additive Mixed Models (GAMMs).
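The abstract does not state how peak pupil dilation was computed. As an illustration only, the sketch below shows one common way to derive it for a single trial: subtract the mean pupil size in a pre-stimulus baseline window from the largest pupil size after stimulus onset. The function name, the 1-second baseline window, the sampling rate, and the example trace are all assumptions made for this sketch and are not taken from the study.

```python
from statistics import mean


def peak_pupil_dilation(trace, sample_rate_hz, baseline_s=1.0):
    """Return the baseline-corrected peak pupil dilation for one trial.

    Assumptions (hypothetical, not from the study):
    - `trace` holds pupil-size samples for a single trial,
    - the first `baseline_s` seconds were recorded before stimulus onset,
    - blinks and other artifacts have already been removed.
    """
    n_baseline = int(round(baseline_s * sample_rate_hz))
    if n_baseline < 1 or n_baseline >= len(trace):
        raise ValueError("baseline window must cover part, but not all, of the trace")

    # Mean pupil size before stimulus onset serves as the reference level.
    baseline = mean(trace[:n_baseline])

    # Peak dilation is the largest post-onset sample relative to that baseline.
    return max(trace[n_baseline:]) - baseline


# Made-up trace sampled at 4 Hz: 1 s of baseline, then the response.
trial = [4.0, 4.0, 4.0, 4.0, 4.1, 4.3, 4.6, 4.4, 4.2]
ppd = peak_pupil_dilation(trial, sample_rate_hz=4, baseline_s=1.0)
```

A single peak value discards the time course of the response, which is one reason a time-series approach such as GAMMs may be used alongside it.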
Results: The F0+VTL JNDs were significantly larger in vocoded than in non-vocoded conditions, and larger when variable items were presented than when fixed items were presented. Contrary to our expectations, voice training did not have a significant effect on voice cue sensitivity. The GAMM analysis showed that pupil dilations during voice discrimination were significantly larger for untrained, vocoded speech than for trained, vocoded speech. For non-vocoded speech, pupil dilations did not differ between trained and untrained voices.
Conclusions: These findings suggest that short voice exposure through listening to a story is not sufficient to improve voice cue sensitivity, since voice training had no significant effect on F0+VTL JNDs. However, this voice training appeared to reduce listening effort, as voice discrimination among vocoded voices elicited smaller pupil dilations after short-term voice training. A follow-up study is investigating how implicit and explicit voice training differentially affect intelligibility in speech-on-speech situations at different target-to-masker ratios, and how listening effort is affected by voice training; the relevant data will be presented.