T04
How voice perception affects listening effort
Listening to speech with one or more people talking in the background is known to be challenging and effortful, especially for people with hearing impairment. Being able to perceive differences in voice cues such as fundamental frequency (F0) and vocal-tract length (VTL) can help listeners segregate competing talkers, which improves speech understanding. Research has shown that cochlear implant (CI) listening reduces sensitivity to F0 and VTL voice cues, potentially contributing to difficulties in understanding speech in adverse listening conditions. The pupil dilation response has been shown to be an objective measure of cognitive processing load in adverse listening conditions, also referred to as listening effort. Using a variety of listening tasks, studies have shown that different types of speech degradation, by masking (e.g., noise vs. speech) or by vocoding, affect the pupil dilation response. However, relatively little is known about how voice perception processes, which make use of voice cue information, affect listening effort when speech is degraded.
In two studies, we systematically manipulated F0 and VTL voice cues to investigate the effect of voice discriminability on listening effort. In one study, participants performed a speech-on-speech task (sentences); in the other, participants performed voice cue discrimination tasks (CV-triplets) in which the reference voice was trained or untrained and the stimuli were either clear or vocoded. Listening effort during both tasks was measured using pupillometry.
Speech-on-speech listening improved when F0 and/or VTL were manipulated to be different between target and masker speech uttered by the same talker. Improvements in performance co-occurred with smaller pupil dilation responses recorded during listening, indicating a decrease in listening effort. At the level of voice cue discrimination, voice training resulted in a smaller pupil dilation response when listeners had to indicate F0+VTL differences between CV-triplets in vocoded speech. Interestingly, this effect was not present for non-vocoded speech and occurred in the absence of a performance benefit.
The pupil response shows a systematic decrease in listening effort when target and masker voices differ in F0 and VTL voice cues. Additionally, voice training was shown to reduce listening effort when speech was degraded by vocoding. These outcomes provide insight into the impact that voice discriminability and voice familiarity have on listening effort in normal-hearing and CI listening.