14th Speech in Noise Workshop, 12-13 January 2023, Split, Croatia

Effect of head-related transfer function individualisation on spatial release from masking in the median plane

Thibault Vicente, Lorenzo Picinali
Audio Experience Design, Dyson School of Design Engineering, Imperial College London, London, United Kingdom

Daniel González-Toledo, María Cuevas-Rodríguez, Luis Molina-Tanco, Arcadio Reyes-Lecuona
Departamento de Tecnología Electrónica, Universidad de Málaga, Málaga, Spain

Creating realistic 3D audio scenes over headphones requires accounting for the influence of the human body on the sounds arriving at the ears, which is captured by the head-related transfer function (HRTF). Each individual has a specific HRTF owing to morphological differences such as head size and pinna shape. While non-individual HRTFs are known to affect localisation accuracy and perceived realism, their effect on speech intelligibility, for example on the improvement observed when masker(s) and target are not co-located (spatial release from masking, SRM), is relatively unexplored.
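As a rough illustration of how SRM is commonly quantified, the sketch below takes SRM as the difference between the speech reception threshold (SRT) in the co-located condition and the SRT in a spatially separated condition. The function name and the threshold values are hypothetical, and the abstract does not state which threshold measure was used in the study.

```python
def spatial_release_from_masking(srt_colocated_db: float, srt_separated_db: float) -> float:
    """Return SRM in dB: positive when separation lowers (improves) the threshold."""
    return srt_colocated_db - srt_separated_db


# Hypothetical thresholds (dB target-to-masker ratio), not data from the study.
srm_db = spatial_release_from_masking(srt_colocated_db=-2.0, srt_separated_db=-6.0)
print(f"SRM = {srm_db:.1f} dB")
```

A positive value means that separating the masker from the target made the target easier to understand.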

The goal of the current study is to assess the impact of individual versus non-individual HRTFs on SRM when sources are separated along the median plane. Focusing on this plane allows us to investigate the use of monaural spectral cues related to pinna shape, which are assumed to be specific to each individual. Two experiments were designed, both inspired by studies from the literature. They use a speech-on-speech paradigm with the coordinate response measure (CRM) corpus, which consists of sentences that share the same structure and contain only colour and digit keywords.

In the first experiment, a single speech masker is presented simultaneously with the target. The target speaker always differs from the masking speaker, but both are of the same gender. The target and masker are simulated at three frontal locations: 0°, -50° and +50° of elevation. The second experiment involves two concurrent, co-located speech maskers instead of one. The masking speakers always differ from the target speaker and may be of different genders. Three masker locations are again used (0°, +45° and -45° of elevation), but the target is always simulated at 0° of elevation. Whereas the earlier studies from the literature involved only individual HRTFs and native English-speaking participants, the current study also explores the use of a generic HRTF and the effect of native language (i.e. non-native English-speaking participants were also recruited).
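The conditions of the second experiment, as described above, can be laid out as a small grid: three masker elevations crossed with two HRTF types, with the target fixed at 0°. This is only a sketch; the variable names are illustrative, and the abstract does not describe trial counts, ordering, or any further factors.

```python
from itertools import product

TARGET_ELEVATION_DEG = 0                  # target is always at 0 degrees elevation
MASKER_ELEVATIONS_DEG = (0, 45, -45)      # the two maskers are co-located with each other
HRTF_TYPES = ("individual", "generic")

# One entry per (HRTF type, masker elevation) combination.
conditions = [
    {
        "hrtf": hrtf,
        "target_deg": TARGET_ELEVATION_DEG,
        "masker_deg": masker_deg,
        # Reference condition for SRM: maskers at the target location.
        "colocated": masker_deg == TARGET_ELEVATION_DEG,
    }
    for hrtf, masker_deg in product(HRTF_TYPES, MASKER_ELEVATIONS_DEG)
]

for condition in conditions:
    print(condition)
```

For each HRTF type, the co-located entry serves as the reference against which the two separated entries would be compared.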

The results of the first experiment showed no significant effect of HRTF on SRM for native speakers, but did show an effect for non-native speakers, although the benefit was smaller than in the original study from the literature. The second experiment produced results similar to those of the corresponding experiment from the literature (when conditions are the same): an effect of HRTF on SRM, but no effect of native language.

Last modified 2023-01-06 23:41:06