P08, Session 2 (Friday 13 January 2023, 09:00-11:00)
Results of the Clarity (Speech) Enhancement Challenge, round 2
The Clarity Enhancement Challenges (CECs) seek to facilitate the development of novel processing techniques for improving the intelligibility of speech in noise for hearing-aid users through a series of signal-processing challenges. Each challenge provides entrants with a set of stimuli for developing and testing their algorithms. Algorithms are permitted to use unlimited processing resources, but must be causal in the sense that the output at time t must be independent of any input later than t + 5 ms (i.e., in use, the algorithm could cause a lag of no more than 5 ms). The performance of the algorithms is assessed using objective measures of speech intelligibility and subjective measures conducted with a panel of hearing-impaired listeners. CEC2 featured more complex listening environments than CEC1, with multiple interfering sound sources (speech, music, household appliance sounds) within a simulated living-room environment at signal-to-noise ratios (SNRs) from -12 to +4 dB. In addition, head rotation towards the target speech was introduced. Target speech came from a new dataset of 10,000 different English sentences (40 actors speaking 250 sentences each; Graetzer et al., 2022). Algorithm inputs included the signals from 6 microphones (3 on each behind-the-ear hearing aid) and the current head orientation. The objective assessment was provided by HASPI (Kates & Arehart, 2021); a parallel series of Clarity Prediction Challenges (CPCs) seeks to improve methods of objective assessment. All 18 entries achieved substantial improvements in HASPI, averaging 0.55 across all systems and SNRs. Improvements were greatest (averaging 0.61) for SNRs between -8 and 0 dB. The best-performing system achieved HASPI scores above 0.9 for all SNRs. Results from the CEC2 subjective tests will be available by the time of the meeting. The next challenge is CPC2, which will begin in the spring.
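The 5 ms causality constraint described above can be illustrated with a minimal sketch. It is not the challenge's own compliance check: the function names, the toy smoothing processor, and the 44.1 kHz sample rate used in the example are assumptions made here for illustration only. The sketch converts the lookahead limit into a sample count and then tests empirically whether perturbing the input beyond the allowed lookahead changes the output up to time t.

```python
def lookahead_samples(fs_hz, lookahead_ms=5.0):
    """Whole number of future samples that fit within the allowed lookahead."""
    return int(fs_hz * lookahead_ms / 1000.0)


def smooth_with_lookahead(x, n_future):
    """Toy processor (hypothetical): mean of the current sample and up to
    n_future future samples. It stands in for a real enhancement system."""
    out = []
    for t in range(len(x)):
        window = x[t : t + n_future + 1]
        out.append(sum(window) / len(window))
    return out


def respects_lookahead(process, x, t, n_allowed):
    """Empirical check: altering the input after sample t + n_allowed
    must leave the output at samples 0..t unchanged."""
    perturbed = list(x)
    for i in range(t + n_allowed + 1, len(x)):
        perturbed[i] += 1.0
    return process(x)[: t + 1] == process(perturbed)[: t + 1]


if __name__ == "__main__":
    fs = 44100  # assumed sample rate, for illustration only
    n_allowed = lookahead_samples(fs)
    signal = [float(i % 7) for i in range(2000)]
    ok = respects_lookahead(
        lambda s: smooth_with_lookahead(s, n_allowed), signal, 500, n_allowed
    )
    print(n_allowed, ok)
```

A perturbation test of this kind can only reveal violations for the signal and time index tried; it does not prove that a system satisfies the constraint in general.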
References:
- Graetzer, S., Akeroyd, M.A., Barker, J., Cox, T.J., Culling, J.F., Naylor, G., Porter, E., Viveros-Muñoz, R. (2022) “Dataset of British English speech recordings for psychoacoustics and speech processing research: The clarity speech corpus” Data in Brief, 41, 107961.
- Kates, J.M., Arehart, K.H. (2021) “The hearing-aid speech perception index (HASPI) version 2” Speech Communication, 131, 35-46.