Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

10 Citations (Scopus)

Abstract

Speech enhancement (SE) has proved effective in reducing noise in noisy speech signals for downstream automatic speech recognition (ASR), where a multi-task learning strategy is employed to jointly optimize the two tasks. However, the enhanced speech learned with the SE objective may not always yield good ASR results. From the optimization view, there is sometimes interference between the gradients of the SE and ASR tasks, which can hinder multi-task learning and ultimately lead to sub-optimal ASR performance. In this paper, we propose a simple yet effective approach called gradient remedy (GR) to resolve interference between task gradients in noise-robust speech recognition, from the perspectives of both angle and magnitude. Specifically, we first project the SE task's gradient onto a dynamic surface that is at an acute angle to the ASR gradient, in order to remove the conflict between them and assist ASR optimization. Furthermore, we adaptively rescale the magnitudes of the two gradients to prevent the dominant ASR task from being misled by the SE gradient. Experimental results show that the proposed approach effectively resolves the gradient interference and achieves relative word error rate (WER) reductions of 9.3% and 11.1% over the multi-task learning baseline on the RATS and CHiME-4 datasets, respectively. Our code is available on GitHub.
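
The abstract describes the two steps of GR (an angle correction followed by a magnitude rescaling) without giving formulas. The sketch below is only one possible reading of that description and is not the authors' implementation. It assumes that a conflict means a negative cosine between the two gradients, that the corrected SE gradient should sit at a fixed acute angle `theta` to the ASR gradient, and that the rescaling caps the SE gradient norm at `max_ratio` times the ASR gradient norm. The function name `remedy_se_gradient` and the parameters `theta`, `max_ratio`, and `eps` are hypothetical.

```python
import numpy as np


def remedy_se_gradient(g_se, g_asr, theta=np.pi / 4, max_ratio=1.0, eps=1e-12):
    """Illustrative angle-and-magnitude correction of an SE gradient.

    All parameters are assumptions made for this sketch; the abstract
    does not specify them.
    """
    g_se = np.asarray(g_se, dtype=float)
    g_asr = np.asarray(g_asr, dtype=float)

    se_norm = np.linalg.norm(g_se)
    asr_norm = np.linalg.norm(g_asr)
    if se_norm < eps or asr_norm < eps:
        return g_se.copy()  # nothing to correct against

    u_asr = g_asr / asr_norm              # unit vector along the ASR gradient
    cos_phi = float(g_se @ u_asr) / se_norm

    # Angle step (assumed rule): on conflict, keep the component of g_se that
    # is orthogonal to g_asr and replace the parallel component so that the
    # result lies at angle theta (acute) to g_asr.
    if cos_phi < 0.0:
        orth = g_se - (g_se @ u_asr) * u_asr
        g_se = orth + (np.linalg.norm(orth) / np.tan(theta)) * u_asr

    # Magnitude step (assumed rule): cap the SE gradient norm relative to the
    # ASR gradient norm so that the ASR task stays dominant.
    se_norm = np.linalg.norm(g_se)
    limit = max_ratio * asr_norm
    if se_norm > limit:
        g_se = g_se * (limit / se_norm)
    return g_se


if __name__ == "__main__":
    g_asr = np.array([1.0, 0.0])
    g_se = np.array([-1.0, 1.0])          # obtuse angle to g_asr: a conflict
    corrected = remedy_se_gradient(g_se, g_asr)
    print("corrected SE gradient:", corrected)
    print("dot with ASR gradient:", float(corrected @ g_asr))
```

In a joint training loop, a correction of this kind would be applied to the shared-parameter gradients before the optimizer step; a fixed `theta` is used here only for simplicity, whereas the abstract calls the projection surface dynamic.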
Original language: English
Title of host publication: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE Xplore
Number of pages: 5
ISBN (Electronic): 978-1-7281-6327-7
ISBN (Print): 978-1-7281-6328-4
Publication status: Published - 4 Jun 2023
Event: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing: 48th ICASSP - Rodos Palace Luxury Convention Resort, Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023
Conference number: 48th
https://2023.ieeeicassp.org/

Conference

Conference: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing
Country/Territory: Greece
City: Rhodes Island
Period: 4/06/23 to 10/06/23

Bibliographical note

This research is supported by National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG2-100E-2022-10).

Keywords

  • Gradient remedy
  • Multi-task learning
  • speech enhancement
  • noise-robust speech recognition
  • gradient interference
