Keywords
Distillation, Computer science, Regularization (linguistics), Artificial intelligence, Training set, Domain (mathematical analysis), Speech recognition, Machine learning, Labeled data, Resource (disambiguation), Training (meteorology), Pattern recognition (psychology), Natural language processing, Mathematics, Chromatography, Computer network, Chemistry, Physics, Meteorology, Mathematical analysis
Authors
Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha
Identifier
DOI: 10.1109/icassp48485.2024.10446335
Abstract
Continued self-supervised (SSL) pre-training for adapting existing SSL models to a target domain has been shown to be extremely effective for low-resource Automatic Speech Recognition (ASR). This paper proposes Stable Distillation, a simple and novel approach for SSL-based continued pre-training that boosts ASR performance in target domains where both labeled and unlabeled data are limited. Stable Distillation employs self-distillation as regularization for continued pre-training, alleviating the over-fitting that continued pre-training commonly suffers when the source and target domains differ. Specifically, we first perform vanilla continued pre-training of an initial SSL pre-trained model on the target-domain ASR dataset and call the resulting model the teacher. Next, we take the same initial pre-trained model as a student and perform continued pre-training while enforcing its hidden representations to be close to those of the teacher (via an MSE loss). This student is then used for downstream ASR fine-tuning on the target dataset. In practice, Stable Distillation outperforms all our baselines by 0.8-7 WER when evaluated in various experimental settings.
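The two-stage procedure described in the abstract can be sketched in a few lines of PyTorch. This is only an illustrative sketch, not the authors' implementation: the TinyEncoder, the placeholder ssl_loss, the distillation weight alpha, and the random audio batches are assumptions standing in for a real SSL speech encoder (e.g., a wav2vec2-style model), its actual pre-training objective, and the unlabeled target-domain data.

```python
# Minimal, self-contained sketch of Stable Distillation's two stages.
# Everything below (encoder, loss, hyper-parameters, data) is illustrative.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Stand-in for an SSL speech encoder (e.g. a wav2vec2-style model)."""
    def __init__(self, dim=64):
        super().__init__()
        self.proj = nn.Linear(1, dim)
        self.layers = nn.GRU(dim, dim, num_layers=2, batch_first=True)

    def forward(self, wav):                     # wav: (batch, time)
        x = self.proj(wav.unsqueeze(-1))        # frame-wise features
        hidden, _ = self.layers(x)              # (batch, time, dim)
        return hidden

def ssl_loss(hidden, wav):
    """Placeholder for the model's usual SSL pre-training objective."""
    recon = hidden.mean(dim=-1)                 # toy reconstruction target
    return F.mse_loss(recon, wav)

# Stage 0: the initial SSL-pretrained checkpoint (random weights here).
initial = TinyEncoder()

# Stage 1 (teacher): vanilla continued pre-training on target-domain audio,
# then freeze it. (The ordinary pre-training loop is elided here.)
teacher = copy.deepcopy(initial)
for p in teacher.parameters():
    p.requires_grad_(False)

# Stage 2 (student): restart from the *same* initial checkpoint and continue
# pre-training while pulling its hidden states toward the teacher's via MSE.
student = copy.deepcopy(initial)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
alpha = 1.0                                     # weight of the distillation term

for step in range(10):                          # toy loop over unlabeled audio
    wav = torch.randn(4, 160)                   # fake target-domain batch
    with torch.no_grad():
        t_hidden = teacher(wav)                 # frozen teacher representations
    s_hidden = student(wav)
    loss = ssl_loss(s_hidden, wav) + alpha * F.mse_loss(s_hidden, t_hidden)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained `student` would then be fine-tuned for ASR on the labeled data.
```

Per the abstract, the key point is that the student restarts from the same initial checkpoint rather than from the teacher, so the MSE term acts as a regularizer that keeps continued pre-training from over-fitting the small target-domain set.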