Keywords
Computer Science, Leverage (statistics), Artificial Intelligence, Pattern Recognition (psychology), Semantic Feature, Feature (linguistics), Feature Extraction, Natural Language Processing, Aggregate (composite), Speech Recognition, Machine Learning, Linguistics, Philosophy, Composite Material, Materials Science
Authors
Shilian Wu, Yongrui Li, Zengfu Wang
Identifier
DOI: 10.1109/icme55011.2023.00141
Abstract
Offline handwritten Chinese text recognition (HCTR) models based on connectionist temporal classification (CTC) have recently achieved impressive results. However, due to the conditional independence assumption and per-frame prediction, CTC-based models can neither capture semantic relationships between output tokens nor leverage global visual features of characters. To address these issues, we propose a cross-modality knowledge distillation approach that uses a pre-trained language model (BERT) to transfer contextual semantic information, and we design a feature aggregation module that dynamically aggregates local and global features. Experimental results on the HCTR datasets (CASIA-HWDB, ICDAR2013, HCCDOC) show that the proposed method significantly improves recognition performance.
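To make the two components named in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: a gated local/global feature aggregation and a distillation term that pulls the recognizer's per-frame distribution toward soft targets from a language model. The gating design, the `distillation_loss` temperature, the 0.5 loss weight, and the frame-to-token alignment (assumed done upstream) are all illustrative assumptions; only the standard `nn.CTCLoss` usage follows the documented API.

```python
# Hypothetical sketch of the abstract's two ideas, not the paper's code:
# (1) cross-modality distillation toward soft targets from a pre-trained LM,
# (2) gated aggregation of local (per-frame) and global (pooled) visual features.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedFeatureAggregation(nn.Module):
    """Mixes per-frame features with a global context vector via a learned gate.

    The gating design is an assumption for illustration; the paper's
    aggregation module may differ.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, local_feats: torch.Tensor) -> torch.Tensor:
        # local_feats: (batch, time, dim)
        global_feat = local_feats.mean(dim=1, keepdim=True)   # (B, 1, D)
        global_feat = global_feat.expand_as(local_feats)      # (B, T, D)
        g = torch.sigmoid(self.gate(torch.cat([local_feats, global_feat], dim=-1)))
        return g * local_feats + (1.0 - g) * global_feat


def distillation_loss(student_logits: torch.Tensor,
                      teacher_probs: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence from teacher soft targets to the student distribution.

    `teacher_probs` stands in for BERT-derived soft targets; how frames are
    aligned to LM tokens is not specified in the abstract and is assumed
    to be handled before this call.
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, teacher_probs,
                    reduction="batchmean") * temperature ** 2


if __name__ == "__main__":
    B, T, D, V = 2, 50, 256, 7000             # batch, frames, feature dim, vocab
    feats = torch.randn(B, T, D)
    fused = GatedFeatureAggregation(D)(feats)  # (B, T, D) local+global mix

    logits = nn.Linear(D, V)(fused)            # (B, T, V) per-frame predictions

    # Standard CTC loss over the visual branch.
    log_probs = F.log_softmax(logits, dim=-1).transpose(0, 1)  # (T, B, V)
    targets = torch.randint(1, V, (B, 10))
    input_lengths = torch.full((B,), T, dtype=torch.long)
    target_lengths = torch.full((B,), 10, dtype=torch.long)
    ctc = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)

    # Distillation term against (here random) teacher soft targets.
    teacher = F.softmax(torch.randn(B, T, V), dim=-1)
    loss = ctc + 0.5 * distillation_loss(logits, teacher)  # 0.5: assumed weight
    print(float(loss))
```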