Online Handwritten Chinese Character Recognition Based on 1-D Convolution and Two-Streams Transformers

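Computer science, Artificial intelligence, Transformer, Pattern recognition (psychology), Preprocessor, Handwriting recognition, Feature extraction, Computer vision, Speech recognition, Engineering, Voltage, Electrical engineering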

Authors

Y J Chen,Hao Zheng,Yanchun Li,Wanli Ouyang,Jiang Zhu

Source

Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Date: 2023-12-05 Volume/issue: 26: 5769-5781 Citations: 5

Identifier

DOI: 10.1109/tmm.2023.3339589

Abstract

Online handwritten Chinese character recognition (OLHCCR) is a classic pattern recognition problem that has attracted many researchers. It remains challenging because of complex glyphs, numerous strokes, and a very large number of character categories. Existing methods rely on either temporal features or spatial features, which leads to recognition errors on characters written with a non-standard stroke order. This paper proposes a new OLHCCR model based on 1-D Convolution and Two-Streams Transformers. The model consists of a 1-D Transformer and a Vision Transformer, and the 1-D Transformer contains a 1-D convolution layer followed by Transformers; the overall structure is thus Two-Streams Transformers with 1-D Convolution, and the model is named C-TST. It fuses the temporal and spatial features of a Chinese character to achieve high recognition accuracy and fast recognition speed. Specifically, each online handwritten Chinese character is represented by a trajectory sequence. The original trajectory sequence is preprocessed to increase the information density of each trajectory point and the feature differences among trajectory points. The preprocessed sequence is then fed into the 1-D convolution layer to obtain shallow temporal features, which in turn serve as the input to the Transformers that capture the temporal features. In parallel, a character image is generated from the original trajectory sequence and fed into the Vision Transformer to capture the spatial features. By fusing the captured temporal and spatial features, the proposed C-TST achieves a recognition accuracy of 97.90% on ICDAR-2013 and a state-of-the-art recognition accuracy of 97.38% on IAHCC-UCAS2016. The code will be available at https://github.com/cwnuiot/Two-Streams-Transformer
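
The two inputs described in the abstract (a trajectory sequence for the temporal stream, and a character image for the spatial stream) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names `point_features`, `conv1d`, and `rasterize`, the per-point layout `[x, y, dx, dy, pen_up]`, and the grid size are all assumptions made for illustration. The kernel weights would be learned in a real model, and the Transformer and Vision Transformer stages and the fusion step are omitted.

```python
from typing import List, Tuple

Stroke = List[Tuple[float, float]]  # one pen-down-to-pen-up run of (x, y) points


def point_features(strokes: List[Stroke]) -> List[List[float]]:
    """Flatten strokes into per-point rows [x, y, dx, dy, pen_up] (assumed layout)."""
    rows = []
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            last = i == len(stroke) - 1
            # Offset to the next point in the same stroke; zero at the stroke's end.
            nx, ny = (x, y) if last else stroke[i + 1]
            rows.append([x, y, nx - x, ny - y, 1.0 if last else 0.0])
    return rows


def conv1d(rows, kernels):
    """Valid 1-D convolution over time; kernels[c][k][f] would be learned in practice."""
    width = len(kernels[0])
    out = []
    for t in range(len(rows) - width + 1):
        window = rows[t:t + width]
        out.append([
            sum(w * v for w_row, v_row in zip(kernel, window)
                for w, v in zip(w_row, v_row))
            for kernel in kernels
        ])
    return out


def rasterize(strokes: List[Stroke], size: int = 32) -> List[List[int]]:
    """Draw non-empty strokes onto a size x size binary grid, discarding stroke order."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    x0, y0 = min(xs), min(ys)
    span = max(max(xs) - x0, max(ys) - y0) or 1.0  # avoid dividing by zero
    scale = (size - 1) / span
    grid = [[0] * size for _ in range(size)]
    for stroke in strokes:
        pts = [((x - x0) * scale, (y - y0) * scale) for x, y in stroke]
        # A single-point stroke is drawn as a zero-length segment.
        segments = list(zip(pts, pts[1:])) or [(pts[0], pts[0])]
        for (ax, ay), (bx, by) in segments:
            steps = int(max(abs(bx - ax), abs(by - ay))) + 1
            for i in range(steps + 1):
                t = i / steps
                grid[round(ay + t * (by - ay))][round(ax + t * (bx - ax))] = 1
    return grid
```

In this sketch, writing the same strokes in a different order changes the output of `point_features` but leaves the output of `rasterize` unchanged, which is one way to see why an image stream can complement a sequence stream for characters written with a non-standard stroke order.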
