Computer science
Artificial intelligence
Natural language processing
Speech recognition
Pattern recognition (psychology)
Abstract
This paper introduces YOLO-Handwritten, a handwritten text detection model for examination papers. It addresses limitations of current models: difficulty with varied writing styles, irregular handwriting forms, and imbalanced distributions of positive and negative samples. First, based on the characteristics of handwritten text images in examination papers, the model integrates deformable convolution (DCNv3) and enhanced aggregate feature fusion (iAFF) into the Backbone network to form a feature enhancement mechanism. This lets the model adaptively adjust sampling locations to handle irregular handwritten strokes and strengthens the representation of critical features via channel attention, which facilitates feature extraction across diverse handwriting styles. Second, a Biformer attention mechanism is incorporated into the Neck. Its Bi-Level Routing mechanism adaptively selects distinct attention pathways according to dense and sparse regions in the feature map, improving the model's ability to handle both sparse and dense text areas. Finally, VFLoss is adopted as the loss function to correct the imbalance between positive and negative samples. Experiments on a self-constructed dataset show that, compared with the baseline model, the method improves precision (P), recall (R), and mAP@0.5 by 4.3%, 3%, and 4.6%, respectively, offering a new approach to handwritten text detection in examination papers.
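To illustrate how a loss of this kind can counter the positive/negative imbalance, the sketch below assumes that VFLoss refers to the Varifocal Loss from VarifocalNet; the abstract does not spell out the formulation. The function name `varifocal_loss` and the defaults `alpha=0.75` and `gamma=2.0` are illustrative assumptions, not values reported by the authors.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-9):
    """Per-prediction Varifocal Loss (assumed meaning of "VFLoss").

    p: predicted IoU-aware classification score, in (0, 1)
    q: target score; IoU with the ground-truth box for a positive
       sample, 0 for a negative sample
    alpha, gamma: hypothetical defaults, not taken from the paper
    """
    # Clamp to keep the logarithms finite.
    p = min(max(p, eps), 1.0 - eps)
    if q > 0:
        # Positive sample: binary cross-entropy weighted by the target q,
        # so well-localized positives contribute more.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negative sample: down-weighted by alpha * p**gamma, so the many
    # easy negatives (small p) contribute very little.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


# Illustrative inputs (made up for this sketch, not data from the paper).
easy_negative = varifocal_loss(0.1, 0.0)
hard_negative = varifocal_loss(0.9, 0.0)
positive = varifocal_loss(0.8, 0.8)
```

In this formulation only negatives are down-weighted by the focusing term, which is why background regions, although far more numerous than handwritten text regions, do not dominate the total loss.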