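Computational science and engineering
Computer science
Elastic net regularization
Artificial intelligence
Pattern recognition (psychology)
Net (polyhedron)
Machine learning
Mathematics
Geometry
Feature selection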
Authors
Penghai Zhao,Weilan Wang,Guowei Zhang,Yuqi Lu
Identifier
DOI:10.1007/s00521-021-06512-7
Abstract
Binarization, one of the most popular research directions in computer vision, still faces challenges, especially for degraded historical Tibetan document images. Many U-Net-based binarization approaches can encounter a particular problem called pseudo-touching, which hampers subsequent procedures including text line segmentation, character segmentation, and recognition. To avoid these undesired pseudo-touching strokes and obtain optimal binary images, the present work employs several easy-to-use techniques, such as rescaling the input and output of the attention U-Net. Furthermore, we provide insights into the accelerated construction of the training set and discuss the effects of various configurations. The quantitative experimental results on our dataset show that upsampling the input image by a factor of two during the inference phase can alleviate pseudo-touching. The approach achieves an average P-FM of 97.73, which is two percentage points higher than the result of U-Net. The proposed approach can also cope with common challenges including non-uniform illumination, stains, and noise, and it delivers better performance across several metrics.
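The inference-time rescaling mentioned in the abstract might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the interpolation method or how the output is rescaled, so nearest-neighbour upsampling, 2x2 block averaging, and the 0.5 threshold are all assumptions, and `placeholder_model` is a hypothetical stand-in for a trained attention U-Net.

```python
import numpy as np

def upsample2x(img):
    """Nearest-neighbour 2x upsampling of a 2-D grayscale array (assumed method)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def downsample2x(prob):
    """Average each 2x2 block to return to the original resolution (assumed method)."""
    h, w = prob.shape
    return prob.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def binarize_with_rescaling(img, model, threshold=0.5):
    """Upsample the input by two, run the model, then rescale and threshold the output."""
    big = upsample2x(img.astype(np.float32))
    prob_big = model(big)            # per-pixel ink probability at 2x resolution
    prob = downsample2x(prob_big)    # back to the original image size
    return (prob >= threshold).astype(np.uint8)

def placeholder_model(x):
    """Hypothetical stand-in for a trained network: darker pixels get higher ink probability."""
    return 1.0 - x / 255.0

if __name__ == "__main__":
    page = np.array([[0, 255], [255, 0]], dtype=np.uint8)
    print(binarize_with_rescaling(page, placeholder_model))
```

In practice the placeholder would be replaced by the trained segmentation network, and the choice of interpolation for both rescaling steps would likely affect how well thin gaps between strokes are preserved.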