Mixed-Supervised Scene Text Detection With Expectation-Maximization Algorithm

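Computer science; Bounding overwatch; Artificial intelligence; Minimum bounding box; Pattern recognition (psychology); Expectation-maximization algorithm; Labeled data; Object detection; Polygon (computer graphics); Supervised learning; Maximization; Object (grammar); Detector; Machine learning; Image (mathematics); Maximum likelihood; Mathematics; Telecommunications; Mathematical optimization; Statistics; Frame (networking); Artificial neural network
Authors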
Mengbiao Zhao,Wei Feng,Fei Yin,Xu-Yao Zhang,Cheng‐Lin Liu
Source
Journal: IEEE Transactions on Image Processing [Institute of Electrical and Electronics Engineers]
Volume/issue: 31: 5513-5528  Citations: 19
Identifier
DOI:10.1109/tip.2022.3197987
Abstract

Scene text detection is an important and challenging task in computer vision. To detect arbitrarily shaped text, most existing methods require heavy data-labeling effort to produce polygon-level text region labels for supervised training. To reduce the labeling cost, we study mixed-supervised arbitrarily shaped text detection by combining various forms of weak supervision (e.g., image-level tags and coarse, loose, and tight bounding boxes), which are far easier to annotate. Existing weakly supervised learning methods (such as multiple instance learning) do not promote full object coverage. To approximate the performance of fully supervised detection, we propose an Expectation-Maximization (EM) based mixed-supervised learning framework that trains a scene text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data. The polygon-level labels are treated as latent variables and recovered from the weak labels by the EM algorithm. A new contour-based scene text detector is also proposed to facilitate the use of weak labels in our mixed-supervised learning framework. Extensive experiments on six scene text benchmarks show that (1) using only 10% strongly annotated data and 90% weakly annotated data, our method yields performance comparable to that of fully supervised methods, and (2) with 100% strongly annotated data, our method achieves state-of-the-art performance on five scene text benchmarks (CTW1500, Total-Text, ICDAR-ArT, MSRA-TD500, and C-SVT) and competitive results on the ICDAR2015 dataset. We will make our weakly annotated datasets publicly available.
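
The idea of treating fine-grained labels as latent variables and recovering them from weak labels with EM can be illustrated with a toy sketch. This is not the authors' detector or training procedure: it assumes a one-dimensional "pixel feature", a two-Gaussian model (background vs. text), a weak label in the form of a box outside which every pixel is known to be background, and a small set of strongly labeled pixels. All function and variable names (`fit_gaussians`, `posterior_text`, `em_mixed_supervision`, `in_box`, `strong_mask`) are hypothetical.

```python
import numpy as np

def fit_gaussians(x, resp):
    """M-step: weighted mean, variance and prior for background and text."""
    params = []
    for w in (1.0 - resp, resp):           # index 0 = background, 1 = text
        total = w.sum() + 1e-12            # guard against an empty class
        mean = (w * x).sum() / total
        var = (w * (x - mean) ** 2).sum() / total + 1e-6
        params.append((mean, var, total / len(x)))
    return params

def posterior_text(x, params):
    """E-step: P(text | x) under the current two-Gaussian model."""
    def log_joint(mean, var, prior):
        return (np.log(prior) - 0.5 * np.log(2 * np.pi * var)
                - 0.5 * (x - mean) ** 2 / var)
    diff = log_joint(*params[0]) - log_joint(*params[1])
    return 1.0 / (1.0 + np.exp(np.clip(diff, -50.0, 50.0)))

def em_mixed_supervision(x, strong_mask, strong_label, in_box, n_iter=20):
    """Recover latent per-pixel labels from a few strong labels plus a weak box.

    Assumed setup (toy): strong labels are trusted as-is, pixels outside the
    weak box are forced to background, and the rest are latent.
    """
    # Start undecided (0.5) inside the box, background (0.0) outside it.
    resp = np.where(strong_mask, strong_label, np.where(in_box, 0.5, 0.0))
    resp = resp.astype(float)
    for _ in range(n_iter):
        params = fit_gaussians(x, resp)             # M-step
        post = posterior_text(x, params)            # E-step
        post = np.where(in_box, post, 0.0)          # weak-label constraint
        resp = np.where(strong_mask, strong_label, post)
    return params, resp

# Synthetic data: 400 text pixels, 600 background; the loose box covers 600.
rng = np.random.default_rng(0)
n = 1000
truth = np.zeros(n, dtype=bool)
truth[:400] = True
x = np.where(truth, rng.normal(4.0, 1.0, n), rng.normal(0.0, 1.0, n))
in_box = np.zeros(n, dtype=bool)
in_box[:600] = True
strong_mask = rng.random(n) < 0.1                   # roughly 10% strong labels
params, resp = em_mixed_supervision(x, strong_mask, truth.astype(float), in_box)
pred = resp > 0.5
print("label recovery accuracy:", (pred == truth).mean())
```

In this sketch the weak box only rules out text outside it; the few strong labels anchor the two classes, and the EM iterations fill in the remaining labels inside the box. The paper's framework operates on polygon-level text regions with a contour-based detector, which this toy example does not attempt to reproduce.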