Research progress and challenges in real-time semantic segmentation for deep learning

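Segmentation, Computer science, Artificial intelligence, Scale-space segmentation, Deep learning, Segmentation-based object categorization, Image segmentation, Supervised learning, Pattern recognition (psychology), Machine learning, Computer vision, Artificial neural network
Authors
Zhuo Wang, Shaojun Qu
Source
Journal: Journal of Image and Graphics [University of Portsmouth]
Volume (issue): 29 (5): 1188-1220
Identifier
DOI:10.11834/jig.230605
Abstract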

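Semantic segmentation is one of the important research directions in computer vision and has a wide range of applications. Its goal is to classify an input image at the pixel level according to predefined categories. Real-time semantic segmentation adds a speed requirement on top of general semantic segmentation and is widely used in areas such as autonomous driving, medical image analysis, video surveillance, and aerial imagery. It requires a segmentation method to be not only accurate but also fast. With the rapid development of deep learning and neural networks, real-time semantic segmentation has produced a number of research results. Building on previous work, this paper systematically summarizes real-time semantic segmentation algorithms based on deep learning, including Transformer-based and pruning-based methods, and gives a comprehensive introduction to the applications of real-time semantic segmentation in various fields. It first introduces the concept of real-time semantic segmentation and then, according to the quantity and quality of the labels, divides existing deep-learning-based methods into three categories: strongly supervised learning, weakly supervised learning, and unsupervised learning. Within this classification, it analyzes the strengths and weaknesses of the most representative methods in each category and compares them from several angles. It then introduces the datasets and evaluation metrics commonly used for real-time semantic segmentation, compares the experimental results of the algorithms on each dataset, and describes current application scenarios. Finally, it discusses the challenges facing deep-learning-based real-time semantic segmentation and offers an outlook on future directions worth studying, to help researchers address the open problems.

Semantic segmentation is an important research direction in computer vision and is widely used. Its purpose is to classify the input image at the pixel level according to predefined categories. Real-time semantic segmentation, a subfield of semantic segmentation, adds a speed requirement on top of general semantic segmentation and is widely used in fields such as autonomous driving, medical image analysis, video surveillance, and aerial imagery. A segmentation method should achieve not only high accuracy but also high speed (specifically, a processing speed of 30 frames per second). With the rapid development of deep learning and neural networks, real-time semantic segmentation has also produced a number of research results. Most previous reviews discuss semantic segmentation in general, and few cover real-time semantic segmentation methods. In this paper, we systematically summarize real-time semantic segmentation algorithms based on deep learning, building on the existing work of previous researchers. We first introduce the concept of real-time semantic segmentation. Then, according to the number and quality of the labels used in training, we divide the existing deep-learning-based real-time semantic segmentation methods into three classes: strongly supervised learning, weakly supervised learning, and unsupervised learning.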
Strongly supervised learning methods are categorized from three perspectives: improving accuracy, improving speed, and other methods. Accuracy improvement methods are further divided into subcategories according to network structure and feature fusion method. By network structure, real-time semantic segmentation methods fall into encoder-decoder, two-branch, and multibranch structures. Representative encoder-decoder networks are the fully convolutional network (FCN) and UNet, the two-branch structure is represented by the BiSeNet series, and the multibranch structure includes ICNet and DFANet. By feature fusion method, real-time semantic segmentation methods fall into multiscale feature fusion and attention mechanisms. According to how features are sampled during multiscale feature fusion, this study divides multiscale feature fusion into atrous spatial pyramid pooling and ordinary pyramid pooling. The attention mechanism is further divided into self-attention, channel attention, and spatial attention according to how the attention vector is computed. Methods for improving speed are analyzed from two perspectives: improved convolutional blocks and lightweight networks. Improved convolutional blocks comprise separable convolution (itself divided into depthwise separable convolution and spatially separable convolution), grouped convolution, and atrous convolution. Among the other strongly supervised methods, we specifically add knowledge distillation, Transformer-based methods, and pruning, which are rarely covered elsewhere in the literature. Given the large number of real-time semantic segmentation methods based on strongly supervised learning, we also compare the strengths and weaknesses of all the methods mentioned.
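To make the speed-oriented convolution variants listed above more concrete, the following is a minimal sketch of how their weight counts and receptive fields compare. It is not taken from the paper: the function names are hypothetical, biases are ignored, and the formulas are the standard textbook ones for square kernels.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution (bias ignored)."""
    return c_in * k * k + c_in * c_out


def grouped_conv_params(c_in, c_out, k, groups):
    """Weights in a grouped k x k convolution: each group connects
    c_in / groups input channels to c_out / groups output channels."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * (c_out // groups) * k * k * groups


def atrous_kernel_extent(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation.
    The number of weights stays k * k; only the coverage grows."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Illustrative layer sizes (assumed, not from the paper).
    print(conv_params(64, 128, 3))
    print(depthwise_separable_params(64, 128, 3))
    print(grouped_conv_params(64, 128, 3, groups=4))
    print(atrous_kernel_extent(3, dilation=2))
```

Under these assumptions, the depthwise separable and grouped variants use far fewer weights than the standard layer of the same shape, while atrous convolution enlarges the area a kernel covers without adding weights.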
Real-time semantic segmentation based on weakly supervised learning is classified into methods based on image-level labels, point labels, object bounding-box labels, and object underline (scribble) labels. The concept of unsupervised learning is introduced, and the unsupervised semantic segmentation methods commonly used at present are described, including methods that introduce the generalized domain adaptation problem and methods that introduce an unsupervised pre-adaptation task. Subsequently, the datasets and evaluation metrics commonly used in real-time semantic segmentation are introduced. In addition to the street scene datasets commonly used in autonomous driving, this study adds medical image datasets. For the evaluation metrics, this study gives a detailed introduction to the accuracy and speed measures, and then compares in tables the experimental results reported so far for the algorithms on these datasets, giving a picture of the latest research progress in the field. The application scenarios of real-time semantic segmentation are then described in detail. In autonomous driving, real-time semantic segmentation can segment road scene images in a short time to help identify roads, traffic signs, pedestrians, vehicles, and other objects. By segmenting medical images at the pixel level, it can help doctors identify and localize lesion areas accurately. In natural disaster monitoring and emergency rescue, it can quickly identify aircraft and recognize disaster areas in aerial images. Real-time segmentation of scenes and objects in surveillance videos can provide accurate and intelligent data for surveillance systems.
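The accuracy and speed measures mentioned above can be illustrated with a short sketch. The abstract does not name the specific metrics, so this assumes the two that are most common in the segmentation literature: mean intersection over union (mIoU) for accuracy and frames per second (FPS) for speed. The function names and the toy label arrays are hypothetical.

```python
import time


def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes.

    pred and target are flat sequences of per-pixel class indices.
    Classes absent from both pred and target are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def frames_per_second(segment_fn, frames):
    """Average throughput of segment_fn over the given frames."""
    start = time.perf_counter()
    for frame in frames:
        segment_fn(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Toy four-pixel example with two classes (assumed data).
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    print(mean_iou(pred, target, num_classes=2))

    # Stand-in "model" that returns its input unchanged.
    fps = frames_per_second(lambda frame: frame, [pred] * 100)
    print(fps >= 30)  # the 30 frames threshold cited in the abstract
```

In practice, speed figures also depend on input resolution and hardware, so FPS values are only comparable when those conditions match.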
Then, based on the specific application scenarios of real-time semantic segmentation and the problems encountered at this stage, this study identifies the following challenges: 1) segmentation on mobile devices, where large-scale computation is difficult on low-storage hardware; 2) reducing the dependence of efficient networks on hardware; 3) the accuracy of current real-time semantic segmentation models, which still falls short of the standard required for autonomous driving; and 4) the lack of scene data for medical images and 3D point clouds. Finally, this study gives an outlook on future directions of real-time semantic segmentation that are worth researching, e.g., occlusion segmentation, real-time semantic segmentation of small targets, adaptive learning models, cross-modal joint learning, data-centered real-time semantic segmentation, and small-sample real-time semantic segmentation.
