SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Keywords: Computer Science; Artificial Intelligence; Encoder; Upsampling; Feature; Segmentation; Convolutional Neural Network; Pattern Recognition; Pooling; Benchmark; Network Architecture; Deep Learning; Pixel; Computer Vision; Image
Authors
Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla
Published in
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE)
Volume 39, Issue 12, pp. 2481-2495. Cited by: 1520
Identifier
DOI: 10.1109/TPAMI.2016.2644615
Abstract

We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network, followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN [2] and also with the well known DeepLab-LargeFOV [3] and DeconvNet [4] architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than other competing architectures and can be trained end-to-end using stochastic gradient descent. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. These quantitative assessments show that SegNet provides good performance with competitive inference time and the most efficient inference memory-wise as compared to other architectures. We also provide a Caffe implementation of SegNet and a web demo at http://mi.eng.cam.ac.uk/projects/segnet.
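The pooling-index upsampling described in the abstract can be illustrated with a minimal PyTorch-style sketch. This is an assumption-laden illustration, not the authors' released Caffe implementation: the channel sizes and the single encoder/decoder pair are illustrative rather than the full 13-layer VGG16 layout. The key point is that the encoder's max-pooling records the locations of the maxima, and the decoder reuses those indices to place values back at the same spatial positions before trainable convolutions densify the result.

import torch
import torch.nn as nn

class SegNetBlock(nn.Module):
    # Minimal encoder/decoder pair sketching SegNet-style pooling-index upsampling.
    # Hypothetical layer sizes; not the paper's architecture or code.
    def __init__(self, in_ch=3, mid_ch=64):
        super().__init__()
        # Encoder: conv + batch norm + ReLU, then max-pooling that records indices.
        self.enc_conv = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
        # Decoder: unpooling with the stored indices yields a sparse map,
        # which trainable convolutions then turn into a dense feature map.
        self.unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)
        self.dec_conv = nn.Sequential(
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x = self.enc_conv(x)
        x, indices = self.pool(x)      # max-pooling also returns the argmax locations
        x = self.unpool(x, indices)    # non-linear upsampling with no learned weights
        x = self.dec_conv(x)           # densify the sparse upsampled map
        return x

if __name__ == "__main__":
    block = SegNetBlock()
    out = block(torch.randn(1, 3, 64, 64))
    print(out.shape)  # torch.Size([1, 64, 64, 64])

Because the unpooling step only reuses stored indices, it contributes no learnable parameters, which is consistent with the abstract's claim that SegNet avoids learning to upsample and therefore has fewer trainable parameters than decoders that learn deconvolution filters.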