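Computer science
Segmentation
k-nearest neighbors algorithm
Point cloud
Cloud computing
Transformer
Block (permutation group theory)
Data mining
Pattern recognition (psychology)
Machine learning
Artificial intelligence
Mathematics
Engineering
Operating system
Geometry
Voltage
Electrical engineering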
Authors
Ziyin Zeng, Huan Qiu, Jian Zhou, Zhen Dong, Jinsheng Xiao, Bijun Li
Identifier
DOI: 10.1109/tgrs.2024.3407761
Abstract
Given the prominence of 3D sensors in recent years, 3D point clouds merit further investigation for environment perception and scene understanding. Learning accurate local and global contexts in point clouds is pivotal for semantic segmentation. Neighbor aggregation and Transformers have achieved notable success in local and global perception, respectively, in point cloud analysis. Nevertheless, studying each independently is far from the optimal solution for comprehensive feature learning. To address this, we take a novel step towards investigating and integrating the structures of neighbor aggregation and Transformers. In this paper, we introduce Point Neighbor Aggregation with Transformer (PointNAT), a conceptually straightforward and effective approach that aims to enhance the performance of 3D point cloud semantic segmentation. PointNAT consists of a Neighbor Aggregation Block (NAB) for local perception, a Point Transformer Block (PTB) for global modeling, and a Hybrid Block to connect NABs and PTBs. NABs effectively learn complex local features at varying scales through an improved neighbor aggregation operation and a multi-head mechanism. PTBs efficiently perform global attention using a small set of learnable key points. Hybrid Blocks serve as high- and low-frequency signal hybridizers, merging the strengths of these two blocks by adaptively assigning hybrid weights to local and global contexts. We have evaluated the performance of PointNAT against state-of-the-art networks on several benchmarks, including S3DIS, Toronto3D, and SensatUrban. PointNAT achieves mIoU scores of 77.8%, 84.7%, and 65.2% on these three datasets, respectively. Furthermore, it outperforms the baseline approach PointNeXt by 3.0%, 1.3%, and 4.2%, respectively, while utilizing only 59.9% of the parameters and 15.2% of the FLOPs.
The results demonstrate PointNAT's superior ability to accurately segment large-scale 3D point cloud scenes, emphasizing its potential to advance environment perception and scene understanding. Our code is available at https://github.com/zeng-ziyin/PointNAT.
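The two ideas the abstract describes for the PTB and the Hybrid Block can be illustrated with a minimal sketch. This is not the authors' implementation (the linked repository is the reference for that). The function names `key_point_attention` and `hybrid_blend`, the scaled dot-product scoring, and the per-point scalar gate are all assumptions made for illustration. The sketch only shows why attending to M key points costs on the order of N x M operations rather than N x N, and how a weight in [0, 1] can mix local and global features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def key_point_attention(point_feats, key_feats):
    """Let each of N points attend to M key-point features.

    Hypothetical simplification: the key-point features act as both keys
    and values, and scoring is a scaled dot product. In a trained model
    the key points would be learnable parameters; here they are inputs.
    """
    dim = len(key_feats[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for query in point_feats:
        # N * M score computations in total, instead of N * N.
        scores = [scale * sum(a * b for a, b in zip(query, key))
                  for key in key_feats]
        weights = softmax(scores)
        out.append([sum(w * key[d] for w, key in zip(weights, key_feats))
                    for d in range(dim)])
    return out


def hybrid_blend(local_feats, global_feats, gates):
    """Mix local and global features with one gate in [0, 1] per point.

    Assumption: a gate of 1.0 keeps only the local feature and 0.0 keeps
    only the global one. How the gate is predicted is not modeled here.
    """
    return [[g * lo + (1.0 - g) * gl for lo, gl in zip(lf, gf)]
            for lf, gf, g in zip(local_feats, global_feats, gates)]
```

For example, calling `key_point_attention` with 3 point features and 2 key-point features returns 3 output features, each a convex combination of the 2 key-point features.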