Sensor fusion
Fusion
Slip (aerodynamics)
Computer science
Artificial intelligence
Transformer
Computer vision
Materials science
Engineering
Electrical engineering
Voltage
Philosophy
Linguistics
Aerospace engineering
Authors
Mingyu Shangguan, Yang Li
Identifier
DOI:10.1109/icicm59499.2023.10365811
Abstract
This paper presents an approach to enhance the stability of manipulator grasping tasks using a Swin Transformer V2 network model. The focus is on fusing single-mode visual and tactile data from a GelSight sensor into multi-modal information to improve the manipulator's perception in complex environments. The Swin Transformer V2 model is introduced for its strong performance in image understanding. The paper explains how unimodal visual and tactile data are fed to the network for feature extraction, followed by a fusion strategy that effectively combines the different modalities. The proposed method is applied to a manipulator grasp-slip detection task, where leveraging multi-modal perception of the environment improves both stability and accuracy. Experimental validation and comparisons demonstrate the superiority of the approach, showcasing its potential for enhancing grasping stability.
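The abstract describes a two-branch pipeline: each modality (vision, tactile) is encoded separately, the features are fused, and a classifier decides between a stable grasp and slip. The sketch below illustrates that late-fusion-by-concatenation pattern in plain NumPy. It is a minimal illustration only: the random linear projections stand in for the paper's Swin Transformer V2 backbones, and all dimensions, variable names, and the two-class head are assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(modality_input, proj):
    # Placeholder encoder: a fixed linear projection plus tanh, standing in
    # for a Swin Transformer V2 backbone (an assumption for illustration).
    return np.tanh(modality_input @ proj)

# Simulated unimodal inputs: a flattened camera patch and a flattened
# GelSight tactile image (hypothetical 64-dim inputs).
visual = rng.standard_normal((1, 64))
tactile = rng.standard_normal((1, 64))

# Independent per-modality projections to a shared feature size.
proj_v = rng.standard_normal((64, 32))
proj_t = rng.standard_normal((64, 32))

f_v = extract_features(visual, proj_v)    # visual features, shape (1, 32)
f_t = extract_features(tactile, proj_t)   # tactile features, shape (1, 32)

# Late fusion by concatenation, then a linear head for slip detection.
fused = np.concatenate([f_v, f_t], axis=1)   # shape (1, 64)
head = rng.standard_normal((64, 2))
logits = fused @ head
pred = int(np.argmax(logits))                # 0 = stable grasp, 1 = slip
print(fused.shape, pred)
```

In a trained system the projections and head would be learned end to end; the point here is only the data flow: two unimodal feature extractors feeding one fused representation for the grasp-slip decision.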