Artificial Intelligence
Computer Vision
Computer Science
Machine Vision
Robotics
Mobile Robot
Robot Vision
Authors
Hu Cheng, Yingying Wang, Max Q.-H. Meng
Identifier
DOI:10.1109/jsen.2022.3163730
Abstract
Grasping is critical for intelligent robots to accomplish sophisticated tasks. Even with multimodal sensor fusion, accurately and reliably estimating grasp poses for complex-shaped objects remains a challenge. In this paper, we design a vision-based grasping platform for a more general case, that is, grasping a variety of objects with a simple parallel gripper and a grasp detection model that consumes either RGB or depth sensing. Focusing on grasp pose estimation, we propose a deep grasp detector that uses a densely connected Feature Pyramid Network (FPN) feature extractor and multiple two-stage detection units to produce dense grasp pose predictions. Specifically, in the feature extractor, fusing feature maps from different layers increases both the model's capacity to detect grasp areas of various sizes and the accuracy of the regressed grasp positions. In each two-stage detection unit, the first stage generates horizontal candidate grasp areas, while the second stage refines them into rotated grasp poses. We train and validate our grasp pose estimation algorithm on the Cornell Grasp Dataset and the Jacquard Dataset, where the model achieves detection accuracies of 93.3% and 89.6%, respectively. We further design real-world grasp experiments to verify the effectiveness of our vision-based robotic grasping system. Real-scenario trials confirm that the system can grasp unseen objects, achieving robust and accurate grasp pose detection and gripper opening-width measurement from depth sensing alone.
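The abstract describes fusing feature maps from different FPN layers with dense connections. The paper's exact architecture is not given here, so the following is only a minimal PyTorch sketch of one plausible densely connected FPN fusion: every output level sums the lateral projections of all coarser levels rather than only the adjacent one, as in a vanilla top-down FPN. Channel counts, layer names, and the nearest-neighbor upsampling are illustrative assumptions.

```python
# Hypothetical densely connected FPN fusion (assumed design, not the
# paper's published code). Backbone channel widths follow a ResNet-50.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseFPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 lateral convs project each backbone level to a common width.
        self.laterals = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )
        # 3x3 smoothing convs applied after fusion at each level.
        self.smooth = nn.ModuleList(
            [nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
             for _ in in_channels]
        )

    def forward(self, feats):
        # feats: backbone maps ordered from highest to lowest resolution.
        laterals = [lat(f) for lat, f in zip(self.laterals, feats)]
        outs = []
        for i, base in enumerate(laterals):
            fused = base
            # Dense connection: add *all* coarser levels, upsampled to
            # this level's resolution, not just the next coarser one.
            for j in range(i + 1, len(laterals)):
                fused = fused + F.interpolate(
                    laterals[j], size=base.shape[-2:], mode="nearest"
                )
            outs.append(self.smooth[i](fused))
        return outs

if __name__ == "__main__":
    # Fake backbone pyramid for a 320x320 input.
    feats = [torch.randn(1, c, s, s)
             for c, s in zip((256, 512, 1024, 2048), (80, 40, 20, 10))]
    for level in DenseFPN()(feats):
        print(level.shape)  # every level is projected to 256 channels
```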
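The abstract also states that each detection unit's second stage refines horizontal candidates into rotated grasp poses. The delta parameterization below follows common rotated-box regression practice and is an assumption for illustration; the helper name and the mapping of the angle offset to [-pi/2, pi/2] are not from the paper.

```python
# Sketch of second-stage refinement: a horizontal proposal plus
# predicted offsets yields a rotated grasp pose (x, y, w, h, theta).
import math
import torch

def refine_to_rotated_grasp(proposal, deltas):
    """proposal: (4,) horizontal box [x1, y1, x2, y2];
    deltas: (5,) predicted offsets [dx, dy, dw, dh, dtheta]."""
    x1, y1, x2, y2 = proposal
    pw, ph = x2 - x1, y2 - y1              # proposal extents
    px, py = x1 + 0.5 * pw, y1 + 0.5 * ph  # proposal center
    dx, dy, dw, dh, dtheta = deltas
    gx = px + pw * dx                      # shift center by scaled offsets
    gy = py + ph * dy
    gw = pw * torch.exp(dw)                # rescale width/height in log space
    gh = ph * torch.exp(dh)
    theta = dtheta * math.pi / 2           # dtheta in [-1, 1] -> angle in [-pi/2, pi/2]
    return torch.stack([gx, gy, gw, gh, theta])

# Example: a 60x20 horizontal candidate refined and rotated by ~30 degrees.
grasp = refine_to_rotated_grasp(
    torch.tensor([100.0, 90.0, 160.0, 110.0]),
    torch.tensor([0.05, -0.02, 0.1, 0.0, 1.0 / 3.0]),
)
print(grasp)  # refined (x, y, w, h, theta)
```

For a parallel gripper, the refined width `gw` is what the abstract's gripper opening-width measurement would correspond to in this representation.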