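Computer science
Closed captioning
Computer vision
Artificial intelligence
Encoder
Object (grammar)
Position (finance)
Visualization
Object detection
Image (mathematics)
Pattern recognition (psychology)
Finance
Operating system
Economics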
Authors
Yaochen Li,Chuan Wu,Ling Li,Yuehu Liu,Jihua Zhu
Source
Journal: IEEE Transactions on Intelligent Transportation Systems
[Institute of Electrical and Electronics Engineers]
Date: 2022-07-01
Volume/Issue: 23 (7): 7805-7816
Citations: 5
Identifier
DOI:10.1109/tits.2021.3072970
Abstract
In this study on traffic scene modeling, we propose an image captioning network that incorporates element attention into an encoder-decoder mechanism to generate more reasonable scene captions. A visual relationship detection network is also developed to detect the relative positions of object pairs. First, the traffic scene elements are detected and segmented according to their clustered locations. Then, the image captioning network is applied to generate a description of each traffic scene element, and the visual relationship detection network is used to detect the position relations of all object pairs in each subregion. The static and dynamic traffic elements are then selected and organized to construct a 3D model according to the captions and the position relations. The reconstructed 3D traffic scenes can be used for offline testing of unmanned vehicles. Evaluations and comparisons on the TSD-max, KITTI and Microsoft COCO datasets demonstrate the effectiveness of the proposed framework.
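To make the idea of "position relations of object pairs" concrete, the following is a minimal sketch that derives a coarse relation for every pair of detected boxes from their centers. It is a hand-written geometric heuristic for illustration only, not the learned visual relationship detection network described in the abstract; the `Box` type, the relation vocabulary ("left of", "right of", "above", "below") and both function names are assumptions introduced here.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical axis-aligned box in image coordinates (y grows downward)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def relative_position(a: Box, b: Box) -> str:
    """Return a coarse position of `a` relative to `b`, based on box centers."""
    ax, ay = (a.x_min + a.x_max) / 2, (a.y_min + a.y_max) / 2
    bx, by = (b.x_min + b.x_max) / 2, (b.y_min + b.y_max) / 2
    dx, dy = ax - bx, ay - by
    # Use the axis with the larger center offset; ties go to the horizontal axis.
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"


def pairwise_relations(boxes: List[Box]) -> List[Tuple[str, str, str]]:
    """List (subject, relation, object) triples for every unordered pair."""
    return [(a.label, relative_position(a, b), b.label)
            for a, b in combinations(boxes, 2)]


if __name__ == "__main__":
    scene = [
        Box("car", 0, 50, 10, 60),
        Box("pedestrian", 50, 50, 60, 60),
        Box("sign", 0, 0, 10, 10),
    ]
    for triple in pairwise_relations(scene):
        print(triple)
```

A learned detector would replace `relative_position` with a model that also uses appearance features; the pairwise enumeration over objects in a subregion is the only part of this sketch taken from the abstract's description.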