Hyperrectangle Embedding for Debiased 3D Scene Graph Prediction From RGB Sequences

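Artificial intelligence; Computer science; RGB color model; Embedding; Pattern recognition (psychology); Computer vision; Graph; Theoretical computer science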

Authors

Mingtao Feng, Chan Kit Yan, Zijie Wu, Weisheng Dong, Yaonan Wang, Ajmal Mian

Source

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Date: 2025-04-11 Volume (Issue): 47 (8): 6410-6426 Citations: 38

Links

nih.gov, edu.au, doi.org

Identifiers

DOI: 10.1109/tpami.2025.3560090

Abstract

3D scene graphs have emerged as a powerful high-level representation of the environment and are regarded as a prerequisite for long-term autonomous robotic operations. A practical research problem is to predict the 3D scene graph from sequentially captured data. However, existing methods neglect the polysemy of semantic roles: coarse feature vectors are insufficient to represent an entity under different relationship semantics, which severely limits their capability to predict relationships. We propose to tackle this challenge with a novel representation, the hyperrectangle embedding, which represents each entity using distinctive geometry for more effective scene understanding, rather than learning vector-based features with blindly increasing dimensions. By representing an entity with two affine-transformed embeddings, one for its role as subject and one for its role as object, each characterized by a separate learnable transformation, we capture the polysemy of semantic roles. The intersections of affine-transformed hyperrectangle embeddings represent the bidirectional relationship between two entities. We identify bias and reliability as two challenges impeding the model learning process. In response to the bias that arises from long-tailed distributions in the data, we propose a history-guided debiasing strategy that uses a confusion history block composed of previous hyperrectangle embeddings. This strategy mitigates inherent biases by extracting pertinent information and facilitating knowledge transfer from dominant categories to rare ones. To enhance the reliability of predictions, we introduce predictive uncertainty into the 3D scene graph prediction task and develop a post-hoc reliability enhancement strategy that identifies potentially unreliable predictions and subsequently improves the model's predictive accuracy. Extensive experiments on the 3DSSG dataset show the effectiveness of the proposed method on this challenging task, where it outperforms existing state-of-the-art methods.
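
The geometric idea in the abstract (one box per entity, a separate affine transformation per semantic role, and box intersections as relationship representations) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the parameterization, so the axis-aligned center/half-width form, the per-dimension scale-and-shift transforms, the names `Box`, `affine`, and `intersect`, and all numeric values are assumptions chosen for illustration. In the paper the transformations are learned; here they are fixed constants.

```python
import numpy as np


class Box:
    """Axis-aligned hyperrectangle stored as center and half-width (assumed form)."""

    def __init__(self, center, half_width):
        self.center = np.asarray(center, dtype=float)
        self.half_width = np.abs(np.asarray(half_width, dtype=float))

    @property
    def lower(self):
        return self.center - self.half_width

    @property
    def upper(self):
        return self.center + self.half_width


def affine(box, scale, shift):
    """Per-dimension scale and shift; keeps the box axis-aligned (an assumption)."""
    scale = np.asarray(scale, dtype=float)
    shift = np.asarray(shift, dtype=float)
    return Box(box.center * scale + shift, box.half_width * np.abs(scale))


def intersect(a, b):
    """Return the overlap of two boxes, or None if they are disjoint."""
    lower = np.maximum(a.lower, b.lower)
    upper = np.minimum(a.upper, b.upper)
    if np.any(lower > upper):
        return None
    return Box((lower + upper) / 2, (upper - lower) / 2)


# Hypothetical role transforms; learned in the paper, fixed here.
SUBJECT = dict(scale=[1.0, 1.0], shift=[0.5, 0.0])
OBJECT = dict(scale=[0.5, 0.5], shift=[0.0, 0.0])

# Two toy entity embeddings in a 2D embedding space.
chair = Box(center=[0.0, 0.0], half_width=[1.0, 1.0])
floor = Box(center=[1.0, 0.0], half_width=[2.0, 1.0])

# Direction 1: chair as subject, floor as object.
forward = intersect(affine(chair, **SUBJECT), affine(floor, **OBJECT))
# Direction 2: floor as subject, chair as object.
backward = intersect(affine(floor, **SUBJECT), affine(chair, **OBJECT))

print("chair -> floor:", forward.lower, forward.upper)
print("floor -> chair:", backward.lower, backward.upper)
```

Because each entity is transformed differently depending on whether it plays the subject or the object, the two intersections differ, which is how a single pair of entities can carry a distinct representation for each direction of the relationship in this toy setting.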
