Keywords
Scene graph
Computer science
Relation (database)
Spatial relation
Artificial intelligence
Graphics
Object (grammar)
Context (archaeology)
Natural language processing
Human–computer interaction
Authors
Xiaokang Yang, Fei Hu, Long Ye, Zhiming Chang, Jianda Li
Source
Journal: Displays [Elsevier BV]
Date: 2022-02-01
Article number: 102168
Identifier
DOI: 10.1016/j.displa.2022.102168
Abstract
When describing a complicated scene, natural language usually carries meanings beyond the literal description, implied by the context of objects and their relations. Synthesizing a 3D scene from a natural language description can therefore be regarded as an ill-posed problem. To address this challenge, we build a novel system named Text to Scene (T2S) that uses relation learning for language-driven scene synthesis. We propose a novel graph-based contextual completion method, Contextual ConvE (CConvE), to enrich the 3D indoor scene, and we visualize the graph by arranging 3D models under an object location protocol. These components are integrated into the T2S system, which synthesizes a 3D scene from text. Given a text, T2S organizes the semantic content it contains into a graph template, completes the graph with CConvE, and visualizes the graph by retrieving and arranging 3D models under the protocol. CConvE is a convolutional neural network that infers object categories and spatial relations through contextual message passing. Experimental results show the competitive performance of CConvE compared with state-of-the-art approaches and demonstrate that introducing a semantic object-relation learning method into the 3D scene synthesis task makes the virtual visualized results consistent with real life.
• The first to introduce link-prediction theory to the scene synthesis task.
• A novel graph-based relation learning method, Contextual ConvE, is proposed to complete elements in the scene.
• A Text to Scene system with semantic completion is established for cross-modal conversion from text description to 3D indoor scene.
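The name Contextual ConvE suggests the method builds on the ConvE link-prediction model, which scores (subject, relation, object) triples by convolving 2D reshapings of the subject and relation embeddings and taking a dot product against candidate object embeddings. Below is a minimal, hypothetical sketch of that base scorer, not the paper's CConvE architecture; the class name, layer sizes, and embedding dimensions are illustrative assumptions.

```python
# Hypothetical ConvE-style link-prediction scorer (illustrative only;
# not the paper's CConvE). Entities stand in for object categories and
# relations for spatial predicates in a scene graph.
import torch
import torch.nn as nn

class ConvEScorer(nn.Module):
    def __init__(self, n_entities, n_relations, dim=200, h=10, w=20):
        super().__init__()
        assert h * w == dim  # embedding must reshape into an h x w map
        self.h, self.w = h, w
        self.ent = nn.Embedding(n_entities, dim)   # object-category embeddings
        self.rel = nn.Embedding(n_relations, dim)  # spatial-relation embeddings
        self.conv = nn.Conv2d(1, 32, kernel_size=3)
        # Stacking subject and relation maps gives a (2h x w) input,
        # so the conv output is 32 x (2h-2) x (w-2).
        self.fc = nn.Linear(32 * (2 * h - 2) * (w - 2), dim)

    def forward(self, subj, rel):
        # Reshape embeddings into 2D "images", stack, and convolve
        # so subject and relation features interact locally.
        s = self.ent(subj).view(-1, 1, self.h, self.w)
        r = self.rel(rel).view(-1, 1, self.h, self.w)
        x = torch.relu(self.conv(torch.cat([s, r], dim=2)))
        x = torch.relu(self.fc(x.flatten(1)))
        # Score every candidate object via dot product with its embedding.
        return x @ self.ent.weight.t()

scorer = ConvEScorer(n_entities=100, n_relations=10)
scores = scorer(torch.tensor([3]), torch.tensor([1]))  # shape (1, 100)
print(scores.argmax(dim=1))  # index of the most plausible completing object
```

In a scene-completion setting, the highest-scoring candidates indicate which object categories plausibly complete a partial scene graph under a given spatial relation, which is the kind of inference the abstract attributes to CConvE.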