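Drone (UAV)
Computer science
Visual inspection
Artificial intelligence
Computer vision
Computer security
Real-time computing
Engineering
Aeronautics
Embedded system
Genetics
Biology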
Authors
Jiucai Liu,Haijiang Li,Chengzhang Chai,Kehong Chen,Dalei Wang
Identifier
DOI:10.1016/j.aei.2025.103643
Abstract
The development of large language models (LLMs) holds the potential to significantly advance automation in the Architecture, Engineering, and Construction (AEC) industry. This paper explores the application of LLM-powered agent systems in drone-based visual inspection, focusing on three core aspects. First, a multi-agent framework is proposed, composed of five specialized sub-agents (Router, PathPlanner, Controller, Perceptioner, and Retriever) that collaboratively handle inspection subtasks such as 3D spatial reasoning, path planning, and UAV control. Second, a novel image-based pipeline is introduced to generate multi-criteria semantic point clouds and abstract them into 3D Scene Graphs (3DSGs), enabling spatial-semantic reasoning aligned with human intent. These 3DSGs act as both knowledge storage and reasoning engines. Third, the system is evaluated through simulations and laboratory experiments, demonstrating its ability to automate inspection workflows and provide a foundation for extensible AI-agent systems. The results highlight the advantages of LLM-based agents in flexible task delegation and high-level decision-making. Current implementations rely on general-purpose LLMs accessed via commercial APIs, which introduces some latency and adaptation gaps; these limitations also point to promising directions for future optimization and domain-specific enhancement. Overall, the study presents an early yet promising step toward collaborative human–machine intelligence in infrastructure inspection, where autonomous agents augment human decision-making through interactive, context-aware support.
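As a rough illustration of how a 3D scene graph can serve as both knowledge storage and a reasoning engine, the sketch below stores semantic objects as nodes and spatial relations as edges, then answers two simple queries. It is not the authors' implementation: the class names, fields, relation names, and example objects (`SceneNode`, `SceneGraph`, `"on"`, `pier_1`, and so on) are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import dist


@dataclass
class SceneNode:
    """One object in the scene graph (hypothetical schema)."""
    name: str
    label: str                             # semantic class, e.g. "crack"
    centroid: tuple[float, float, float]   # position in a shared world frame


@dataclass
class SceneGraph:
    """Nodes keyed by name, plus (subject, relation, object) edges."""
    nodes: dict[str, SceneNode] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node: SceneNode) -> None:
        self.nodes[node.name] = node

    def relate(self, subject: str, relation: str, obj: str) -> None:
        self.edges.append((subject, relation, obj))

    def related(self, relation: str, obj: str) -> list[str]:
        """Names of all subjects that stand in `relation` to `obj`."""
        return [s for s, r, o in self.edges if r == relation and o == obj]

    def nearest(self, label: str, point: tuple[float, float, float]) -> SceneNode | None:
        """Closest node with the given semantic label, or None if absent."""
        candidates = [n for n in self.nodes.values() if n.label == label]
        return min(candidates, key=lambda n: dist(n.centroid, point), default=None)


# Hypothetical example scene: one pier with a crack on it, and a distant crack.
graph = SceneGraph()
graph.add_node(SceneNode("pier_1", "pier", (0.0, 0.0, 5.0)))
graph.add_node(SceneNode("crack_a", "crack", (0.2, 0.0, 4.0)))
graph.add_node(SceneNode("crack_b", "crack", (10.0, 0.0, 2.0)))
graph.relate("crack_a", "on", "pier_1")

on_pier = graph.related("on", "pier_1")
closest_crack = graph.nearest("crack", (0.0, 0.0, 0.0))
```

In a system of the kind the abstract describes, query results like these would presumably be handed to a language-model agent as structured context; that coupling is an assumption here and is not shown.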