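Situation awareness
Computer science
Remote sensing
Artificial intelligence
Computer vision
Human-computer interaction
Aeronautics
Engineering
Geography
Aerospace engineering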
Authors
Leon Seidel,Simon Gehringer,Tobias Raczok,Sven-Nicolas Ivens,Bernd Eckardt,Martin Maerz
Source
Journal: Drones
[Multidisciplinary Digital Publishing Institute]
Date: 2025-05-03
Volume (Issue): 9 (5): 347-347
Identifier
DOI:10.3390/drones9050347
Abstract
Early wildfire detection is critical for effective suppression efforts, necessitating rapid alerts and precise localization. While computer vision techniques offer reliable fire detection, they often lack contextual understanding. This paper addresses this limitation by utilizing Vision Language Models (VLMs) to generate structured scene descriptions from Unmanned Aerial Vehicle (UAV) imagery. UAV-based remote sensing provides diverse perspectives for potential wildfires, and state-of-the-art VLMs enable rapid and detailed scene characterization. We evaluated both cloud-based (OpenAI, Google DeepMind) and open-weight, locally deployed VLMs on a novel evaluation dataset specifically curated for understanding forest fire scenes. Our results demonstrate that relatively compact, fine-tuned VLMs can provide rich contextual information, including forest type, fire state, and fire type. Specifically, our best-performing model, ForestFireVLM-7B (fine-tuned from Qwen2.5-VL-7B), achieved a 76.6% average accuracy across all categories, surpassing the strongest closed-weight baseline (Gemini 2.0 Pro at 65.5%). Furthermore, zero-shot evaluation on the publicly available FIgLib dataset demonstrated state-of-the-art smoke detection accuracy using VLMs. Our findings highlight the potential of fine-tuned, open-weight VLMs for enhanced wildfire situational awareness via detailed scene interpretation.