LLM-driven causal chain extraction: An interpretable framework for autonomous vehicle crash narrative analysis

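Crash; Computer science; Narrative; Poison control; Engineering; Causal chain; Chain (unit); Motor vehicle crash; Vehicle safety; Artificial intelligence; Computer security; Human factors and ergonomics; Risk analysis (engineering); Automation; Accident analysis; Occupational safety and health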

Authors

Hang Su, Jiaming Cao, Zhuoya Li, Sai Tian, Yuanchang Deng

Source

Journal: Traffic Injury Prevention [Taylor & Francis]
Date: 2026-05-13 Volume/Issue: : 1-10

Links

nih.gov, doi.org

Identifier

DOI: 10.1080/15389588.2026.2655345

Abstract

OBJECTIVE: This study aims to establish an interpretable framework for analyzing the root causes of autonomous vehicle (AV) crashes by leveraging unstructured crash narratives. It addresses critical gaps in existing research, including fragmented causal attribution and limited use of textual data for mechanistic insights. METHOD: We propose an integrated framework that combines a Large Language Model (LLM) with Chain-of-Thought (CoT) reasoning to analyze the causal mechanisms of AV crashes using original crash narratives. First, a sentence-level resampling method oversamples the labeled data. Second, an instruction-tuned LLM extracts structured Crash Causality Frames (CCFs), quintuples encoding Movement, Impact, Damage, Effect, and Location, from 931 California DMV crash reports. Third, a system-theoretic taxonomy maps CCF elements to 64 causal indicators across five domains. Finally, CoT reasoning generates stepwise natural-language explanations to enhance interpretability. RESULTS: The optimized LLaMA-70B + LoRA model achieved 86.64% accuracy in CCF extraction, while Data_sCR resampling further improved metrics to 97.93%. Analysis revealed five dominant causation patterns: Pattern 1 (30.5%, pure CV anomalies), Pattern 2 (51.9%, AV-CV interaction failures), and Patterns 3-5 (17.7%, integrating human, environment, and infrastructure factors). Critical cross-domain couplings were identified (A1 and B2), with rear-end collisions (82.06%) predominating in Pattern 2 scenarios. Moreover, the CoT module generates auditable, step-by-step causal chains to enhance interpretability. Under a practical balance between reliability and computational cost, the accuracy of the generated CoT causal chains reaches 91.04%. CONCLUSION: V2X, (2) Developing context-aware sensor fusion for adverse environments, and (3) Implementing standardized tester training protocols for takeover scenarios.
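
The abstract names the five CCF fields and a sentence-level resampling step but gives no implementation details. The following is a minimal Python sketch of what such a data structure and an oversampling routine might look like. Only the five field names come from the abstract; the class name, the helper functions, the subset-based resampling strategy, and the example narrative are all hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CrashCausalityFrame:
    """One CCF quintuple. Field names follow the abstract; types are assumed."""
    movement: str
    impact: str
    damage: str
    effect: str
    location: str


def split_sentences(narrative: str) -> list[str]:
    """Naive period-based split; a real pipeline would use a proper tokenizer."""
    return [s.strip() for s in narrative.split(".") if s.strip()]


def resample_narrative(narrative: str, n_variants: int, seed: int = 0) -> list[str]:
    """Hypothetical sentence-level oversampling (not the paper's Data_sCR method).

    Each variant keeps a random non-empty subset of the original sentences,
    preserving their original order.
    """
    rng = random.Random(seed)
    sentences = split_sentences(narrative)
    if not sentences:
        return []
    variants = []
    for _ in range(n_variants):
        k = rng.randint(1, len(sentences))
        keep = sorted(rng.sample(range(len(sentences)), k))
        variants.append(". ".join(sentences[i] for i in keep) + ".")
    return variants


# Illustrative narrative, invented for this sketch (not from the DMV reports).
example = (
    "The AV was stopped at a red light. "
    "A passenger vehicle struck its rear bumper. "
    "The AV sustained minor damage."
)
for variant in resample_narrative(example, n_variants=3, seed=1):
    print(variant)
```

Dropping sentences can remove the evidence a label depends on, so any scheme of this kind would need a check that each variant still supports its annotated CCF; the abstract does not say how the authors handle this.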
