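Computer science
Complementarity (molecular biology)
Key (lock)
Artificial intelligence
Semantic similarity
Scope (computer science)
Data mining
Similarity (geometry)
Hierarchical database model
Information retrieval
Theoretical computer science
Mechanism (biology)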
Authors
Dongju Yang,Junming Wang
Source
Journal: Applied Sciences
[Multidisciplinary Digital Publishing Institute]
Date: 2026-01-15
Volume (Issue): 16 (2): 903-903
Abstract
Traditional retrieval-augmented generation (RAG) methods struggle with hierarchical documents, often causing semantic fragmentation, structural loss, and inefficient retrieval due to fixed strategies. To address these challenges, this paper proposes HiSem-RAG, a hierarchical semantic-driven RAG method. It comprises three key modules: (1) hierarchical semantic indexing, which preserves boundaries and relationships between sections and paragraphs to reconstruct document context; (2) a bidirectional semantic enhancement mechanism that incorporates titles and summaries to facilitate two-way information flow; and (3) a distribution-aware adaptive threshold strategy that dynamically adjusts retrieval scope based on similarity distributions, balancing accuracy with computational efficiency. On the domain-specific EleQA dataset, HiSem-RAG achieves 82.00% accuracy, outperforming HyDE and RAPTOR by 5.04% and 3.98%, respectively, with reduced computational costs. On the LongQA dataset, it attains a ROUGE-L score of 0.599 and a BERT_F1 score of 0.839. Ablation studies confirm the complementarity of these modules, particularly in long-document scenarios.
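The abstract does not give the formula behind the distribution-aware adaptive threshold, so the following is only a minimal sketch of the general idea, under stated assumptions: the cutoff is taken as the mean of the candidate similarity scores plus `k` standard deviations, and the names `adaptive_select`, `k`, `min_keep`, and `max_keep` are hypothetical and not taken from the paper. It illustrates how a retrieval scope can narrow when one candidate clearly stands out and widen when the scores are flat.

```python
from statistics import mean, pstdev


def adaptive_select(scores, k=1.0, min_keep=1, max_keep=10):
    """Pick candidate indices whose similarity clears a data-driven cutoff.

    Assumption (not from the paper): cutoff = mean(scores) + k * std(scores).
    Returns indices ordered from most to least similar.
    """
    if not scores:
        return []

    # The cutoff moves with the score distribution instead of being fixed.
    threshold = mean(scores) + k * pstdev(scores)

    # Rank candidates by similarity, highest first (ties keep input order).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = [i for i in ranked if scores[i] >= threshold]

    # Always return at least min_keep results, and never more than max_keep.
    if len(kept) < min_keep:
        kept = ranked[:min_keep]
    return kept[:max_keep]


if __name__ == "__main__":
    peaked = [0.90, 0.20, 0.25, 0.30, 0.22]  # one section clearly dominates
    flat = [0.50, 0.50, 0.50]                # no section stands out
    print(adaptive_select(peaked))
    print(adaptive_select(flat))
```

With a peaked distribution only the dominant candidate survives the cutoff, which keeps downstream computation small; with a flat distribution every candidate ties with the cutoff, so the scope stays wide. A real system would likely tune `k` per corpus and apply the rule level by level over the section hierarchy, but those details are not stated in the abstract.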