Automatic summarization
Computer science
Transformer
Information retrieval
Sentence
Encoder
Key
Natural language processing
Semantics (computer science)
Artificial intelligence
Physics
Computer security
Quantum mechanics
Voltage
Programming language
Operating system
Authors
Yiming Wang, Jindong Zhang, Zhiyao Yang, Bing Wang, Jingyi Jin, Yitong Liu
Identifier
DOI: 10.1016/j.ipm.2024.103677
Abstract
In the field of text summarization, extractive techniques aim to extract key sentences from a document to form a summary. However, traditional methods are not sensitive enough to capture the core semantics of the text, resulting in summaries that are difficult to comprehend. Recently, topic extraction techniques have been used to capture the core semantics of a text, enabling summaries that accurately reflect the main points of a document. In this paper, we introduce Topic-Injected Bidirectional Encoder Representations from Transformers (TP-BERT), a novel neural auto-encoder model designed explicitly for extractive summarization. TP-BERT integrates document-related topic words into sentences, improving contextual understanding and more accurately aligning summaries with a document’s main theme, addressing a key shortfall of traditional extractive methods. Another major innovation of TP-BERT is the use of contrastive learning during training. This method enhances summarization efficiency by giving prominence to key sentences and minimizing peripheral information. Additionally, we conducted ablation and parameter studies of TP-BERT on the CNN/DailyMail, WikiHow, and XSum datasets. In our two main experiments, the average ROUGE-F1 score improved by 2.69 and 0.45 across the three datasets. Compared with baseline methods, TP-BERT demonstrates better performance, as reflected by the increase in ROUGE-F1 scores on all three datasets. Moreover, the semantic differentiation between sentence representations has also contributed positively to the performance gains.
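The abstract describes two mechanisms without implementation detail: injecting document-level topic words into each sentence before encoding, and a contrastive objective over sentence representations. The sketch below is an illustrative interpretation of those two ideas, not the authors' implementation; the helper names (inject_topic_words, contrastive_loss), the use of bert-base-uncased, the [CLS] pooling, the InfoNCE-style loss, and the temperature value are all assumptions.

```python
# Minimal sketch (assumptions throughout) of topic-injected sentence encoding
# plus a contrastive loss that separates summary-worthy sentences from the rest.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def inject_topic_words(sentence: str, topic_words: list[str]) -> str:
    # Hypothetical topic injection: prefix the document's topic words so the
    # encoder sees the global theme alongside the local sentence.
    return " ".join(topic_words) + " [SEP] " + sentence

def encode(sentences: list[str]) -> torch.Tensor:
    # Encode each (topic-injected) sentence and use its [CLS] vector as the
    # sentence representation.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    out = encoder(**batch)
    return out.last_hidden_state[:, 0]

def contrastive_loss(reps: torch.Tensor, is_summary: torch.Tensor,
                     temperature: float = 0.1) -> torch.Tensor:
    # InfoNCE-style objective (an assumed stand-in for the paper's loss):
    # summary sentences are pulled together; non-summary sentences act as negatives.
    reps = F.normalize(reps, dim=-1)
    sim = reps @ reps.T / temperature
    sim.fill_diagonal_(-1e9)                      # exclude self-similarity
    pos_mask = (is_summary.unsqueeze(0) & is_summary.unsqueeze(1)).float()
    pos_mask.fill_diagonal_(0)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss_per_sentence = -(pos_mask * log_prob).sum(dim=1) / pos_counts
    return loss_per_sentence[is_summary].mean()   # average over summary sentences

# Hypothetical usage on a toy document.
topic_words = ["summarization", "topic", "contrastive"]
sentences = ["We propose TP-BERT.", "The weather was nice.", "It injects topic words."]
labels = torch.tensor([True, False, True])
reps = encode([inject_topic_words(s, topic_words) for s in sentences])
loss = contrastive_loss(reps, labels)
```

In this reading, the topic prefix gives every sentence representation access to the document's theme, and the contrastive term encourages key sentences to form a distinct cluster in embedding space, which would make them easier to select at extraction time.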