Automatic summarization
Computer science
Transformer
Encoder
Sentence
Artificial intelligence
Natural language processing
Representation
Benchmark
ROUGE
Information retrieval
Authors
Shihao Yang, Shaoru Zhang, Ming Fang, Fengqin Yang, Shuhua Liu
Source
Journal: Electronics [MDPI AG]
Date: 2022-05-27
Volume/Issue: 11 (11): 1706
Citations: 11
Identifier
DOI: 10.3390/electronics11111706
Abstract
Automatic text summarization compresses a document while preserving the main ideas of the original text; it comprises extractive summarization and abstractive summarization. Extractive summarization selects important sentences from the original document to serve as the summary, so the document representation method is crucial to the quality of the generated summary. To represent the document effectively, we propose Long-Trans-Extr, a hierarchical document representation model for extractive summarization that uses Longformer as the sentence encoder and a Transformer as the document encoder. The advantage of Longformer as the sentence encoder is that it can accept long inputs of up to 4096 tokens while adding relatively little computation. Long-Trans-Extr is evaluated on three benchmark datasets: CNN (Cable News Network), DailyMail, and the combined CNN/DailyMail. It achieves 43.78 (ROUGE-1) and 39.71 (ROUGE-L) on CNN/DailyMail, and 33.75 (ROUGE-1), 13.11 (ROUGE-2), and 30.44 (ROUGE-L) on CNN. These results are very competitive and show that the model performs particularly well on long documents, such as those in the CNN corpus.
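The ROUGE-1 and ROUGE-L scores reported above measure, respectively, unigram overlap and longest-common-subsequence overlap between a generated summary and a reference summary. The minimal sketch below illustrates how these two F1-style scores are computed; it uses simple whitespace tokenization and is an illustration, not the official ROUGE implementation used in the paper's evaluation.

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: unigram overlap between candidate and reference."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Clipped unigram matches: each reference token counts at most once.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def _lcs_len(a: list, b: list) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1: based on the longest common subsequence, which
    rewards in-order (but not necessarily contiguous) matches."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = _lcs_len(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

An identical candidate and reference score 1.0 on both metrics; ROUGE-L differs from ROUGE-1 in that it requires matched tokens to appear in the same order.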