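Readability
Artificial intelligence
Computer science
Natural language processing
Machine learning
Cohesion (chemistry)
Correlation
Clarity
Regression
Regularization (linguistics)
Regression analysis
Statistics
Mathematics
Organic chemistry
Chemistry
Programming language
Geometry
Biochemistry
Authors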
Andrey Laputenko,Tatiana Evtushenko,Elena Klochkova,Nina Yevtushenko
Identifier
DOI:10.1109/edm58354.2023.10225121
Abstract
The paper addresses the problem of assessing media text readability based on the correlation between objective and subjective text complexity. Objective text complexity refers to a set of measurable characteristics (baseline, morphological, syntactic and lexical) of a media text, here a piece of news published on a university website; these characteristics are calculated automatically for each text. In total, 34 parameters are considered. A subjective text complexity score is assigned to each text by human experts, who assess the clarity, structure, cohesion and coherence of the text. The correlation between subjective and objective text complexity is studied using machine learning models. The readability level is measured using machine learning methods and regression-correlation analysis, namely an artificial neural network and regression models. It is demonstrated that the use of polynomial features and lasso regularization yields a compact, high-quality regression model. The chosen machine learning techniques also made it possible to estimate the impact of distinct linguistic features on the readability level.
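To illustrate how polynomial features combined with lasso regularization can produce a compact regression model, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the three inputs `x0`, `x1`, `x2` stand in for hypothetical linguistic parameters (the paper uses 34), the scores are synthetic rather than expert ratings, and the lasso is fitted with a small hand-written coordinate-descent routine instead of a library.

```python
import itertools
import math
import random


def poly_terms(n_inputs, degree):
    """Index tuples for every monomial of total degree 1..degree."""
    terms = []
    for d in range(1, degree + 1):
        terms.extend(itertools.combinations_with_replacement(range(n_inputs), d))
    return terms


def expand(row, terms):
    """Evaluate each monomial on one input row."""
    return [math.prod(row[i] for i in term) for term in terms]


def standardize(X):
    """Scale every column to zero mean and unit (population) variance."""
    n = len(X)
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def soft_threshold(z, alpha):
    """Shrink z towards zero by alpha; values within [-alpha, alpha] become 0."""
    if z > alpha:
        return z - alpha
    if z < -alpha:
        return z + alpha
    return 0.0


def lasso(X, y, alpha, sweeps=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1.

    Assumes standardized columns of X and a centered y.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the residual that ignores w[j].
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + w[j]
            new_w = soft_threshold(rho, alpha)
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic data (assumption): the score depends on x0 and on the
# interaction x0*x1 only; x2 and all other terms are irrelevant.
random.seed(0)
raw = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
scores = [2.0 * r[0] + 1.5 * r[0] * r[1] + random.gauss(0, 0.1) for r in raw]

terms = poly_terms(3, 2)
names = ["*".join(f"x{i}" for i in t) for t in terms]
X = standardize([expand(r, terms) for r in raw])
y_mean = sum(scores) / len(scores)
y = [s - y_mean for s in scores]

weights = lasso(X, y, alpha=0.2)
selected = {name: w for name, w in zip(names, weights) if w != 0.0}
print(selected)
```

The L1 penalty drives the coefficients of uninformative polynomial terms to exactly zero, so the surviving entries in `selected` indicate which features and interactions carry the signal. This is the sense in which such a model is compact, and it is one way the impact of individual features can be read off from a fitted regression.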