Sentence Completion Test
Psychology
Natural Language Processing
Test (biology)
Sentence (law)
Artificial Intelligence
Id, Ego and Super-ego
Loevinger's Stages of Ego Development
Linguistics
Computer Science
Social Psychology
Biology
Philosophy
Paleontology
Identifier
DOI:10.3389/fpsyg.2025.1488102
Abstract
The emergence of artificial intelligence and the widespread availability of large language models open the door to text analysis at scale using complex classification instructions. This case study explores the possibility of using available large language models to measure ego development at scale and to establish a methodology that can be applied to other classification instructions. The ego consists of the traits that influence how a person perceives and engages with the world, while ego development is a crucial aspect of adult personality growth, influencing behaviors and decisions in both personal and professional contexts. Accurate assessment of ego development stages is vital for creating effective strategies in organizational psychology and corporate analytics. This case study investigates the agreement between expert and automated classifications of ego development stages, aiming to evaluate the potential of automation in this domain using artificial intelligence and large language models. Cohen's kappa statistic was used to measure the agreement between classifications made by experts and those generated by an automated process based on large language models. The comparison between expert and large language model scoring yielded a weighted kappa of 0.779, indicating a substantial level of agreement that is statistically meaningful and unlikely to be due to chance. While this suggests that scoring with large language models is valuable, it also highlights the opportunity for further refinement to match expert assessments more closely. We observed low variability in aggregated values, demonstrating that the automated process functions effectively at scale. The robustness of aggregated data is particularly evident when calculating ego development scores for individuals, groups, corporate units, and entire corporations.
This capability underscores the utility of the automated system for high-level evaluations and decision-making based on a solid indicator. While the classification system developed in this case study shows promise, targeted enhancements may help achieve a level of accuracy and reliability that improves alignment with expert evaluations of single sentences. The methodology developed in this case study also appears useful for supporting other evaluations at scale that apply large language models to other classification maps.
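The agreement statistic reported above, a weighted Cohen's kappa of 0.779, compares ordinal stage labels from two raters while penalizing larger disagreements more heavily. A minimal sketch of how such a statistic can be computed is shown below; the rating data, stage labels, and quadratic weighting scheme are illustrative assumptions, since the paper's actual data and weighting choice are not given here.

```python
# Minimal sketch: quadratic-weighted Cohen's kappa for ordinal labels.
# The ego-development stage labels and ratings below are hypothetical
# examples, not data from the study.

def weighted_kappa(rater_a, rater_b, categories):
    """Quadratic-weighted Cohen's kappa between two raters on ordinal categories."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed confusion matrix as proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal distributions (expected agreement assumes independence).
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Quadratic disagreement weights: 0 on the diagonal, growing with distance.
    def weight(i, j):
        return ((i - j) ** 2) / ((k - 1) ** 2)

    disagree_obs = sum(weight(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    disagree_exp = sum(weight(i, j) * marg_a[i] * marg_b[j]
                       for i in range(k) for j in range(k))
    return 1.0 - disagree_obs / disagree_exp

# Hypothetical expert vs. automated ratings over Loevinger-style stages.
stages = ["E4", "E5", "E6", "E7"]
expert = ["E4", "E5", "E5", "E6", "E4", "E7"]
model  = ["E4", "E5", "E6", "E6", "E4", "E7"]

print(round(weighted_kappa(expert, model, stages), 3))
```

A value near 1 indicates near-perfect agreement, 0 indicates chance-level agreement, and the quadratic weights ensure that confusing adjacent stages costs less than confusing distant ones, which matches the ordinal nature of ego development stages.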