Sentence Completion Test
Psychology
Natural Language Processing
Test (biology)
Sentence (law)
Artificial Intelligence
Id, Ego and Super-ego
Loevinger's Stages of Ego Development
Linguistics
Computer Science
Social Psychology
Biology
Philosophy
Paleontology
Identifier
DOI:10.3389/fpsyg.2025.1488102
Abstract
The emergence of artificial intelligence and the widespread availability of large language models open the door to text analysis at scale using complex classification instructions. This case study explores the possibility of using available large language models to measure ego development at scale and to establish a methodology that can be applied to other classification instructions. The ego consists of the traits that influence how a person perceives and engages with the world, while ego development is a crucial aspect of adult personality growth, influencing behaviors and decisions in both personal and professional contexts. Accurate assessment of ego development stages is vital for creating effective strategies in organizational psychology and corporate analytics. This case study investigates the agreement between expert and automated classifications of ego development stages, aiming to evaluate the potential of automation in this domain using artificial intelligence and large language models. Cohen's kappa statistic was used to measure the agreement between classifications made by experts and those generated by an automated process based on large language models. The comparison between expert and large language model scoring yielded a weighted kappa of 0.779, indicating a substantial level of agreement that is statistically meaningful and unlikely to be due to chance. While this suggests that scoring with large language models is valuable, it also highlights the opportunity for further refinement to match expert assessments more closely. We observed low variability in aggregated values, demonstrating that the automated process functions effectively at scale. The robustness of aggregated data is particularly evident when calculating ego development scores for individuals, groups, corporate units, and entire corporations.
This capability underscores the utility of the automated system for high-level evaluations and decision-making based on a solid indicator. While the classification system developed in this case study shows promise, targeted enhancements may help achieve a level of accuracy and reliability that improves alignment with expert evaluations of single sentences. The methodology developed in this case study also appears useful for supporting other evaluations at scale that apply large language models to other classification maps.
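The agreement statistic reported above, a weighted Cohen's kappa of 0.779, compares ordinal stage labels from two raters while penalizing larger disagreements more heavily. A minimal sketch of how such a statistic can be computed is shown below; the rating data, stage labels, and quadratic weighting scheme are illustrative assumptions, since the paper's actual data and weighting choice are not given here.

```python
# Minimal sketch: quadratic-weighted Cohen's kappa for ordinal labels.
# The ego-development stage labels and ratings below are hypothetical
# examples, not data from the study.

def weighted_kappa(rater_a, rater_b, categories):
    """Quadratic-weighted Cohen's kappa between two raters on ordinal categories."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed confusion matrix as proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal distributions (expected agreement assumes independence).
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Quadratic disagreement weights: 0 on the diagonal, growing with distance.
    def weight(i, j):
        return ((i - j) ** 2) / ((k - 1) ** 2)

    disagree_obs = sum(weight(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    disagree_exp = sum(weight(i, j) * marg_a[i] * marg_b[j]
                       for i in range(k) for j in range(k))
    return 1.0 - disagree_obs / disagree_exp

# Hypothetical expert vs. automated ratings over Loevinger-style stages.
stages = ["E4", "E5", "E6", "E7"]
expert = ["E4", "E5", "E5", "E6", "E4", "E7"]
model  = ["E4", "E5", "E6", "E6", "E4", "E7"]

print(round(weighted_kappa(expert, model, stages), 3))
```

A value near 1 indicates near-perfect agreement, 0 indicates chance-level agreement, and the quadratic weights ensure that confusing adjacent stages costs less than confusing distant ones, which matches the ordinal nature of ego development stages.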