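Inequality
Key (lock)
Computer science
Process (computing)
Productivity
Artificial intelligence
Face (sociological concept)
Work (physics)
Sociology
Knowledge management
Linguistics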
Authors
Chenglong Wang, Haoyu Tang, Xiyuan Yang, Yueqi Xie, Jina Suh, Sunayana Sitaram, Junming Huang, Yu Xie, Pengjun Zhao, Zhaoya Gong, Xing Xie, Fangzhao Wu
Identifier
DOI:10.1073/pnas.2514626122
Abstract
As large language models (LLMs) gradually demonstrate their potential to boost productivity and become integral tools for problem-solving in daily life worldwide, understanding the linguistic inequalities they introduce is becoming increasingly important. Prior research has primarily focused on static analyses of disparities in existing knowledge and capabilities of LLMs across languages. However, LLMs are continuously evolving, acquiring new knowledge to provide current, relevant responses and deliver precise, expert-level answers in specific domains. Investigating linguistic inequalities within this dynamic learning process is, therefore, also essential. In this paper, we explore inequalities in new knowledge learning by LLMs across different languages and four key dimensions: effectiveness, transferability, prioritization, and robustness. Through extensive experiments in both in-context learning and fine-tuning settings, with proprietary and open-source models, we reveal four key findings: 1) LLMs face greater challenges in efficiently and accurately learning new knowledge in lower-resource languages; 2) knowledge learned by LLMs tends to be more easily transferred to higher-resource languages than to lower-resource ones; 3) new knowledge in higher-resource languages is more likely to be retained and prioritized; and 4) LLMs are more robust against incorrect or misleading information in higher-resource languages. We further analyze the underlying causes of these inequalities from linguistic perspectives, pretraining characteristics, and tokenizer design, and propose a preliminary mitigation strategy through the lens of linguistic neurons. This work highlights the urgent need to recognize and address emerging linguistic inequalities in the development of LLMs.