Topics
Scaling, Hyperbolic space, Euclidean geometry, Encoding (set theory), Variety (cybernetics), Representation (politics), Encoding (memory), Theoretical computer science, Space (punctuation), Feature (linguistics), Euclidean distance, Natural language processing, Euclidean space, Computer science, Key (lock), Source code, Language model, Linguistics, Artificial intelligence, Pure mathematics, Mathematics, Programming language, Philosophy, Geometry, Law, Set (abstract data type), Computer security, Operating system, Political science, Politics
Authors
Weize Chen, Xu Han, Yankai Lin, Kaichen He, Ruobing Xie, Jie Zhou, Zhiyuan Liu, Maosong Sun
Identifier
DOI: 10.1109/taslp.2024.3407575
Abstract
In recent years, we have witnessed significant improvements in pre-trained language models (PLMs) brought about by the scaling of parameter sizes and data amounts. However, this scaling also brings high computational and storage costs. In this paper, we present a new direction to improve PLMs without scaling parameters and data: adopting a geometric feature space that is more suitable for encoding the intrinsic structured features of text. Although text is generally considered unstructured data, it possesses rich intrinsic structured features that signify syntactic and semantic relationships. Leveraging these structured features is vital for text understanding. Given that structured features are better encoded in hyperbolic spaces than in the Euclidean spaces used by conventional PLMs, we propose that PLMs should operate entirely within hyperbolic spaces. Our experiments demonstrate the superiority of hyperbolic PLMs over Euclidean PLMs across a wide variety of tasks, using the same parameter and data settings. This suggests that altering the geometry of model representation is a promising direction for model enhancement. The code is released at https://github.com/thunlp/hyperbolic_llm.
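To give intuition for the abstract's central claim, the sketch below illustrates why hyperbolic geometry suits hierarchical (tree-like) structure: geodesic distance in the Poincaré ball grows roughly exponentially toward the boundary, matching the exponential growth of nodes in a tree. This is a minimal illustration only, assuming the Poincaré-ball model for simplicity; the paper's actual model and implementation details may differ (see the linked repository), and the helper name poincare_distance is hypothetical.

# Minimal sketch, NOT the paper's implementation: Poincare-ball geodesic
# distance, d(u, v) = arccosh(1 + 2*||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2))).
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance between points u, v inside the unit Poincare ball."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_dist / denom))

# Points near the boundary are far apart hyperbolically even when their
# Euclidean separation is modest, leaving room to embed exponentially
# many tree leaves with low distortion.
a = np.array([0.90, 0.00])
b = np.array([0.00, 0.90])
print(np.linalg.norm(a - b))    # Euclidean distance: ~1.27
print(poincare_distance(a, b))  # hyperbolic distance: ~5.20

The gap between the two printed values widens as points approach the boundary, which is the geometric property the abstract appeals to when arguing that structured features of text embed better in hyperbolic space.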