Foundation (evidence)
Computer science
Ranging
Language model
State (computer science)
Artificial intelligence
Programming language
Archaeology
Geography
Telecommunications
Authors
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Édouard Grave, Guillaume Lample
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Citation count: 2511
Identifiers
DOI: 10.48550/arxiv.2302.13971
Abstract
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.