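Word lists by frequency
Subtitle
Word (group theory)
Variety (cybernetics)
Linguistics
The Internet
Variance (accounting)
Psychology
Natural language processing
Artificial intelligence
Computer science
World Wide Web
Philosophy
Business
Accounting
Judgement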
Authors
Boris New, Marc Brysbaert, Jean Véronis, Christophe Pallier
Identifier
DOI: 10.1017/s014271640707035x
Abstract
We examine the use of film subtitles as an approximation of word frequencies in human interactions. Because subtitle files are widely available on the Internet, they may present a fast and easy way to obtain word frequency measures in language registers other than text writing. We compiled a corpus of 52 million French words, coming from a variety of films. Frequency measures based on this corpus compared well to other spoken and written frequency measures, and explained variance in lexical decision times in addition to what is accounted for by the available French written frequency measures.
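As a rough illustration of what a frequency measure derived from subtitle text could look like, the sketch below counts word tokens and scales the counts to occurrences per million tokens. It is not the authors' pipeline: the function name `frequency_per_million`, the letters-only tokenizer, and the two sample lines are assumptions made for this example, and the input is assumed to be subtitle dialogue with timestamps and cue numbers already removed.

```python
import re
from collections import Counter


def frequency_per_million(lines):
    """Count word tokens and scale each count to occurrences per million tokens.

    Hypothetical helper for illustration only; the tokenization here is a
    simplification and not the procedure used in the paper.
    """
    # Runs of Unicode letters only, so "l'homme" is split into "l" and "homme".
    token_re = re.compile(r"[^\W\d_]+")
    counts = Counter()
    for line in lines:
        counts.update(token_re.findall(line.lower()))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n * 1_000_000 / total for word, n in counts.items()}


# Made-up sample lines, assumed to be subtitle text without timing information.
sample = ["Je ne sais pas.", "Tu ne sais rien !"]
freqs = frequency_per_million(sample)
```

Per-million scaling is a common normalization that lets counts from corpora of different sizes be compared on one scale; on a corpus as small as this sample the values are of course not meaningful.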