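Conversation
Set (abstract data type)
Frame (networking)
Computer science
Natural language processing
Word (group theory)
Linguistics
Documentation
Sequence (biology)
Artificial intelligence
Conversation analysis
Variable (mathematics)
Mathematics
Philosophy
Telecommunications
Mathematical analysis
Biology
Genetics
Programming language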
Authors
Bethany Gray, Douglas Biber
Identifier
DOI:10.1075/ijcl.18.1.08gra
Abstract
While lexical bundles research identifies continuous sequences (e.g. the end of the, I don’t know if), researchers have also been interested in discontinuous sequences in which words form a ‘frame’ surrounding a variable slot (e.g. I don’t * to, it is * to). To date, most research has focused on a few intuitively-selected frames, or has begun with frequent continuous sequences and then analyzed those to identify associated frames. Few previous studies have attempted to directly identify the full set of discontinuous sequences in a corpus. In the present study, we work towards that goal, using a corpus-driven approach to identify the set of recurrent four-word continuous and discontinuous patterns in corpora of conversation and academic writing. This direct computational analysis of the corpora reveals a more complete set of frames than alternative approaches, resulting in the documentation of highly frequent frames that have not been identified in previous research.
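The abstract does not spell out the extraction procedure, so the following is only a minimal sketch of one way to count four-word continuous sequences and the frames derived from them directly from a token stream. It assumes that the variable slot is one of the two internal positions of a four-word window and that tokens are whitespace-separated and lowercased; the function name `four_word_patterns` and the toy sentence are hypothetical and are not taken from the study.

```python
from collections import Counter


def four_word_patterns(tokens):
    """Count continuous 4-word sequences and frames with one internal slot.

    Assumption (not from the paper): a frame is a 4-word window in which
    the second or third word is replaced by the wildcard '*'.
    """
    bundles = Counter()  # continuous sequences, e.g. ("i", "don't", "want", "to")
    frames = Counter()   # discontinuous patterns, e.g. ("i", "don't", "*", "to")
    for i in range(len(tokens) - 3):
        gram = tuple(tokens[i:i + 4])
        bundles[gram] += 1
        for slot in (1, 2):  # internal positions only
            frame = gram[:slot] + ("*",) + gram[slot + 1:]
            frames[frame] += 1
    return bundles, frames


# Toy input, invented for illustration.
tokens = "i don't want to go but i don't have to stay".split()
bundles, frames = four_word_patterns(tokens)

# Show the most frequent frames in the toy input.
for frame, count in frames.most_common(3):
    print(" ".join(frame), count)
```

In this toy input the frame `i don't * to` is counted twice even though each of its continuous realizations occurs only once, which illustrates why counting frames directly can surface patterns that a search starting from frequent continuous sequences would miss. A real analysis would also need frequency and dispersion thresholds, which are not specified in the abstract.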