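Laughter
Hidden Markov model
Speech recognition
Computer science
Probabilistic logic
Telephone
Silence
Segmentation
Interval (graph theory)
Nonverbal communication
Statistical model
Natural language processing
Artificial intelligence
Communication
Psychology
Mathematics
Aesthetics
Combinatorics
Philosophy
Neuroscience
Linguistics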
Authors
Hugues Salamin, Anna Polychroniou, Alessandro Vinciarelli
Abstract
This article presents experiments on automatic detection of laughter and fillers, two of the most important nonverbal behavioral cues observed in spoken conversations. The proposed approach is fully automatic and segments audio recordings captured with mobile phones into four types of interval: laughter, filler, speech and silence. The segmentation methods rely not only on probabilistic sequential models (in particular Hidden Markov Models), but also on Statistical Language Models aimed at estimating the a-priori probability of observing a given sequence of the four classes above. The experiments are speaker independent and performed over a total of 8 hours and 25 minutes of data (120 people in total). The results show that F1 scores up to 0.64 for laughter and 0.58 for fillers can be achieved.
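The idea of combining per-frame acoustic scores with a prior over class sequences can be illustrated with a small Viterbi decoder. This is a minimal sketch and not the authors' implementation: it assumes one label per frame, a bigram prior over the four classes (the abstract does not state the order of the language model), and made-up toy probabilities. The function name `viterbi` and the `lm_weight` parameter are hypothetical and introduced only for illustration.

```python
import math

CLASSES = ["laughter", "filler", "speech", "silence"]


def viterbi(log_emissions, log_bigram, log_initial, lm_weight=1.0):
    """Return the highest-scoring class sequence for a list of frames.

    log_emissions: list of dicts, one per frame, class -> log-likelihood
    log_bigram:    dict (prev, cur) -> log P(cur | prev), the sequence prior
    log_initial:   dict class -> log P(class) for the first frame
    lm_weight:     hypothetical scaling factor applied to the prior terms
    """
    # best[c] = score of the best path that ends in class c at the current frame
    best = {c: lm_weight * log_initial[c] + log_emissions[0][c] for c in CLASSES}
    backpointers = []
    for frame in log_emissions[1:]:
        new_best, pointers = {}, {}
        for cur in CLASSES:
            # Pick the predecessor that maximizes path score plus transition prior
            prev = max(CLASSES, key=lambda p: best[p] + lm_weight * log_bigram[(p, cur)])
            new_best[cur] = best[prev] + lm_weight * log_bigram[(prev, cur)] + frame[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final class
    path = [max(CLASSES, key=lambda c: best[c])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy prior (assumed values): staying in a class is much likelier than switching
log_bigram = {
    (p, c): math.log(0.9 if p == c else 0.1 / 3) for p in CLASSES for c in CLASSES
}
log_initial = {c: math.log(0.25) for c in CLASSES}


def frame_scores(favored):
    """Toy acoustic scores: the favored class gets 0.4, the others 0.2 each."""
    return {c: math.log(0.4 if c == favored else 0.2) for c in CLASSES}


# Five frames where the acoustic scores weakly favor laughter in the middle
frames = [frame_scores(c) for c in ["speech", "speech", "laughter", "speech", "speech"]]

with_prior = viterbi(frames, log_bigram, log_initial)
without_prior = viterbi(frames, log_bigram, log_initial, lm_weight=0.0)
print(with_prior)     # ['speech', 'speech', 'speech', 'speech', 'speech']
print(without_prior)  # ['speech', 'speech', 'laughter', 'speech', 'speech']
```

In this toy setting the sequence prior smooths away the isolated one-frame laughter label, while setting `lm_weight` to zero reduces the decoder to a per-frame argmax over the acoustic scores.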