
Child-adult speech diarization in naturalistic conditions of preschool classrooms using room-independent ResNet model and automatic speech recognition-based re-segmentation

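Speaker diarization, Segmentation, Context (archaeology), Computer science, Speech recognition, Psychology, Natural language processing, Artificial intelligence, Speaker recognition, Biology, Paleontology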
Authors
Prasanna V. Kothalkar,John H. L. Hansen,Dwight Irvin,Jay Buzhardt
Source
Journal: Journal of the Acoustical Society of America [Acoustical Society of America]
Volume (issue): 155 (2): 1198-1215 Citations: 1
Identifier
DOI:10.1121/10.0024353
Abstract

Speech and language development are early indicators of overall analytical and learning ability in children. The preschool classroom is a rich language environment for monitoring and ensuring growth in young children by measuring their vocal interactions with teachers and classmates. Early childhood researchers are naturally interested in analyzing naturalistic versus controlled lab recordings to measure both the quality and quantity of such interactions. Unfortunately, present-day speech technologies are not capable of handling the highly dynamic conditions of early childhood classroom settings. Due to the diversity of acoustic events and conditions in such daylong audio streams, automated speaker diarization technology would need to be advanced to address this challenging domain, both for segmenting audio and for extracting information. This study investigates alternative lightweight, knowledge-distilled, deep learning-based diarization solutions for segmenting classroom interactions of 3- to 5-year-old children with teachers. In this context, the focus is on speech-type diarization, which classifies speech segments, partitioned across multiple classrooms, as coming from either adults or children. Our lightest CNN model achieves a best F1-score of ∼76.0% on data from two classrooms, based on dev and test sets of each classroom. It is utilized with automatic speech recognition-based re-segmentation modules to perform child-adult diarization. Additionally, F1-scores are obtained for individual segments with corresponding speaker tags (e.g., adult vs child), which provide knowledge for educators on child engagement through naturalistic communications. The study demonstrates the prospects of addressing educational assessment needs through communication audio stream analysis, while maintaining both the security and privacy of all children and adults. The resulting child communication metrics have been used for broad-based feedback for teachers with the help of visualizations.
