Detection and Segmentation of Mouth Region in Stereo Stream Using YOLOv6 and DeepLab v3+ Models for Computer-Aided Speech Diagnosis in Children

Computer science; Segmentation; Artificial intelligence
Authors
Agata Sage, Paweł Badura
Source
Journal: Applied Sciences [Multidisciplinary Digital Publishing Institute]
Volume (Issue): 14 (16): 7146-7146 Citations: 1
Identifier
DOI: 10.3390/app14167146
Abstract

This paper describes a multistage framework for face image analysis in computer-aided speech diagnosis and therapy. Multimodal data processing frameworks have become a significant factor in supporting speech disorders’ treatment. Synchronous and asynchronous remote speech therapy approaches can use audio and video analysis of articulation to deliver robust indicators of disordered speech. Accurate segmentation of articulators in video frames is a vital step in this agenda. We use a dedicated data acquisition system to capture the stereovision stream during speech therapy examination in children. Our goal is to detect and accurately segment four objects in the mouth area (lips, teeth, tongue, and whole mouth) during relaxed speech and speech therapy exercises. Our database contains 17,913 frames from 76 preschool children. We apply a sequence of procedures employing artificial intelligence. For detection, we train the YOLOv6 (you only look once) model to catch each of the three objects under consideration. Then, we prepare the DeepLab v3+ segmentation model in a semi-supervised training mode. As preparation of reliable expert annotations is exhausting in video labeling, we first train the network using weak labels produced by initial segmentation based on the distance-regularized level set evolution over fuzzified images. Next, we fine-tune the model using a portion of manual ground-truth delineations. Each stage is thoroughly assessed using the independent test subset. The lips are detected almost perfectly (average precision and F1 score of 0.999), whereas the segmentation Dice index exceeds 0.83 in each articulator, with a top result of 0.95 in the whole mouth.
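
The segmentation results above are reported as Dice index values. As a minimal sketch of how that metric is commonly computed for one binary mask pair, assuming the standard definition 2|A∩B| / (|A| + |B|): the function name `dice_index`, the nested-list mask format, and the empty-mask convention are illustrative assumptions, not details taken from the paper.

```python
def dice_index(pred, truth):
    """Dice index between two binary masks given as equal-sized nested lists of 0/1."""
    # Flatten both masks into lists of booleans (pixel is foreground or not).
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both the prediction and the reference.
    intersection = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x3 masks (hypothetical data): 2 overlapping pixels, 3 foreground pixels each.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
print(round(dice_index(pred, truth), 3))
```

A value of 1 means the predicted and reference regions coincide exactly, and 0 means they do not overlap at all. For binary masks this quantity is the same as the pixel-wise F1 score.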
