Computer Science
Robustness (evolution)
Speech Recognition
Human-Computer Interaction
Chemistry
Biochemistry
Gene
Authors
Penghao Dong,Yuanqing Song,Shangyouqiao Yu,Zimeng Zhang,Sandeep K. Mallipattu,Petar M. Djurić,Shanshan Yao
Source
Journal: Small
[Wiley]
Date: 2023-01-26
Volume/Issue: 19 (17)
Citations: 4
Identifier
DOI:10.1002/smll.202205058
Abstract
Lip-reading provides an effective speech-communication interface for people with voice disorders and for intuitive human-machine interaction. Existing systems are generally hampered by bulkiness, obtrusiveness, and poor robustness against environmental interference. The lack of a truly natural and unobtrusive system for converting lip movements to speech precludes the continuous use and wide-scale deployment of such devices. Here, the design of a hardware-software architecture to capture, analyze, and interpret lip movements associated with either normal or silent speech is presented. The system recognizes both distinct and similar visemes, and it remains robust in noisy or dark environments. Self-adhesive, skin-conformable, and semi-transparent dry electrodes are developed to track high-fidelity speech-relevant electromyogram signals without impeding daily activities. The resulting skin-like sensors form seamless contact with the curvilinear, dynamic surfaces of the skin, which is crucial for a high signal-to-noise ratio and minimal interference. Machine-learning algorithms are employed to decode the electromyogram signals and convert them into spoken words. Finally, applications of the developed lip-reading system in augmented reality and medical services are demonstrated, illustrating its great potential for immersive interaction and healthcare.
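The decoding pipeline the abstract describes (sEMG capture → feature extraction → machine-learning classification into words) can be sketched minimally as follows. This is an illustrative toy, not the paper's actual model: the features (RMS amplitude, zero-crossing rate), the nearest-centroid classifier, the synthetic signals, and the "viseme" labels `"ba"`/`"pa"` are all assumptions made for demonstration.

```python
import numpy as np

def emg_features(window):
    """Simple time-domain features for one sEMG window:
    root-mean-square amplitude and zero-crossing rate."""
    rms = np.sqrt(np.mean(window ** 2))
    zcr = np.mean(np.diff(np.sign(window)) != 0)
    return np.array([rms, zcr])

class NearestCentroidDecoder:
    """Toy word decoder: assigns a feature vector to the class with the
    nearest training centroid. A stand-in for the (unspecified)
    machine-learning model used in the paper."""

    def fit(self, X, y):
        self.centroids_ = {
            label: np.mean([x for x, yi in zip(X, y) if yi == label], axis=0)
            for label in set(y)
        }
        return self

    def predict(self, x):
        return min(self.centroids_,
                   key=lambda label: np.linalg.norm(x - self.centroids_[label]))

# Train on synthetic data: two articulations that differ only in
# muscle-activation intensity (purely illustrative, not real sEMG).
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(20):
    X.append(emg_features(0.1 * rng.standard_normal(256))); y.append("ba")  # weak activation
    X.append(emg_features(1.0 * rng.standard_normal(256))); y.append("pa")  # strong activation

decoder = NearestCentroidDecoder().fit(X, y)
word = decoder.predict(emg_features(1.0 * rng.standard_normal(256)))
```

In practice, the paper's system would replace the synthetic windows with signals from the skin-conformable dry electrodes and the toy classifier with a trained machine-learning model; the overall capture-featurize-classify structure is the same.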