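Computer science
Artificial intelligence
Natural language processing
Classifier (UML)
Information retrieval
Normalization (sociology)
Pattern recognition (psychology)
Anthropology
Sociology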
Authors
Qi Jia, Dezheng Zhang, Shibing Yang, Chao Xia, Yingjie Shi, Tao Hu, Cong Xu, Xiong Luo, Yuekun Ma, Ying Xie
Identifier
DOI:10.1016/j.jbi.2021.103718
Abstract
Traditional Chinese medicine (TCM) symptom normalization is difficult for three reasons: the same symptom can have different literal descriptions, a single description can correspond to multiple symptoms, and different symptoms can share a similar literal description. To address these problems, we propose a novel two-step approach that uses hierarchical semantic information representing the functional characteristics of symptoms, and we develop a text matching model that integrates this information through an attention mechanism. In this study, we constructed a symptom normalization dataset and a TCM dictionary of normalized symptom words, and assigned the symptoms to 24 classes of functional characteristics. First, we built a multi-label text classifier to extract the hierarchical semantic information from each symptom description, count the corresponding normalized symptoms, and filter the candidate set. Then we designed a text matching model that mixes multi-granularity language features with an attention mechanism and uses the hierarchical semantic information to calculate the matching score between the symptom description and the normalized symptom words. We compared our approach with other baselines on real-world data. Our approach gives the best performance, with Hit@1, Hit@3, and Hit@10 of 0.821, 0.953, and 0.993, respectively, and a MeanRank of 1.596, significantly outperforming the baselines on the symptom normalization task. These results indicate the promise of this research direction.
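The abstract reports Hit@k and MeanRank without defining them. The sketch below uses the usual ranking definitions, which are assumed here and not confirmed by the abstract: Hit@k is the fraction of symptom descriptions whose correct normalized symptom appears among the top k ranked candidates, and MeanRank is the average 1-based rank of the correct candidate. The function names and the handling of a missing correct term are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def gold_rank(ranked: Sequence[str], gold: str) -> int:
    """Return the 1-based position of the correct term in a ranked candidate list."""
    # Assumption: if the correct term was filtered out of the candidate set,
    # it is ranked one position past the end of the list.
    if gold in ranked:
        return list(ranked).index(gold) + 1
    return len(ranked) + 1


def hit_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct term is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_rank(ranks: Sequence[int]) -> float:
    """Average rank of the correct term over all queries (lower is better)."""
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    # Hypothetical toy data: (candidates sorted by matching score, correct term).
    toy = [
        (["headache", "dizziness", "fever"], "headache"),
        (["cough", "fever", "chills"], "fever"),
        (["insomnia", "fatigue", "palpitation", "nausea"], "nausea"),
        (["thirst", "dry mouth"], "dry mouth"),
    ]
    ranks = [gold_rank(candidates, gold) for candidates, gold in toy]
    for k in (1, 3, 10):
        print(f"Hit@{k} = {hit_at_k(ranks, k):.3f}")
    print(f"MeanRank = {mean_rank(ranks):.3f}")
```

Under these definitions, a Hit@10 of 0.993 would mean the correct normalized symptom falls outside the top ten candidates for fewer than 1% of descriptions, and a MeanRank of 1.596 would mean it sits, on average, between the first and second positions.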