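Haplotype
Haplogroup
Computer science
SNP
Forensic science
Algorithm
Artificial intelligence
Computational biology
Genetics
Machine learning
Pattern recognition (psychology)
Biology
Single-nucleotide polymorphism
Genotype
Gene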
Authors
Mengyuan Song,Chenxi Zhao,Zheng Wang,Yiping Hou
Identifier
DOI:10.1016/j.fsigss.2019.10.120
Abstract
Y-chromosome single nucleotide polymorphisms (Y-SNPs) have a lower mutation rate than Y-chromosome short tandem repeats (Y-STRs) and are thus more informative in paternal lineage identification. Here we present a case involving the personal identification of an unidentified cadaver, in which machine learning methods were used to predict the Y-SNP haplogroup from the Y-STR haplotype. Two possible haplotypes from two different male lineages were found after searching national Y-STR databases. Six methods (k-Nearest Neighbor, Naive Bayesian Model, Logistic Regression, Support Vector Machine, Decision Tree, and Random Forest) were used to predict the haplogroup based on the Y-STR haplotype. The two haplotypes were assigned to two different haplogroups, O2a2b1a2a1 and O2a2b1a2a1a3. The predicted results were further verified by Y-SNP genotyping. This indicates that the mismatch between the two haplotypes may not have originated from mutation but may instead reflect different lineages. In this case, machine learning algorithms, especially Support Vector Machine and Random Forest, show the potential to discriminate between different lineages.
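To make the prediction step concrete, the following is a minimal sketch of one of the six named methods, k-Nearest Neighbor, applied to Y-STR haplotypes. It is not the authors' implementation. The reference haplotypes, the number of loci, the placeholder labels `HG_A` and `HG_B`, the step-wise distance, and the function names `step_distance` and `predict_haplogroup` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy reference set. Each haplotype is a tuple of repeat counts
# at the same ordered Y-STR loci. Values and labels are invented, not real data.
REFERENCE = [
    ((14, 12, 23, 10), "HG_A"),
    ((14, 12, 24, 10), "HG_A"),
    ((15, 12, 23, 10), "HG_A"),
    ((16, 13, 25, 11), "HG_B"),
    ((16, 14, 25, 11), "HG_B"),
    ((17, 13, 25, 12), "HG_B"),
]


def step_distance(a, b):
    """Sum of absolute repeat-count differences across loci (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same loci")
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_haplogroup(query, reference, k=3):
    """Return the majority haplogroup label among the k nearest haplotypes."""
    ranked = sorted(reference, key=lambda item: step_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Two query haplotypes that differ at several loci.
    print(predict_haplogroup((14, 12, 23, 11), REFERENCE))
    print(predict_haplogroup((16, 13, 26, 11), REFERENCE))
```

In this toy setting, two query haplotypes that fall nearest to different labelled clusters receive different haplogroup labels, which mirrors the reasoning in the abstract: a mismatch explained by lineage rather than by mutation. A real analysis would need a large reference panel of typed samples and, as in the case report, confirmation by Y-SNP genotyping.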