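Chromosome
Benchmarking
Computer science
Karyotype
Artificial intelligence
Cytogenetics
Chromosome analysis
Data mining
Biology
Genetics
Gene
Marketing
Business
Authors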
Chengchuang Lin,Hanbiao Chen,Jie‐Sheng Huang,Jing Peng,Li Guo,Zhirong Yang,Jiahua Du,Shuangyin Li,Aihua Yin,Gansen Zhao
Identifier
DOI:10.1016/j.compbiolchem.2022.107731
Abstract
Chromosome karyotyping analysis is a vital cytogenetics technique for diagnosing genetic and congenital malformations, analyzing gestational and implantation failures, and related applications. Chromosome classification, an essential stage of karyotype analysis, is a highly time-consuming, tedious, and error-prone task that requires a large amount of manual work by experienced cytogenetics experts. Many deep learning-based methods have been proposed to address chromosome classification. However, two challenges remain in current methods. First, most existing methods were developed on different private datasets, which makes them difficult to compare with each other on a common basis. Second, because most existing methods omit the details needed to reproduce them, they are difficult to apply widely in clinical chromosome classification. To address these challenges, this work builds and publishes a massive clinical dataset. The dataset enables benchmarking and the building of chromosome classification baselines suitable for different scenarios. It consists of 126,453 privacy-preserving G-band chromosome instances from 2763 karyotypes of 408 individuals. To the best of our knowledge, this is the first work to collect, annotate, and release a publicly available clinical chromosome classification dataset with more than 120,000 instances. The experimental results show that the proposed dataset improves the performance of existing chromosome classification models to varying degrees, with the largest accuracy improvement being 5.39 percentage points. Moreover, the best baseline achieves state-of-the-art classification performance with 99.33% accuracy. The clinical dataset and state-of-the-art baselines can be found at https://github.com/CloudDataLab/BenchmarkForChromosomeClassification.