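Discriminative
Convolutional neural network
Computer science
Encoding
Artificial intelligence
Pattern recognition (psychology)
Encoding (memory)
Feature extraction
Divergence (linguistics)
Deep learning
Feature (linguistics)
Promoter
Computational biology
Gene
Biology
Genetics
Gene expression
Philosophy
Linguistics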
Authors
Wenxuan Xu, Lin Zhu, De-Shuang Huang
Identifier
DOI:10.1109/tnb.2019.2891239
Abstract
Efficient feature extraction from human promoters remains a major challenge in genome analysis; solving it would improve our understanding of human gene regulation and help guide experiments. Although many machine learning algorithms have been developed for eukaryotic gene recognition, their performance on promoters is unsatisfactory owing to the diverse nature of promoters. To extract discriminative features from human promoters, an efficient deep convolutional divergence encoding method (DCDE) is proposed, based on statistical divergence (SD) and a convolutional neural network (CNN). SD helps optimize k-mer feature extraction for human promoters, and a CNN can automatically extract features in gene analysis. In DCDE, we first perform informative k-mer settlement to encode the original gene sequences: a series of SD methods optimizes the distributions of the most discriminative k-mers while maintaining important positional information. Then, a CNN extracts lower-dimensional deep features through a secondary encoding. Finally, we construct a hybrid recognition architecture with multiple support vector machines and a bilayer decision method. The architecture is flexible enough to accommodate new features or new models and can be extended to identify other genomic functional elements. Extensive experiments demonstrate that DCDE is effective in promoter encoding and can significantly improve the performance of promoter recognition.
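The abstract does not say which divergence measures or k-mer sizes DCDE uses, so the following is only a minimal sketch of the general idea of divergence-guided k-mer selection, not the authors' method. It assumes Kullback-Leibler divergence as the statistical divergence, add-one smoothing, and hypothetical helper names (`kmer_frequencies`, `kl_contributions`, `top_kmers`, `encode`). It also ignores the positional information that DCDE is said to preserve.

```python
import math
from collections import Counter
from itertools import product


def kmer_frequencies(seqs, k, pseudocount=1.0):
    """Smoothed relative frequencies of all 4**k DNA k-mers over a set of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    alphabet = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (e.g. N) are simply left out of the total.
    total = sum(counts[m] for m in alphabet) + pseudocount * len(alphabet)
    return {m: (counts[m] + pseudocount) / total for m in alphabet}


def kl_contributions(p, q):
    """Per-k-mer terms p * log(p / q) of the KL divergence D(p || q)."""
    return {m: p[m] * math.log(p[m] / q[m]) for m in p}


def top_kmers(positives, negatives, k, n):
    """Rank k-mers by their contribution to D(positives || negatives); keep the top n."""
    p = kmer_frequencies(positives, k)
    q = kmer_frequencies(negatives, k)
    contrib = kl_contributions(p, q)
    return sorted(contrib, key=contrib.get, reverse=True)[:n]


def encode(seq, kmers):
    """Encode a sequence as the counts of the selected k-mers (assumed encoding)."""
    k = len(kmers[0])
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[m] for m in kmers]


# Toy data for illustration only; these are not real promoter sequences.
promoter_like = ["TATAAATATA", "GGTATAAAGG"]
background = ["GCGCGCGCGC", "CCGGCCGGCC"]
selected = top_kmers(promoter_like, background, k=2, n=3)
features = encode("TATAGC", selected)
```

A vector such as `features` could then be passed to a downstream learner; in the paper that role is played by a CNN followed by multiple support vector machines, whose details the abstract does not give.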