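Chromatin
Transcription factor
Computational biology
Regulatory sequence
Sequence (biology)
Computer science
Syntax
Biology
Genetics
Artificial intelligence
Gene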
Authors
Anusri Pampari, Anna Shcherbina, Evgeny Z. Kvon, Michael Kosicki, Surag Nair, Soumya Kundu, Arwa S. Kathiria, Viviana I. Risca, Kristiina Kuningas, Kaur Alasoo, William J. Greenleaf, L Pennacchio, Anshul Kundaje
Identifier
DOI:10.1101/2024.12.25.630221
Abstract
Despite extensive mapping of cis-regulatory elements (cREs) across cellular contexts with chromatin accessibility assays, the sequence syntax and genetic variants that regulate transcription factor (TF) binding and chromatin accessibility at context-specific cREs remain elusive. We introduce ChromBPNet, a deep learning DNA sequence model of base-resolution accessibility profiles that detects, learns and deconvolves assay-specific enzyme biases from regulatory sequence determinants of accessibility, enabling robust discovery of compact TF motif lexicons, cooperative motif syntax and precision footprints across assays and sequencing depths. Extensive benchmarks show that ChromBPNet, despite its lightweight design, is competitive with much larger contemporary models at predicting variant effects on chromatin accessibility, pioneer TF binding and reporter activity across assays, cell contexts and ancestry, while providing interpretation of disrupted regulatory syntax. ChromBPNet also helps prioritize and interpret regulatory variants that influence complex traits and rare diseases, thereby providing a powerful lens to decode regulatory DNA and genetic variation.