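DNA sequencing
Computational biology
Massively parallel
Statistical model
Biology
Computer science
Massively parallel sequencing
Bayesian probability
Throughput
DNA
Genetics
Artificial intelligence
Parallel computing
Telecommunications
Wireless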
Authors
Andrew R. Ghazi,Xianguo Kong,Ed Chen,Leonard C. Edelstein,Chad A. Shaw
Abstract
Next-generation sequencing (NGS) studies have uncovered an ever-growing catalog of human variation while leaving an enormous gap between observed variation and experimental characterization of variant function. High-throughput screens powered by NGS have greatly increased the rate of variant functionalization, but the development of comprehensive statistical methods to analyze screen data has lagged behind. In the massively parallel reporter assay (MPRA), short barcodes are counted by sequencing both the DNA libraries transfected into cells and the output RNA, in order to simultaneously measure the shifts in transcription induced by thousands of genetic variants. These counts present many statistical challenges, including over-dispersion, depth dependence, and uncertain DNA concentrations. So far, the statistical methods used have been rudimentary, applying transformations to count-level data and disregarding experimental and technical structure while failing to quantify uncertainty in the statistical model.
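To make two of the challenges named above concrete, the following is a minimal sketch of how over-dispersion and depth dependence can appear in barcode counts. It is not the authors' model: it assumes counts follow a gamma-Poisson (negative binomial) mixture, and the names `sample_barcode_count` and `simulate_counts`, as well as the values chosen for `depth`, `proportion`, and `dispersion`, are hypothetical and picked only for illustration.

```python
import math
import random
import statistics


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the moderate rates used in this illustration.
    """
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_barcode_count(depth, proportion, dispersion, rng):
    """Draw one over-dispersed count (assumed gamma-Poisson mixture).

    Expected count = depth * proportion (depth dependence).
    Variance = mean + mean**2 / dispersion (over-dispersion).
    """
    mean = depth * proportion
    # Gamma(shape=dispersion, scale=mean/dispersion) has expectation `mean`.
    rate = rng.gammavariate(dispersion, mean / dispersion)
    return sample_poisson(rate, rng)


def simulate_counts(depth, n_barcodes, proportion=2e-4, dispersion=2.0, seed=0):
    """Simulate counts for many barcodes; return (sample mean, sample variance)."""
    rng = random.Random(seed)
    counts = [
        sample_barcode_count(depth, proportion, dispersion, rng)
        for _ in range(n_barcodes)
    ]
    return statistics.mean(counts), statistics.variance(counts)


if __name__ == "__main__":
    for depth in (100_000, 200_000):
        mean, variance = simulate_counts(depth, n_barcodes=5000)
        # A plain Poisson model would give variance roughly equal to the mean.
        print(f"depth={depth}: mean={mean:.1f}, variance={variance:.1f}")
```

Under these assumptions the sample variance far exceeds the sample mean, and doubling the sequencing depth roughly doubles the expected count, which is why raw counts from libraries sequenced at different depths are not directly comparable.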