Introduction of the Python script STRinNGS for analysis of STR regions in FASTQ or BAM files and expansion of the Danish STR sequence database to 11 STRs

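Keywords: Biology; Genetics; Microsatellite; Locus (genetics); Sequence database; Massively parallel sequencing; STR multiplex system; DNA sequencing; Allele; Gene
Authors
Susanne Lunøe Friis, Anders Buchard, Eszter Rockenbauer, Claus Børsting, Niels Morling
Source
Journal: Forensic Science International-genetics [Elsevier BV]
Volume: 21, pages 68-75; cited by: 35
Identifier
DOI: 10.1016/j.fsigen.2015.12.006
Abstract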

This work introduces the in-house developed Python application STRinNGS for analysis of STR sequence elements in BAM or FASTQ files. STRinNGS identifies sequence reads with STR loci by their flanking sequences, analyses the STR sequence and the flanking regions, and generates a report with the assigned SNP-STR alleles. The main output file from STRinNGS contains all sequences with read counts above 1% of the total number of reads per locus. STR sequences are automatically named according to the nomenclature used previously and according to the repeat unit definitions in STRBase (http://www.cstl.nist.gov/strbase/). The sequences are named with (1) the locus name, (2) the length of the repeat region divided by the length of the repeat unit, (3) the sequence(s) of the repeat unit(s) followed by the number of repeats and (4) variations in the flanking regions. Lower case letters in the main output file are used to flag sequences with previously unknown variations in the STRs. SNPs in the flanking regions are named by their "rs" numbers and the nucleotides in the SNP position. Data from 207 Danes, sequenced with the Ion Torrent™ HID STR 10-plex that amplified nine STRs (CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D16S539, TH01, TPOX, vWA) and Amelogenin, were analysed with STRinNGS. Sequencing uncovered five common SNPs near four STRs and revealed 20 new alleles in the 207 Danes. Three short homopolymers in the D8S1179 flanking regions caused frequent sequencing errors. In 29 of 3726 allele calls (0.8%), sequences with homopolymer errors were falsely assigned as true alleles. An in-house developed script in R compensated for these errors by compiling sequence reads that had identical STR sequences and identical nucleotides in the five common SNPs. In the output file from the R script, all SNP-STR haplotype calls were correct.
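The flank-based extraction, the naming parts (1) to (3), and the 1% read-count threshold described above can be illustrated with a minimal Python sketch. This is not the STRinNGS source code: the function names, the toy flank sequences, the locus label `TOY`, and the exact bracket format of the allele name are assumptions made for illustration. The repeat region is cut naively into fixed-length units from its first base, and flanking-region variants (part 4) are left out.

```python
from itertools import groupby


def extract_repeat_region(read, flank5, flank3):
    """Return the sequence between the two flanks, or None if a flank is missing."""
    start = read.find(flank5)
    if start == -1:
        return None
    start += len(flank5)
    end = read.find(flank3, start)
    if end == -1:
        return None
    return read[start:end]


def name_allele(locus, region, unit_len=4):
    """Build a name from locus, region length / unit length, and unit run lengths.

    Hypothetical format: LOCUS[allele]UNIT[n]UNIT[n]...
    A region that is not a multiple of unit_len is written as whole.rest,
    e.g. 2 full units plus 3 bases gives "2.3".
    """
    whole, rest = divmod(len(region), unit_len)
    allele = str(whole) if rest == 0 else f"{whole}.{rest}"
    # Naive chunking: fixed-length units starting at the first base of the region.
    chunks = [region[i:i + unit_len] for i in range(0, len(region), unit_len)]
    runs = "".join(f"{unit}[{sum(1 for _ in grp)}]" for unit, grp in groupby(chunks))
    return f"{locus}[{allele}]{runs}"


def filter_by_fraction(counts, min_fraction=0.01):
    """Keep sequences whose read count is above min_fraction of the locus total."""
    total = sum(counts.values())
    return {seq: n for seq, n in counts.items() if n > min_fraction * total}


# Toy read: made-up flanks "AAGG" and "CCTT" around a compound repeat.
read = "AAGG" + "TCTA" * 2 + "TCTG" + "TCTA" * 3 + "CCTT"
region = extract_repeat_region(read, "AAGG", "CCTT")
print(name_allele("TOY", region))  # TOY[6]TCTA[2]TCTG[1]TCTA[3]
```

Applied to per-locus read counts, `filter_by_fraction({"seqA": 980, "seqB": 15, "seqC": 5})` keeps `seqA` and `seqB` and drops `seqC`, because 5 reads is not above 1% of the 1000 reads at that locus.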
The 207 samples and six additional samples were sequenced for D3S1358, D12S391, and D21S11 using the 454 GS Junior platform in this and a previous work. Overall, next generation sequencing (NGS) of the 11 STRs lowered the mean match probability 386 times and increased the typical paternity indexes (i.e. the geometric mean) for trios and duos 47 and 23 times, respectively, compared to the traditional PCR-CE typing of the same population.
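The homopolymer-error compensation mentioned in the abstract compiles reads that share the same STR sequence and the same nucleotides at the common SNPs. The authors implemented this in R; the sketch below uses Python instead and only shows the grouping idea. The input layout (tuples of STR sequence, SNP calls, remaining flank sequence, read count) and the function name are assumptions, not the authors' data format.

```python
from collections import defaultdict


def collapse_by_haplotype(observations):
    """Sum read counts over sequences that share STR sequence and SNP nucleotides.

    observations: iterable of (str_sequence, snp_calls, flank_sequence, read_count).
    The rest of the flank is ignored, so reads that differ only by a
    homopolymer-length error in the flank are merged into one haplotype.
    """
    totals = defaultdict(int)
    for str_seq, snp_calls, _flank, count in observations:
        totals[(str_seq, tuple(snp_calls))] += count
    return dict(totals)


# Toy data: the second entry differs from the first only by a shortened
# homopolymer ("GGGA" instead of "GGGGA") in the flank.
observations = [
    ("TCTA" * 3, ("A",), "GGGGA", 500),
    ("TCTA" * 3, ("A",), "GGGA", 40),
    ("TCTA" * 4, ("G",), "GGGGA", 460),
]
haplotypes = collapse_by_haplotype(observations)
print(haplotypes[("TCTATCTATCTA", ("A",))])  # 540
```

Grouping on the STR sequence plus the SNP calls keeps the information that defines a SNP-STR haplotype and discards flank positions that are prone to homopolymer errors.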