UTexas Aptamer Dataset

Aptamer, Computational biology, Computer science, Biology, Genetics

Authors

Ali Askari, Sumedha Kota, Hailey Ferrell, Shriya Swamy, Kayla S. Goodman, Christine C. Okoro, Isaiah C. Spruell Crenshaw, Daniela K. Hernandez, Taylor E. Oliphant, Akshata A. Badrayani, Andrew D. Ellington, Gwendolyn M. Stovall

Source

Journal: CERN European Organization for Nuclear Research - Zenodo [European Organization for Nuclear Research]
Date: 2023-08-19

Links

zenodo.org, datacite.org, doi.org

Identifiers

DOI: 10.5281/zenodo.8264921

Abstract

The deposited dataset is a snapshot of the data in the active and growing UTexas Aptamer Database, https://sites.utexas.edu/aptamerdatabase/. This dataset is a collection of aptamer data that has been extracted from the literature every year since the inception of aptamer selections and includes multiple aptamer sequences from a given paper (as opposed to just sequences with the tightest binding). In all, the collection includes 1,415 aptamer sequences from 489 papers published over the last few decades (1990-2022). Since our dataset includes multiple sequences that emerged from a given selection experiment, it of necessity includes sequences that may not have been individually tested for binding activity, similar to the inclusion of all rRNA sequences in a metagenomic analysis of an environmental sample. By taking this metagenomic approach, we provide informaticians with a much wider range of sequences for subsequent analysis while still providing tools to find high-affinity aptamers for future use.

For each aptamer sequence, the dataset includes information about the aptamer publication (i.e., year of publication, DOI, full citation, and corresponding author(s)), the aptamer target, as well as the following information about the specific aptamer: nucleic acid composition, name assigned in the original publication, sequence, GC percentage, sequence length, binding affinity (K_d), binding/selection buffer, application as quoted in the referenced paper (e.g., drug delivery, biosensor, etc.), original nucleic acid pool used in the aptamer selection, post-selection modifications (if any), additional information, and our internally assigned serial number. We used simple Excel formulas for each aptamer record to calculate the GC content and length of each aptamer sequence.

Version 1.1.0: Added 25+ aptamer sequences. Added the "Parent sequence serial number" data field/column. Fixed "Application as quoted in the referenced paper" data formatting/alignment error.
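The abstract states that GC content and sequence length were calculated with simple Excel formulas, but the formulas themselves are not given. The Python sketch below shows one plausible way to reproduce those two fields for a single sequence. The function names `gc_percent` and `sequence_length` are hypothetical, and the choice to count only the canonical bases A, C, G, T, and U (skipping spaces, dashes, or modification annotations) is an assumption, so results may differ from the dataset's own values for modified or annotated sequences.

```python
CANONICAL_BASES = set("ACGTU")  # assumption: only these characters count as bases


def sequence_length(sequence: str) -> int:
    """Count canonical bases in a sequence, ignoring any other characters."""
    return sum(1 for base in sequence.upper() if base in CANONICAL_BASES)


def gc_percent(sequence: str) -> float:
    """Return the percentage (0-100) of canonical bases that are G or C.

    Returns 0.0 for a sequence with no canonical bases, to avoid
    dividing by zero.
    """
    length = sequence_length(sequence)
    if length == 0:
        return 0.0
    gc_count = sum(1 for base in sequence.upper() if base in "GC")
    return 100.0 * gc_count / length


if __name__ == "__main__":
    example = "GGAUCC"  # made-up RNA sequence, not taken from the dataset
    print(sequence_length(example), round(gc_percent(example), 2))
```

Comparing values computed this way against the dataset's "GC percentage" and "sequence length" columns is one way to check how the original formulas treated non-standard characters.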
