An overview of publicly available patient-centered prostate cancer datasets

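Computer science, Data science, Internet, Big data, Data mining, Information retrieval, Database, World Wide Web
Author
Tim Hulsen
Source
Journal: Translational Andrology and Urology [AME Publishing Company]
Volume (Issue): 8 (S1): S64-S77 Cited by: 21
Identifier
DOI: 10.21037/tau.2019.03.01
Abstract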

Prostate cancer (PCa) is the second most common cancer in men, and the second leading cause of death from cancer in men. Many studies on PCa have been carried out, each requiring considerable time before the data are collected and ready to be analyzed. However, a wide range of PCa datasets is already available online, which could be used for data mining, predictive modelling or other purposes, reducing the need to set up new studies to collect data. In the current scientific climate, which is moving increasingly toward the analysis of "big data" and large, international, multi-site projects using a modern IT infrastructure, these datasets could prove extremely valuable. This review presents an overview of publicly available patient-centered PCa datasets, divided into three categories (clinical, genomics and imaging) and an "overall" section, to enable researchers to select a suitable dataset for analysis without spending days searching for the right data. To compile a list of human PCa databases, scientific literature databases and academic social network sites were searched. We also used information from other reviews. All databases in the combined list were then checked for public availability. Only databases that were either directly publicly available, or available after signing a research data agreement or obtaining a free login, were selected for inclusion in this review. The data also had to be available to commercial parties. This paper focuses on patient-centered data, so the genomics data section does not include gene-centered or pathway-centered databases. We identified 42 publicly available, patient-centered PCa datasets. Some of these consist of several smaller datasets. Some of them combine data from the three data domains: clinical data, imaging data and genomics data. Only one dataset contains information from all three domains.
This review presents all datasets and their characteristics: number of subjects, clinical fields, imaging modalities, expression data, mutation data, biomarker measurements, etc. Although every effort has been made to make this overview of publicly available databases as extensive as possible, it is very likely incomplete, and it will soon be outdated. Nevertheless, this review might help many PCa researchers find suitable datasets to answer their research questions, without the need to start a new data collection project. In the coming era of big data analysis, overviews like this are becoming increasingly useful.
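
The inclusion criteria described in the abstract can be read as a simple filter over candidate databases. The following is a minimal Python sketch of that reading, not the author's actual procedure: the `Dataset` fields, the access labels, the example records, and the rule that multi-domain datasets fall under "overall" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Access routes the abstract accepts: direct public access, a signed
# research data agreement, or a free login. The labels are hypothetical.
ACCEPTED_ACCESS = {"open", "data_agreement", "free_login"}


@dataclass(frozen=True)
class Dataset:
    name: str               # hypothetical label, not a real dataset
    access: str             # how the data can be obtained
    commercial_use: bool    # available to commercial parties as well
    patient_centered: bool  # False for gene- or pathway-centered resources
    domains: frozenset      # subset of {"clinical", "genomics", "imaging"}


def is_eligible(d: Dataset) -> bool:
    """Apply the three inclusion criteria stated in the abstract."""
    return d.access in ACCEPTED_ACCESS and d.commercial_use and d.patient_centered


def category(d: Dataset) -> str:
    """Assumption: single-domain datasets are filed under that domain,
    anything else under 'overall'."""
    return next(iter(d.domains)) if len(d.domains) == 1 else "overall"


# Made-up candidates, only to exercise the filter.
candidates = [
    Dataset("example-clinical", "open", True, True, frozenset({"clinical"})),
    Dataset("example-paywalled", "paid", True, True, frozenset({"imaging"})),
    Dataset("example-gene-db", "open", True, False, frozenset({"genomics"})),
    Dataset("example-multi", "free_login", True, True,
            frozenset({"clinical", "genomics", "imaging"})),
]

selected = [d for d in candidates if is_eligible(d)]
for d in selected:
    print(d.name, category(d))
```

Under these assumptions, the paywalled and gene-centered examples are excluded, and the remaining two are filed under "clinical" and "overall" respectively.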
