The Molecule Cloud - compact visualization of large collections of molecules

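Computer science, Data mining, Tag cloud, Set (abstract data type), Data visualization, Interactive visualization, Visualization, Information retrieval, Cheminformatics, Computational science, Scalability, Cluster analysis, Molecule, Cloud computing, Grid, Theoretical computer science
Authors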
Peter Ertl,Bernhard Rohde
Source
Journal: Journal of Cheminformatics [Springer Nature]
Volume (issue): 4 (1) Citations: 29
Identifier
DOI:10.1186/1758-2946-4-12
Abstract

Analysis and visualization of large collections of molecules is one of the most frequent challenges facing cheminformatics experts in the pharmaceutical industry. Various sophisticated methods are available to perform this task, including clustering, dimensionality reduction and scaffold frequency analysis. In any case, however, viewing and analyzing large tables of molecular structures is necessary. We present a new visualization technique that provides basic information about the composition of molecular data sets at a single glance.

A method is presented here that allows visual representation of the most common structural features of chemical databases in the form of a cloud diagram. The frequency of molecules containing a particular substructure is indicated by the size of the respective structural image. The method is useful for quickly perceiving the most prominent structural features present in a data set. This approach was inspired by the popular word cloud diagrams that are used to visualize textual information in a compact form. We therefore call this approach "Molecule Cloud". The method also supports visualization of additional information by color coding, for example the biological activity of molecules containing a given scaffold or the protein target class typical for particular scaffolds. A detailed description of the algorithm is provided, allowing easy implementation of the method with any cheminformatics toolkit. The layout algorithm is available as open source Java code.

Visualization of large molecular data sets using the Molecule Cloud approach allows scientists to easily get information about the composition of molecular databases and their most frequent structural features. The method may be used in areas where analysis of large molecular collections is needed, for example the processing of high-throughput screening results, virtual screening or compound purchasing.

Several example visualizations of large data sets, including the PubChem, ChEMBL and ZINC databases, using Molecule Cloud diagrams are provided.
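
The core idea in the abstract, that more frequent substructures are drawn as larger images, can be illustrated with a minimal Python sketch. This is not the authors' Java layout code: the function name `scaffold_image_sizes`, the pixel bounds and the square-root scaling are all assumptions chosen for illustration, and the scaling used in the paper may differ. The input is assumed to be one scaffold identifier (here a SMILES string) per molecule, already computed by some cheminformatics toolkit.

```python
import math
from collections import Counter


def scaffold_image_sizes(scaffolds, top_n=5, min_px=40, max_px=200):
    """Map the top_n most frequent scaffolds to an image edge length in pixels.

    Hypothetical helper: square-root scaling is an assumption, chosen so
    that image area is roughly proportional to scaffold frequency.
    """
    counts = Counter(scaffolds).most_common(top_n)
    if not counts:
        return {}
    top_count = counts[0][1]  # frequency of the most common scaffold
    sizes = {}
    for scaffold, n in counts:
        # Edge length relative to the most frequent scaffold.
        edge = round(max_px * math.sqrt(n / top_count))
        # Clamp so that rare scaffolds stay legible.
        sizes[scaffold] = max(min_px, edge)
    return sizes


# Toy data set: one scaffold SMILES per molecule (made-up frequencies).
scaffolds = ["c1ccccc1"] * 100 + ["c1ccncc1"] * 25 + ["C1CCCCC1"] * 1
sizes = scaffold_image_sizes(scaffolds)
for smiles, px in sizes.items():
    print(smiles, px)
```

A layout step would then place the images, largest first, so that they do not overlap. That part is omitted here because the abstract does not describe it in enough detail to sketch faithfully.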