Recent advances of data‐independent acquisition mass spectrometry‐based proteomics

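Bottom-up proteomics, Mass spectrometry, Proteomics, Proteome, Tandem mass spectrometry, Isobaric labeling, Chemistry, Tandem mass tag, Chromatography, Peptide mass fingerprinting, Sample preparation, Top-down proteomics, Quantitative proteomics, Fragmentation (computing), Protein mass spectrometry, Computational biology, Biology, Biochemistry, Gene, Ecology
Authors
Tiannan Guo, Ruedi Aebersold
Source
Journal: Proteomics [Wiley]
Volume (issue): 23 (7-8) Citations: 8
Identifier
DOI: 10.1002/pmic.202200011
Abstract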

Bottom-up proteomics is a mass spectrometry-based method to analyze the contents of complex protein samples. Pioneered in the 1990s, it consists of converting protein samples into peptide samples by enzymatic digestion, separating the peptides by (typically) reverse-phase liquid chromatography (LC), and analyzing the eluting peptides by tandem mass spectrometry. This general approach, while tremendously successful and widely used, has faced from the beginning a fundamental challenge: the number of peptides generated by the digestion of a complex protein sample, such as a cell extract or body fluid, is significantly larger than the number of peptides expected from applying the tryptic digestion rule [1]. In fact, the number of peptides expected from a proteome is presently unknown. Ironically, while genes and transcripts can be comprehensively sequenced and characterized, the exact number of protein types and their cellular copy numbers in any biomedical sample remain unknown. The challenge of addressing this fundamental issue has spawned a large number of strategies for mass spectrometric data acquisition and analysis.
Two major bottom-up proteomics approaches have been developed. Data-dependent acquisition (DDA) essentially prioritizes peptide precursors based on their signal intensity in a precursor ion scan in the mass spectrometer, and then sequentially selects a number of precursors for fragmentation, generating MS2 spectra. This is a well-established MS method, which gains sample throughput when coupled with stable isotope labeling of the peptides using, for example, TMTpro. However, since the number of peptide precursors is substantially larger than the number of fragment ion spectra a mass spectrometer can acquire, only a limited number of peptide precursors can be analyzed, leaving a varying and unknown portion of the proteome uncharacterized in each DDA data acquisition.
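The intensity-based precursor selection described for DDA can be illustrated with a minimal sketch. It assumes a simple "top-N" rule applied to a single precursor scan; the function name `select_top_n`, the example m/z values, and the intensities are hypothetical and chosen only for illustration. Real instruments apply further logic, such as dynamic exclusion, which is omitted here.

```python
from typing import List, Tuple

# A precursor is represented as (m/z, intensity); this layout is an assumption.
Precursor = Tuple[float, float]


def select_top_n(ms1_scan: List[Precursor], top_n: int) -> List[Precursor]:
    """Return the top_n most intense precursors from one precursor (MS1) scan."""
    ranked = sorted(ms1_scan, key=lambda p: p[1], reverse=True)
    return ranked[:top_n]


# Hypothetical survey scan with six co-eluting precursors.
scan = [(402.2, 9.0e5), (455.7, 1.2e4), (512.3, 3.4e6),
        (610.8, 7.5e3), (733.4, 2.1e5), (845.9, 5.0e3)]

# Only three fragment spectra fit into this (hypothetical) cycle.
selected = select_top_n(scan, top_n=3)
missed = [p for p in scan if p not in selected]

print("fragmented:", [mz for mz, _ in selected])
print("not sampled:", [mz for mz, _ in missed])
```

In this toy scan, half of the precursors never receive an MS2 spectrum, which is the undersampling effect in miniature: whatever falls below the intensity cut in a given cycle goes uncharacterized.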
This undersampling issue becomes more pronounced when the LC gradient is shortened to maximize sample throughput. It is therefore unlikely that DDA data acquisition, even with extensive sample fractionation and extremely long LC gradients, will overcome this fundamental undersampling issue.
Another emerging and widely adopted approach for bottom-up proteomics data acquisition is data-independent acquisition (DIA). DIA bins the peptide precursors into predefined groups based on their m/z values, fragments each group (also called a "window") of peptide precursors sequentially, and records the highly convoluted MS2 spectrum of the fragments and unfragmented precursors in each window [2]. This method essentially generates a comprehensive digital map of all the flyable and fragmentable peptide precursors of a proteome. Therefore, in contrast to DDA, which is inherently limited by the undersampling issue, it is theoretically possible to identify every protein in a proteome from a digital proteome map generated by DIA.
Various computational methods have been developed to analyze data acquired by DIA. They can be grouped conceptually into peptide-centric and spectrum-centric approaches, following the terminology of MacCoss and colleagues [3]. With the spectrum-centric approach, each tandem mass spectrum is interpreted by searching against a theoretical or experimental protein sequence database and a matched decoy database. This approach is usually used for DDA data. In principle, it can also be applied to the highly convoluted DIA data, but DIA data is most effectively interpreted with the peptide-centric approach, which basically asks the question: is a peptide of interest present in the data? Briefly, the peptide-centric approach first compiles the characteristics of a peptide precursor of interest (including the m/z of peptide precursors and fragments, retention time, and elution profiles, among others) into a data table (e.g., a reference spectral library), and then tries to find this pattern in the DIA data using statistical and machine learning algorithms [4]. In principle, the combination of DIA data acquisition and a peptide-centric data analysis strategy allows analysis of every protein that is analyzable in a proteome within the limits of the analytical techniques used.
Since 2010, over 1000 studies using DIA have been published. This special issue features some of the latest advances in the field. Penny et al. reported a gas phase fractionation acquisition scheme called ion mobility (IM)-gas phase fractionation (GPF), or IM-GPF, for rapid diaPASEF library generation [5]. Most DIA analyses are performed in single injections, even for complex samples. The elimination of extensive sample fractionation not only minimizes technical variability and the required sample amount, but also substantially increases sample throughput. In this issue, Bons et al. applied DIA to study small amounts of extracellular matrix from lung cancer tissue specimens [6], while Wang et al. analyzed enriched glycoproteins in urine samples from prostate cancer patients [7]. Kverneland et al. developed a simple ultracentrifugation protocol for the enrichment of extracellular vesicles from plasma samples, enabling characterization of over 2500 plasma proteins with DIA runs of less than 1 h [8]. These three applications exemplify the superb sensitivity and comprehensiveness of DIA-MS for analyzing a specific subproteome. Oliinyk et al. reported that only 1 h of MS time using dia-PASEF sufficed to characterize over 13,000 phosphopeptides from about 20 µg of protein digest, and that shortening the gradient by a factor of 4 led to similar coverage of the phosphoproteome [9]. Applications that require both high sensitivity and high throughput are currently only practical with DIA-MS. Messner et al. argued that perturbation proteomics is an essential approach to study highly dynamic biological systems and that short-gradient DIA coupled with fast LC systems is the method of choice for such applications [10].
In addition, since DIA basically acquires peptide precursor and fragment data for all flyable and fragmentable ions, it has unique advantages for comprehensive analysis of protein PTMs. Yang et al. reviewed the current status of DIA-based PTM detection, site localization, and characterization of glycans [11]. In particular, they reviewed the contribution of deep learning to DIA library generation and data interpretation. Pham et al. reported the use of the transformer architecture, an emerging deep learning approach, for retention time prediction, which exhibited superior performance compared to multiple existing deep learning software tools [12].
DIA is effective in analyzing the proteome of small amounts of clinical tissue specimens at high throughput with a high degree of reproducibility and has been widely used in large cohorts [13]. Encouraged by the recent advances in DIA-based clinical proteomics, Boys et al. from ProCan asked the question: where are we in the context of (large-scale) clinical applications of MS-based proteomics? Even though DIA has been applied to identify disease diagnostic biomarkers and potential therapeutic targets, MS-based proteomics has not yet been widely implemented in routine clinical diagnostics. They discussed the hurdles and proposed actions to move forward, including integration of multiomics measurements and development of targeted proteomic assays [14]. Poulos et al., also from ProCan, discussed the potential application of DIA-MS from a different angle, namely drug discovery [15]. High-throughput DIA analyses can be applied to analyze the perturbation proteome in tumor cells. The resultant proteomic data could potentially be used for drug responsiveness prediction using machine learning.
Although we have not seen wide applications of MS-based proteomics in drug discovery, the potential is clear and indisputable. Toward clinical applications of DIA-based proteomics, standardization of sample preparation, MS data acquisition, data storage, and data analysis is essential. Indeed, although the volume of DIA data is growing, discussion of DIA data management largely lags behind. Here, Jones et al. [16] discussed the findability, accessibility, interoperability, and reusability (FAIR) of the increasing volume of DIA data, and proposed expert recommendations for the future. Several exciting research fields that are also driven by DIA-based proteomics, such as single-cell proteomics, spatial proteomics, and proteogenomics, are not included in this special issue but are actively advancing.
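The m/z binning step that distinguishes DIA from DDA, as described in the abstract above, can be sketched as follows. This is a minimal illustration that assumes contiguous, fixed-width, non-overlapping isolation windows; the function names `make_windows` and `assign_to_windows`, the 400 to 1000 m/z range, the 25 m/z width, and the precursor values are all hypothetical. Acquisition schemes used in practice may rely on variable-width or overlapping windows.

```python
from typing import Dict, List, Tuple

Window = Tuple[float, float]  # (lower m/z bound, upper m/z bound)


def make_windows(mz_start: float, mz_end: float, width: float) -> List[Window]:
    """Build contiguous fixed-width windows covering [mz_start, mz_end)."""
    windows: List[Window] = []
    low = mz_start
    while low < mz_end:
        high = min(low + width, mz_end)
        windows.append((low, high))
        low = high
    return windows


def assign_to_windows(precursor_mzs: List[float],
                      windows: List[Window]) -> Dict[Window, List[float]]:
    """Group precursors by the window whose half-open range contains them.

    Precursors outside every window are silently dropped in this sketch.
    """
    binned: Dict[Window, List[float]] = {w: [] for w in windows}
    for mz in precursor_mzs:
        for low, high in windows:
            if low <= mz < high:
                binned[(low, high)].append(mz)
                break
    return binned


# Hypothetical acquisition range and precursor m/z values.
windows = make_windows(400.0, 1000.0, 25.0)
precursors = [402.2, 455.7, 512.3, 610.8, 733.4, 845.9]
binned = assign_to_windows(precursors, windows)

# Every window is fragmented in turn, regardless of precursor intensity.
for window, members in binned.items():
    if members:
        print(window, members)
```

Unlike intensity-based selection, every precursor inside the acquisition range ends up in some window, so none is skipped; the cost is that each window's MS2 spectrum mixes fragments from all co-isolated precursors and has to be deconvoluted during analysis.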