Abstract
The expansion of publicly available resources and the adoption of electronic health records (EHRs) have enabled the use of AI methods for pharmacovigilance. Traditional methods for assessing preclinical safety, like quantitative structure–activity relationships (QSAR), are largely moving toward ensemble machine learning (ML) and deep learning (DL) approaches. Postmarketing pharmacovigilance relies on a variety of data sources such as molecular, chemoinformatic, and clinical databases, as well as social media and biomedical literature. DL-powered natural language processing (NLP) methods, including word embedding and attention mechanisms, are the techniques of choice to extract drug–adverse event (AE) relationships in text data.
Interventional pharmacology is one of medicine's most potent weapons against disease. The drugs it relies on, however, can cause damaging side effects and must be closely monitored. Pharmacovigilance is the field of science that monitors, detects, and prevents adverse drug reactions (ADRs). Safety efforts begin during the development process, using in vivo and in vitro studies, continue through clinical trials, and extend to postmarketing surveillance of ADRs in real-world populations. Future toxicity and safety challenges, including increased polypharmacy and patient diversity, stress the limits of these traditional tools. Massive amounts of newly available data present an opportunity for using artificial intelligence (AI) and machine learning to improve drug safety science. Here, we explore recent advances as applied to preclinical drug safety and postmarketing surveillance, with a specific focus on machine learning and deep learning (DL) approaches.
Glossary
Attention mechanism: a process that considers the elements of a sequence as a whole and learns a distribution that weighs their contextual importance.
AC50: concentration for half-maximal activity, as derived from the Hill equation. It is a common potency measure used in toxicity testing.
Area under the curve (AUC): area between the curve and the x axis. A heuristic used to evaluate the performance of classification models, with AUC = 1 indicating perfect classification.
Bayesian and frequentist: two different approaches to defining probabilities. Bayesians see probabilities as a representation of uncertainty, while frequentists see them as the long-term frequency of an event.
Bayesian nonparametrics: a class of methods in which the complexity of a model is defined by the data.
Convolutional neural networks (ConvNets): pioneered by LeCun et al. in 1990 [95. LeCun Y. et al. Handwritten digit recognition with a back-propagation network. In: Advances in Neural Information Processing Systems 2. Morgan Kaufmann, 1990: 396-404] in their first application for handwritten digit recognition. They consist of trained filters that perform convolutional products with the input data and learn an increasing number of high-level features. ConvNets have to learn comparatively fewer weights than fully connected neural networks, and are particularly efficient for computer vision applications.
Dimensionality: the number of attributes present in a dataset.
Deep learning (DL): a subfield of ML in which the algorithms learn abstractions of the input features that they use to make predictions. These algorithms are characterized by a higher capacity than classic ML techniques (i.e., they have more degrees of freedom). This is in essence the trade-off of DL: what we gain in capacity and automated feature engineering, we lose to a higher-dimensional parameter space that is more complex and time-consuming to explore.
Diffusion state distance (DSD): distance metric based on properties of graph diffusion, designed to capture distinctions between annotations in protein–protein interactions.
Embedding of Semantic Predications (ESP): a method to generate semantic vector representations of biomedical terms, inspired by skip-gram with negative sampling (SGNS). SGNS is an embedding method that uses neural networks to associate terms with their context in a corpus.
Genetic algorithm: a stochastic variable selection method that solves optimization problems by applying Darwinian hypotheses of evolution.
Gradient boosting: an approach used for classification and regression that builds a predictive model using a combination of individually weaker prediction models.
k-fold cross-validation: resampling procedure in which the data are split into k groups in order to estimate and assess model performance.
k-nearest neighbors: algorithm used for classification in which the data are separated into several classes to help predict the classification of a new data point.
Latent factor analysis: statistical method used to describe variability in observed and correlated variables in terms of unobserved variables called latent factors.
LD50: amount of an administered substance that kills 50% of a test sample.
Linear discriminant analysis: statistical ML technique that seeks to find a linear combination of features that separates two or more classes.
Machine learning (ML): a field of AI in which algorithms are trained to perform tasks and make predictions by learning directly from the data, without being explicitly programmed.
ML methods can broadly be classified into two classes based on how the algorithms learn to make predictions: supervised and unsupervised learning. In supervised learning, an algorithm learns the mapping between input variables and an output, such as a label. The goal is for the algorithm to predict the correct output when a new input is provided. In unsupervised learning, the input training data have no assigned labels. Here, the machine's goal is to learn representations of the input data that can be used for tasks such as predicting future inputs and decision-making, without a labeled output.
Naive Bayes: set of supervised learning algorithms based on Bayes' theorem, with the assumption of conditional independence between feature pairs.
Named entity recognition: a method that identifies tokens in unstructured text and maps them to concepts or categories in terminologies.
Natural language processing (NLP): subfield of computer science that aims at manipulating and making sense of natural language data.
Random forest: ensemble learning approach for classification and regression that constructs decision trees during training and outputs the class (classification) or mean prediction (regression) of the individual trees.
Regression analysis: statistical approach that seeks to find relationships between dependent variables and one or more independent variables. Multivariate regression estimates a single regression model with more than one outcome variable.
Self-controlled case series: method in epidemiological study design in which the subjects are their own control.
Undirected graph recursive neural networks (UGRNN): in chemoinformatics, a model where, for a molecule of N atoms, a multilayer perceptron recursively crawls through N different representations of this molecule. The vectors generated as a result are then averaged to compute a prediction for the molecule.
Word embedding: in NLP, one of the main challenges is finding a good representation for the vocabulary the corpus covers. While some methods simply encode tokens in binary vectors with a sparse representation, word embeddings learn a representation that considers the token's context.
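The contrast between sparse binary encodings and context-aware representations can be made concrete with a small sketch. The example below is illustrative only: the four-sentence corpus is hypothetical, the helper names (`one_hot`, `context_vector`, `cosine`) are invented here, and simple co-occurrence counts stand in for a learned embedding such as SGNS, which would instead train a neural network on a large corpus. It shows that one-hot vectors of two different tokens are always orthogonal, whereas vectors built from context can place tokens that appear in similar contexts close together.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy corpus, invented purely for illustration.
corpus = [
    "aspirin causes bleeding",
    "warfarin causes bleeding",
    "ibuprofen treats pain",
    "aspirin treats pain",
]
sentences = [line.split() for line in corpus]
vocab = sorted({tok for sent in sentences for tok in sent})
index = {tok: i for i, tok in enumerate(vocab)}


def one_hot(token):
    """Sparse binary encoding: a single 1 at the token's vocabulary index."""
    vec = [0.0] * len(vocab)
    vec[index[token]] = 1.0
    return vec


def context_vector(token, window=2):
    """Count-based stand-in for an embedding: tally neighbours of `token`."""
    counts = Counter()
    for sent in sentences:
        for i, tok in enumerate(sent):
            if tok != token:
                continue
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            counts.update(sent[j] for j in range(lo, hi) if j != i)
    return [float(counts[t]) for t in vocab]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


one_hot_sim = cosine(one_hot("warfarin"), one_hot("aspirin"))
context_sim = cosine(context_vector("warfarin"), context_vector("aspirin"))
unrelated_sim = cosine(context_vector("warfarin"), context_vector("ibuprofen"))
print(one_hot_sim, round(context_sim, 3), unrelated_sim)  # 0.0 0.707 0.0
```

In this toy setting, "warfarin" and "aspirin" share the context "causes bleeding" and so receive a nonzero similarity, while the one-hot encoding cannot express any relationship between them. A trained embedding would additionally compress such context information into dense, low-dimensional vectors.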