Abstract
Accurately predicting drug-induced side effects remains a significant challenge in pharmaceutical development, particularly at early stages, because drug programs often fail due to unforeseen adverse reactions. Conventional approaches, such as preclinical animal testing and in vitro assays, are limited by high costs, ethical concerns, and poor translatability to human biology. Predictive algorithms, including protein-protein interaction network models, have emerged as computational approaches with the potential to predict adverse drug effects, but these models often suffer from limited performance, specifically underprediction. Further, documented drug-protein interactions are inconsistently reported, which hampers the selection of data for predicting drug-induced side effects or for building better-performing models. We integrated drug-binding targets from six sources (DrugBank, ChEMBL, PubChem, the Search Tool for Interactions of Chemicals, the Therapeutic Target Database, and PocketFEATURE) into our existing platform, PathFX, to understand their impact on the prediction of drug side effects. We observed unique drug-target interactions and target-associated protein classes and functions across sources. Integrating the new drug targets yielded predictions of previously unrecognized side effects and revealed a trade-off between sensitivity and specificity. Sensitivity generally improved with large exploratory databases or the union of all targets, at the cost of reduced specificity. Databases with smaller numbers of curated targets or structurally predicted targets improved specificity. This quantitative analysis lays a foundation for improving drug side-effect prediction: sophisticated machine learning approaches may better leverage large exploratory databases when balanced, high-performing analysis is required, whereas smaller, curated data sources could be integrated with simple but explainable platforms, such as PathFX, for hypothesis generation.