Authors
Yan Sun, Lianghong Chen, Zihao Jing, Yan-yi Li, Dongkyu Kim, Jing-Yan Gao, Reza Noroozi, Grace Y. Yi, Conrard Giresse Tetsassi Feugmo, Anna Klinkova, Kyla Sask, Agustinus Kristiadi, Boyu Wang, Elizabeth R. Gillies, Kun Ping Lu, HaoTian Harvey Shi, Pingzhao Hu
Abstract
The design of novel molecules underpins advances in both drug discovery and biomaterials engineering. Traditional approaches, from natural product isolation to high-throughput screening, have delivered important therapeutics but remain costly, inefficient, and limited in their ability to explore chemical and biomolecular space. While predictive machine learning models have improved aspects of discovery, they cannot fully address the complexity of modern precision medicine. Generative artificial intelligence (AI) offers a paradigm shift by enabling de novo molecular creation guided by data-driven optimization. Architectures such as variational autoencoders, generative adversarial networks, normalizing flows, and diffusion models now demonstrate unprecedented capabilities in designing small molecules and macromolecules that satisfy complex physicochemical and biological requirements. This review surveys the rapidly evolving field of generative AI for molecular design. We first introduce the development of generative architectures and optimization strategies, focusing on how sampling, training, and postgeneration techniques improve control over molecular design. We then examine applications across molecular representations, unconstrained and property-constrained design, conformation modeling, and the generation of large biomolecules such as proteins, antibodies, and peptides. Benchmarking datasets, evaluation metrics, and real-world case studies, such as the AI-driven discovery of novel antibiotics with demonstrated in vivo efficacy against multidrug-resistant infections, illustrate the growing maturity and translational potential of generative molecular design approaches. Despite rapid advances, generative molecular design still faces critical challenges that point to key future directions. These include integrating physicochemical priors through differentiable physical models, overcoming data scarcity via synthetic augmentation and transfer learning, enabling multimodal fusion of structural, omics, and phenotypic data, deploying autonomous AI agents for adaptive decision-making, and optimizing multiple objectives with uncertainty-aware strategies. Addressing these challenges could lead to more robust, generalizable, and experimentally aligned molecular design systems.