Abstract

Lead optimization plays an important role in preclinical drug discovery. While deep learning has accelerated this process, structure-based approaches that leverage 3D protein–ligand information remain underexplored. Existing generative models can improve predicted affinity but often yield synthetically inaccessible compounds, whereas screening-based methods limit chemical novelty by relying on fixed fragment libraries. To bridge this gap, we introduce Slogen, a Structure-based Lead Optimization algorithm unifying fragment Generation and screENing. Slogen integrates a transformer-based variational autoencoder, pretrained on the BindingNet v2 dataset, with an E(3)-equivariant graph neural network that models 3D protein–fragment interactions. This unified framework supports both fragment generation and similarity-based screening, addressing synthetic tractability and structural diversity at the same time. A benchmarking study shows that Slogen matches or surpasses state-of-the-art methods while exploring a broader chemical space. Case studies on the Smoothened and D1 dopamine receptors demonstrate its capacity to design high-affinity, drug-like molecules, providing a practical method for structure-guided lead optimization.