Abstract

Background: Personalized pretreatment dosimetry planning is crucial for optimizing [¹⁷⁷Lu]Lu–prostate‐specific membrane antigen‐617 (Lu‐PSMA) radioligand therapy (RLT) in patients with metastatic castration‐resistant prostate cancer (mCRPC).

Purpose: This study addresses two goals. First, we develop a machine learning (ML)‐based pretreatment planning model to predict post‐therapy absorbed doses (ADs) in metastatic lesions by integrating clinical biomarkers (CBs) with radiomic features (RFs) and dosiomic features (DFs) extracted from [⁶⁸Ga]Ga‐PSMA‐11 (Ga‐PSMA) positron emission tomography/computed tomography (PET/CT), thereby improving predictive accuracy. Second, we develop a transformer‐based deep learning (DL) architecture to predict Monte Carlo (MC)‐derived dose rate maps (DRMs), minimizing reliance on computationally intensive MC simulations.

Methods: For the ML objective, retrospective posttreatment dosimetry data from 20 patients with mCRPC treated with Lu‐PSMA RLT were used as ground truth labels. Patient‐specific MC dosimetry was performed on Ga‐PSMA PET/CT images using the GATE v9.1 toolkit to generate DRMs. After image preprocessing, RFs and DFs were extracted from Ga‐PSMA CT images and DRMs using LIFEx v7.4.0. Multiple feature selection techniques, including recursive feature elimination (RFE), mutual information, Boruta, LASSO, and Elastic Net, were applied and evaluated. The Benjamini‐Hochberg correction (q < 0.05) was applied after each method to control the false discovery rate. Multiple nonlinear regression models were trained using leave‐one‐out cross‐validation (LOOCV), and model interpretability was assessed using SHAP and LIME radar plots. For the DL objective, a shifted‐window UNEt TRansformers (Swin UNETR) architecture with self‐supervised learning (SSL) pretraining was employed to predict voxel‐wise PET‐based DRMs.
The model was fine‐tuned on MC‐labelled DRM data from 30 patients (including 10 additional cases) using 5‐fold cross‐validation.

Results: Among the feature selection strategies evaluated, RFE gave the best predictive performance and was used for final modelling. The ensemble tree regressor (ETR) using the selected CT RFs, PET DFs, and significant CBs achieved R² = 0.82 and RMSE = 0.67 Gy/GBq. For DRM prediction, the SSL‐pretrained Swin UNETR achieved an R² of 0.97, an NRMSE of 0.003 Gy/GBq, and a gamma pass rate of 99.08%, closely matching the MC‐derived DRMs.

Conclusions: Integrating ML‐based radiodosiomics with transformer‐based DL enables accurate and efficient prediction of lesion ADs and DRMs from pretherapy PET/CT, supporting personalized Lu‐PSMA RLT planning.
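
The ML evaluation scheme summarized in Methods (RFE feature selection followed by a tree‐ensemble regressor, scored with LOOCV) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, scikit‐learn's `ExtraTreesRegressor` is an assumed stand‐in for the ETR, and the function name `loocv_rfe_regression`, the number of retained features, and all hyperparameters are hypothetical. RFE sits inside the pipeline so that it is refit in every fold and the held‐out sample never influences feature selection.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import Pipeline


def loocv_rfe_regression(X, y, n_features=5, seed=0):
    """Return (R^2, RMSE) from LOOCV, with RFE refit inside every fold."""
    model = Pipeline([
        # Assumed stand-in: RFE ranks features by tree-based importances.
        ("rfe", RFE(ExtraTreesRegressor(n_estimators=50, random_state=seed),
                    n_features_to_select=n_features)),
        # Assumed stand-in for the paper's ensemble tree regressor (ETR).
        ("reg", ExtraTreesRegressor(n_estimators=50, random_state=seed)),
    ])
    # Each sample is predicted by a model trained on all the other samples.
    y_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
    rmse = float(np.sqrt(mean_squared_error(y, y_pred)))
    return float(r2_score(y, y_pred)), rmse


# Synthetic stand-in data: 20 samples, 12 candidate features (not patient data).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 12))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=20)

r2, rmse = loocv_rfe_regression(X, y)
print(f"LOOCV R^2 = {r2:.2f}, RMSE = {rmse:.2f}")
```

With small cohorts, pooling the out‐of‐fold predictions before computing R² and RMSE, as done here, avoids the undefined per‐fold R² that arises when each test fold holds a single sample.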