Problem definition: Motivated by the significance of side information in numerous operations management problems, this paper studies conditional stochastic optimization to enable more informed decisions. The side information constitutes observable exogenous covariates that alter the conditional probability distribution of the random problem parameters. Decision makers who adapt their decisions according to the observed side information solve a stochastic optimization problem where the objective function is specified by the conditional expectation of the random cost. If the joint probability distribution is unknown, then the conditional expectation can be approximated in a data-driven manner using kernel regression. Although this approximation scheme has found successful applications in diverse decision problems under uncertainty, it is largely unknown whether the scheme can provide any reasonable out-of-sample performance guarantees, and how such statistical guarantees can guide the decision-making process.

Methodology/results: We employ Nadaraya–Watson kernel regression for data-driven approximation of the conditional expectation and leverage moderate deviations theory to establish performance guarantees for this approximation. Our analysis and the resulting statistical bounds motivate the use of a conditional standard deviation regularization scheme to enhance out-of-sample performance. As the designed regularization scheme leads to a nonconvex optimization problem, we further adopt ideas from distributionally robust optimization to obtain tractable formulations. We examine our proposed models on portfolio optimization, inventory management, and wind energy commitment problems. The numerical results demonstrate the effectiveness of our proposed regularization scheme.

Managerial implications: Our paper illustrates the importance of side information in real-world decision-making problems. Incorporating side information through a regularized Nadaraya–Watson scheme offers managers a robust framework to enhance decision making under uncertainty. The theoretical guarantees provide guidance on the number of samples required to obtain high-quality solutions and on how to optimally adjust the regularization parameter. For problem instances with high-dimensional covariates, we further present a simple dimensionality reduction procedure that helps improve the sample complexity of the scheme. All our proposed formulations are concise and straightforward for the operations manager to implement using any popular programming language interfaced with standard off-the-shelf solvers.

Funding: Y. Wang was supported by the National Natural Science Foundation of China [Grant 72501204] and the Fundamental Research Funds for the Central Universities. G. A. Hanasusanto was funded by the National Science Foundation [Grants CCF-2343869 and ECCS-2404413]. C. P. Ho was supported by the Research Grants Council [General Research Fund 11508623] and the CityUHK Start-Up Grant [9610481].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2024.0997.
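
Illustrative sketch: The idea of a Nadaraya–Watson estimate of the conditional expected cost, penalized by a conditional standard deviation term, can be illustrated with a minimal Python example. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian kernel, the bandwidth, the newsvendor cost with hypothetical underage and overage costs `b` and `h`, the regularization weight `lam`, and all function names. In particular, the sketch uses a plain grid search over the order quantity in place of the tractable distributionally robust reformulations mentioned above, which are not reproduced here.

```python
import numpy as np

def nw_weights(X, x0, bandwidth):
    """Nadaraya-Watson weights at covariate x0 (Gaussian kernel is an assumption)."""
    sq_dist = np.sum((X - x0) ** 2, axis=1)
    # Shift by the smallest distance so at least one kernel value equals 1
    # (avoids division by zero when every kernel value would underflow).
    k = np.exp(-0.5 * (sq_dist - sq_dist.min()) / bandwidth ** 2)
    return k / k.sum()

def regularized_objective(costs, w, lam):
    """Weighted mean of the costs plus lam times their weighted standard deviation."""
    mean = np.dot(w, costs)
    var = np.dot(w, (costs - mean) ** 2)
    return mean + lam * np.sqrt(max(var, 0.0))

def newsvendor_cost(q, demand, b=4.0, h=1.0):
    """Hypothetical newsvendor cost: b per unit short, h per unit left over."""
    return b * np.maximum(demand - q, 0.0) + h * np.maximum(q - demand, 0.0)

def solve_regularized_newsvendor(X, demand, x0, bandwidth=0.5, lam=0.5, n_grid=201):
    """Grid search over order quantities (a stand-in, not the paper's formulation)."""
    w = nw_weights(X, x0, bandwidth)
    grid = np.linspace(demand.min(), demand.max(), n_grid)
    values = [regularized_objective(newsvendor_cost(q, demand), w, lam) for q in grid]
    return grid[int(np.argmin(values))]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))                        # synthetic covariates
    demand = 10 + 2 * X[:, 0] + rng.normal(size=200)     # synthetic demand
    print(solve_regularized_newsvendor(X, demand, x0=np.zeros(2)))
```

Setting `lam=0` recovers the unregularized Nadaraya–Watson approximation of the conditional expected cost; larger values of `lam` penalize decisions whose cost varies strongly among the samples that receive the most weight near `x0`.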