ABSTRACT This paper addresses the online solution of the infinite‐horizon stochastic linear quadratic (SLQ) optimal control problem for continuous‐time systems subject to both additive and multiplicative noise. To remove the need for prior knowledge of the system dynamics, a novel policy iteration approach is proposed that uses integral reinforcement learning (RL) techniques to iteratively solve the stochastic algebraic Riccati equation (SARE) from real‐time state and input data. The proposed approach is an off‐policy RL algorithm: every iteration of the learning process reuses the same state and input data, collected online over fixed time intervals, to compute the optimal control law. The convergence of the proposed algorithm to the solution of the SARE is verified, and its effectiveness is validated through a numerical example.
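For orientation, the following is a minimal model-based sketch of policy iteration on an SARE. It is not the data-driven algorithm proposed here. It assumes, as an illustration only, the common formulation dx = (Ax + Bu)dt + (Cx + Du)dw with running cost x'Qx + u'Ru, omits the additive noise term, and assumes full knowledge of A, B, C, D. All matrices, function names, and numerical values below are hypothetical. Each iteration evaluates the current gain K through a generalized Lyapunov equation and then improves it.

```python
import numpy as np

def solve_generalized_lyapunov(Ac, Cc, M):
    """Solve Ac'P + P Ac + Cc'P Cc + M = 0 for symmetric P via vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Column-major identity: vec(X P Y) = (Y' kron X) vec(P)
    L = np.kron(I, Ac.T) + np.kron(Ac.T, I) + np.kron(Cc.T, Cc.T)
    p = np.linalg.solve(L, -M.flatten(order="F"))
    P = p.reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

def sare_residual(A, B, C, D, Q, R, P):
    """Left-hand side of the assumed SARE evaluated at P."""
    S = P @ B + C.T @ P @ D
    G = R + D.T @ P @ D
    return A.T @ P + P @ A + C.T @ P @ C + Q - S @ np.linalg.solve(G, S.T)

def policy_iteration(A, B, C, D, Q, R, K0, iters=30, tol=1e-10):
    """Model-based policy iteration; K0 must be mean-square stabilizing."""
    K = K0
    for _ in range(iters):
        Ac, Cc = A - B @ K, C - D @ K                    # closed loop, u = -K x
        P = solve_generalized_lyapunov(Ac, Cc, Q + K.T @ R @ K)  # evaluation
        K_new = np.linalg.solve(R + D.T @ P @ D,
                                B.T @ P + D.T @ P @ C)   # improvement
        converged = np.linalg.norm(K_new - K) < tol
        K = K_new
        if converged:
            break
    return P, K

# Hypothetical example: A is stable and the noise is small, so K0 = 0
# is mean-square stabilizing for this particular system.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = 0.1 * np.eye(2)
D = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P, K = policy_iteration(A, B, C, D, Q, R, K0)
residual_norm = np.linalg.norm(sare_residual(A, B, C, D, Q, R, P))
```

In this sketch, a small `residual_norm` indicates that the returned `P` approximately satisfies the assumed SARE. A data-driven variant of the kind described above would replace the model-based evaluation step with quantities estimated from measured state and input trajectories.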