Controlling both the spectral and spatial degrees of freedom of broadband light in multimode fibers (MMFs) is essential for advanced imaging and information transport, but complex modal coupling in MMFs hinders conventional wavefront shaping techniques. In this work, we present a deep learning–based approach that achieves robust spatio-spectral control in MMFs. Our method employs Actor-Critic–inspired networks trained to reverse modal scrambling using only amplitude detection at the output. The system combines a digital micromirror device for amplitude modulation and a spatial light modulator for phase modulation, exploiting their complementary functions. Experimentally, we generate complex broadband patterns with high fidelity, achieving a Pearson correlation of 0.9 across a >50 nm bandwidth in the near-infrared telecommunications band. The method further enables parallel control for simultaneous multispectral focusing. These results open new opportunities for multi-color image projection, hyperspectral endoscopy, and broadband wavefront engineering in complex media.
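The fidelity figure quoted above is a Pearson correlation between a target pattern and the pattern measured at the fiber output. As a minimal sketch of how such a metric can be evaluated per wavelength channel, the following assumes intensity images stored as NumPy arrays. The function names, the array layout (wavelength along the first axis), and the averaging over channels are illustrative assumptions, not the evaluation code used in this work.

```python
import numpy as np


def pearson_correlation(target, measured):
    """Pearson correlation between two intensity images with the same pixel count."""
    t = np.asarray(target, dtype=float).ravel()
    m = np.asarray(measured, dtype=float).ravel()
    if t.shape != m.shape:
        raise ValueError("target and measured must have the same number of pixels")
    # Remove the mean so the metric is insensitive to a uniform background offset.
    t = t - t.mean()
    m = m - m.mean()
    denom = np.sqrt((t @ t) * (m @ m))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant image")
    return float((t @ m) / denom)


def mean_spectral_fidelity(targets, measurements):
    """Average the per-channel correlation over wavelength channels (axis 0).

    Assumption: both inputs have shape (n_wavelengths, height, width).
    """
    scores = [pearson_correlation(t, m) for t, m in zip(targets, measurements)]
    return float(np.mean(scores))
```

Because the metric is normalized by the standard deviations of both images, it is unchanged by a global rescaling or offset of the detected intensity, which makes it convenient for comparing camera frames against target patterns.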