This study presents an improved multi-objective particle swarm optimization (MOPSO) algorithm, ATP-QL-MOPSO, for the lightweight and crashworthiness optimization of automotive battery pack systems (BPS). Traditional MOPSO is sensitive to hyperparameter tuning and prone to becoming trapped in local optima. The proposed method integrates Q-learning (QL) and adaptive t-distribution perturbation (ATP) to address these issues. In the QL component, each particle acts as an agent that updates its velocity independently, with Euclidean distance defining the state space and the three velocity parameters defining the action space. ATP dynamically adjusts the shape of the t-distribution to help particles escape local optima. On the ZDT benchmark tests, ATP-QL-MOPSO achieved a lower Inverted Generational Distance (IGD) than standard MOPSO; on ZDT1, IGD was reduced by 55.6%. Applied to the BPS, it achieved higher hypervolume values, increased X-direction displacement by 6.044%, decreased Y-direction displacement by 2.787%, and reduced mass by 3.273%. These results indicate that QL automates hyperparameter tuning while ATP improves convergence, making ATP-QL-MOPSO superior to MOPSO in optimizing the BPS for weight reduction and crash resistance, with potential for broader applications.
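As a rough illustration of the adaptive t-distribution perturbation idea, the Python sketch below perturbs a particle position with Student-t noise whose degrees of freedom change over the run. The abstract does not give the schedule or the perturbation form, so everything here is an assumption: the linear degrees-of-freedom schedule, the multiplicative perturbation, the bound clipping, and the hypothetical function names `atp_degrees_of_freedom` and `atp_perturb`.

```python
import numpy as np


def atp_degrees_of_freedom(iteration, max_iter):
    """Assumed schedule: degrees of freedom grow linearly from 1 to max_iter.

    df = 1 gives a Cauchy-like, heavy-tailed distribution (large jumps);
    a large df approaches a Gaussian (small, local refinements).
    The schedule used in the paper may differ.
    """
    return 1.0 + (max_iter - 1.0) * iteration / max_iter


def atp_perturb(position, iteration, max_iter, lower, upper, rng):
    """Perturb a position vector with t-distributed noise (illustrative only)."""
    df = atp_degrees_of_freedom(iteration, max_iter)
    # One t-distributed sample per decision variable.
    noise = rng.standard_t(df, size=position.shape)
    # Assumed multiplicative form: the step scales with the current position.
    candidate = position + position * noise
    # Keep the perturbed particle inside the search bounds.
    return np.clip(candidate, lower, upper)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    particle = np.array([0.2, 0.5, 0.8])
    early = atp_perturb(particle, iteration=1, max_iter=100,
                        lower=0.0, upper=1.0, rng=rng)
    late = atp_perturb(particle, iteration=99, max_iter=100,
                       lower=0.0, upper=1.0, rng=rng)
    print("early-stage perturbation:", early)
    print("late-stage perturbation:", late)
```

Under this assumed schedule, early iterations draw from a heavy-tailed distribution that favors large exploratory jumps, while later iterations draw from a near-Gaussian distribution that favors fine adjustments around the current position.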