Chang An, an animated film blending historical narrative with poetic and painterly aesthetics, achieves cross-media transmission and cognitive reconstruction of cultural imagery. Drawing on conceptual metaphor theory and multimodal discourse analysis, this study explores how the interaction of visual, auditory, and linguistic modalities constructs metaphorical meaning in the film, and analyzes the cognitive mechanisms and cultural connotations of these metaphors. The findings show that the film predominantly employs visual-textual multimodal combinations, often contrasting different representations of the same object to convey meaning. The study highlights the cognitive function of multimodal metaphors in historical-cultural films and offers theoretical insights into the modern translation of traditional cultural symbols.