Accurately predicting aortic valve leaflet motion under coronary influence is critical for personalized cardiac interventions and virtual surgery planning. Conventional fluid-structure interaction (FSI) models often omit the coronary arteries because of their complexity, which biases the simulated leaflet motion. To overcome this limitation, we propose Gated-STGFormer, a gated deep learning framework that combines spatiotemporal graph convolution with a Transformer and learns to reconstruct coronary-modulated leaflet motion from simulations that do not explicitly model the coronary arteries. The framework uses Graph Convolutional Networks (GCNs) to model spatial dependencies and a Transformer to model temporal dependencies, with encoding and gating mechanisms that capture the coupling between the two. Quantitative evaluation shows that it reproduces the spatiotemporal motion of the leaflets under coronary influence with high fidelity. By compensating for this anatomical simplification in conventional simulations, the method provides a physics-informed, computationally efficient surrogate model with strong clinical applicability. Our findings suggest that Gated-STGFormer effectively integrates spatial and temporal information and can serve as a coronary artery compensation module in a preoperative planning system, enabling more realistic and personalized valve biomechanical simulations.
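To make the spatial, temporal, and gating components concrete, the following is a minimal NumPy sketch of one gated spatiotemporal block. It is an illustration under stated assumptions, not the implementation behind Gated-STGFormer: the input is assumed to be per-node leaflet displacements of shape (time, nodes, features), the graph convolution uses the common symmetric normalization, the temporal branch is single-head self-attention without positional encoding, and the sigmoid gate that blends the two branches is one plausible form of gating. All function and parameter names (`graph_conv`, `temporal_attention`, `gated_fusion`, `w_gcn`, `wg`, and so on) are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a mesh graph."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(x, a_norm, w):
    """Spatial branch: mix features across neighboring nodes at each time step.

    x: (T, N, F), a_norm: (N, N), w: (F, H) -> (T, N, H)
    """
    return np.tanh(np.einsum("ij,tjf,fh->tih", a_norm, x, w))


def temporal_attention(x, wq, wk, wv):
    """Temporal branch: single-head self-attention over time, per node.

    x: (T, N, F), wq/wk/wv: (F, H) -> (T, N, H)
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = np.einsum("tnh,snh->nts", q, k) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.einsum("nts,snh->tnh", weights, v)


def gated_fusion(h_s, h_t, wg, bg):
    """Assumed gate: a sigmoid decides, per feature, how to blend the branches."""
    z = np.concatenate([h_s, h_t], axis=-1) @ wg + bg
    gate = 1.0 / (1.0 + np.exp(-z))
    return gate * h_s + (1.0 - gate) * h_t


def gated_st_block(x, adj, params):
    """One gated spatiotemporal block: GCN branch + attention branch + gate."""
    a_norm = normalized_adjacency(adj)
    h_s = graph_conv(x, a_norm, params["w_gcn"])
    h_t = temporal_attention(x, params["wq"], params["wk"], params["wv"])
    return gated_fusion(h_s, h_t, params["wg"], params["bg"])


# Toy example: 5 time steps, 4 mesh nodes on a ring, 3 displacement components.
T, N, F, H = 5, 4, 3, 8
rng = np.random.default_rng(0)

adj = np.zeros((N, N))
for i in range(N):
    adj[i, (i + 1) % N] = adj[(i + 1) % N, i] = 1.0

params = {
    "w_gcn": 0.1 * rng.standard_normal((F, H)),
    "wq": 0.1 * rng.standard_normal((F, H)),
    "wk": 0.1 * rng.standard_normal((F, H)),
    "wv": 0.1 * rng.standard_normal((F, H)),
    "wg": 0.1 * rng.standard_normal((2 * H, H)),
    "bg": np.zeros(H),
}

x = rng.standard_normal((T, N, F))   # stand-in for coronary-free FSI output
out = gated_st_block(x, adj, params)
print(out.shape)  # (5, 4, 8)
```

In this sketch the gate outputs values in (0, 1), so each output feature is a convex combination of the spatial and temporal branch outputs. A trained model would typically stack several such blocks and add a regression head that maps the fused features back to node displacements.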