Cardiovascular disease remains a major global health burden, highlighting the need for automated electrocardiogram (ECG) interpretation systems that are accurate, efficient, and interpretable. We present MDOT (Momentum Distillation Oscillographic Transformer), a framework in which a lightweight student model learns diagnostic classification from a teacher model enriched with physician-derived knowledge. Clinically salient electrophysiological indicators, such as heart rate, QRS duration, ST-segment changes, and the QTc interval, are integrated as auxiliary features to embed clinical reasoning. A novel oscillographic (OSC) module converts one-dimensional ECG signals into two-dimensional oscillographic representations, enabling a Transformer backbone to capture both long-range dependencies and detailed waveform morphology. The model's attention weights are further rendered as heatmaps that highlight diagnostically relevant segments, enhancing interpretability. On strict inter-patient splits of the MIT-BIH (8 classes) and Chapman (12 classes) datasets, MDOT achieves state-of-the-art accuracies of 99.53% and 99.03%, respectively. By combining accuracy with physician-oriented interpretability, MDOT offers a robust solution for clinical decision support and edge deployment.
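
As an illustration of the auxiliary clinical indicators mentioned above, the following minimal sketch derives heart rate and a rate-corrected QT interval from per-beat measurements and assembles them into a feature vector. It is not the MDOT implementation: the function names, the feature ordering, and the use of Bazett's correction (QTc = QT / sqrt(RR), with RR in seconds) are assumptions made here for illustration, since the abstract does not specify how these indicators are computed.

```python
import math
from typing import List


def heart_rate_bpm(rr_s: float) -> float:
    """Heart rate in beats per minute from an RR interval given in seconds."""
    if rr_s <= 0:
        raise ValueError("RR interval must be positive")
    return 60.0 / rr_s


def qtc_bazett_ms(qt_ms: float, rr_s: float) -> float:
    """Rate-corrected QT interval in ms using Bazett's formula.

    Assumption: Bazett's correction is used here only as a common example;
    the source does not state which correction formula is applied.
    """
    if rr_s <= 0:
        raise ValueError("RR interval must be positive")
    return qt_ms / math.sqrt(rr_s)


def auxiliary_features(
    rr_s: float, qrs_ms: float, st_dev_mv: float, qt_ms: float
) -> List[float]:
    """Hypothetical feature vector: [heart rate, QRS duration, ST deviation, QTc]."""
    return [
        heart_rate_bpm(rr_s),
        qrs_ms,
        st_dev_mv,
        qtc_bazett_ms(qt_ms, rr_s),
    ]


if __name__ == "__main__":
    # Example beat: RR = 0.8 s, QRS = 95 ms, ST deviation = 0.05 mV, QT = 380 ms.
    print(auxiliary_features(rr_s=0.8, qrs_ms=95.0, st_dev_mv=0.05, qt_ms=380.0))
```

In a pipeline of this kind, such a vector would typically be normalized and concatenated with, or injected alongside, the learned signal representation; how MDOT actually fuses these features is not described in this abstract.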