Purpose
Path information in knowledge graphs can provide explicit explanations for recommendation decisions and has therefore become a focus of explainable recommendation research. However, existing studies on explainable recommendation remain limited, resulting in low model transparency and poor persuasiveness, which degrades user experience. The goal of this study is therefore to provide accurate and interpretable course recommendations.

Design/methodology/approach
This study proposes a recommendation reasoning method that combines a knowledge graph with reinforcement learning. To alleviate noise in the state space, a multi-head attention mechanism is used to learn state representations. A dual critic network optimizes short-term and long-term rewards simultaneously, performing path reasoning over the knowledge graph so that recommendation decisions are accompanied by both short-term and long-term value explanations.

Findings
We conduct extensive experiments on a real-world benchmark dataset and a domain dataset to validate the effectiveness of our method on the recommendation and explanation tasks, demonstrating its ability to generate high-quality, interpretable course recommendations.

Originality/value
Developing explainable recommendation methods that combine knowledge graphs with reinforcement learning is crucial for overcoming current limitations. A recommendation system that integrates knowledge reasoning, autonomous learning and interpretability can meet the modern education field's need for explainable recommendation systems.
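To make the architecture described above more concrete, the following is a minimal illustrative sketch of an actor-critic module with a multi-head attention state encoder and two critic heads (short-term and long-term value). All module names, dimensions and the pooling choice are assumptions for illustration only, not the paper's actual implementation.

```python
# Illustrative sketch only: a policy network whose state is encoded with
# multi-head attention over the knowledge-graph path visited so far, plus
# dual critics estimating short-term and long-term value.
import torch
import torch.nn as nn


class DualCriticPathReasoner(nn.Module):
    def __init__(self, entity_dim=64, num_heads=4, num_actions=50):
        super().__init__()
        # Multi-head self-attention over visited entities/relations,
        # intended to down-weight noisy elements of the state sequence.
        self.attn = nn.MultiheadAttention(entity_dim, num_heads, batch_first=True)
        self.actor = nn.Linear(entity_dim, num_actions)   # policy over candidate KG edges
        self.critic_short = nn.Linear(entity_dim, 1)      # short-term value head
        self.critic_long = nn.Linear(entity_dim, 1)       # long-term value head

    def forward(self, path_embeddings):
        # path_embeddings: (batch, path_len, entity_dim) embeddings of the current path
        attn_out, _ = self.attn(path_embeddings, path_embeddings, path_embeddings)
        state = attn_out.mean(dim=1)                       # pooled state representation
        action_logits = self.actor(state)
        v_short = self.critic_short(state).squeeze(-1)
        v_long = self.critic_long(state).squeeze(-1)
        return action_logits, v_short, v_long


if __name__ == "__main__":
    model = DualCriticPathReasoner()
    fake_paths = torch.randn(2, 3, 64)                     # 2 partial paths of length 3
    logits, v_s, v_l = model(fake_paths)
    print(logits.shape, v_s.shape, v_l.shape)              # [2, 50], [2], [2]
```

In such a setup, the two value heads would each be trained against their own reward signal, so the resulting reasoning paths can be read as short-term and long-term value explanations for the recommended items.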