Penetration testing is widely regarded as the foremost method for evaluating network security. However, three challenges impede the generation of testing strategies that align with human expectations. In this article, we present, for the first time, a method that leverages human feedback to enhance strategy generation. Our approach comprises two components: agent training and decision-making. During agent training, we establish a hierarchical framework to decompose tasks and a knowledge base that offers advice to improve data efficiency. We then impose constraints on the action space to mitigate ineffective exploration. Finally, we train a reward model on human feedback and fine-tune the agent under the guidance of this reward model. During decision-making, we post-process the model output to improve decision accuracy. We crafted scenarios based on real-world networks, and the results demonstrate that our method generates penetration testing strategies that align more closely with human intentions.
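To illustrate the reward-modeling step described above, the following is a minimal sketch of learning a reward model from pairwise human preferences in the Bradley-Terry style commonly used for human-feedback training. The linear model, the feature encoding of penetration-testing actions, and the training data here are our own illustrative assumptions, not the authors' implementation.

```python
import math

def reward(w, x):
    # Hypothetical linear reward model: r(x) = w . x, where x is an
    # assumed feature vector describing a candidate action or trajectory.
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(prefs, dim, lr=0.1, epochs=200):
    """Fit reward weights from pairwise human preferences.

    prefs: list of (preferred_features, rejected_features) pairs,
           each pair meaning a human judged the first option better.
    Minimizes the Bradley-Terry loss -log sigmoid(r(pref) - r(rej))
    by plain gradient descent.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for x_pos, x_neg in prefs:
            margin = reward(w, x_pos) - reward(w, x_neg)
            # Gradient of -log sigmoid(margin) w.r.t. margin is
            # sigmoid(margin) - 1 = -1 / (1 + exp(margin)).
            g = 1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                w[i] += lr * g * (x_pos[i] - x_neg[i])
    return w

# Toy preference data (assumed): humans prefer actions whose first
# feature is active (e.g., targeting a confirmed-vulnerable host)
# over actions whose second feature is active.
prefs = [([1.0, 0.0], [0.0, 1.0])] * 5
w = train_reward_model(prefs, dim=2)
```

After training, `reward(w, [1.0, 0.0]) > reward(w, [0.0, 1.0])`, so the learned model ranks the human-preferred action higher; such a model can then score candidate strategies during fine-tuning.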