Abstract: Supervised deep learning methods have achieved remarkable performance in medical image segmentation, but they require extensive data annotation, which is both time-consuming and labor-intensive. Semi-supervised methods based on consistency regularization and pseudo-labeling have shown good performance in this setting, yet they still suffer from low-quality pseudo-labels and an excessive reliance on the quality of the unlabeled data. To address these issues, this study proposes a pseudo-labeling framework based on random region cropping data augmentation. The framework improves the reliability of pseudo-labels in boundary regions through dual perturbations at the model and structural levels, which forces the network to attend to the local-global contextual relationships of anatomical structures. In addition, a dual-branch mean teacher architecture is constructed in which a shared encoder extracts common semantic representations from labeled and unlabeled data. Heterogeneous decoders introduce structured perturbations, compelling the model to learn consistent semantic representations from differentiated feature spaces, while a dynamic confidence threshold adaptively corrects noisy pseudo-labels. Experiments on the ACDC cardiac magnetic resonance imaging dataset and the left atrium segmentation dataset show that, with only 10% of the data labeled, the proposed method achieves Dice coefficients of 88.7% ± 1.2% and 89.3% ± 0.8%, respectively. These results represent improvements of 3.9% and 4.2% over existing methods such as UAMT and DCT. The proposed method also achieves the best performance among the compared methods in semi-supervised segmentation with 20% labeled data.
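As a rough illustration of the ingredients named above, the following minimal sketch shows a mean-teacher exponential moving average (EMA) update, confidence-thresholded pseudo-labels for a binary foreground map, and the Dice coefficient used for evaluation. It is not the implementation proposed in this study: the function names, the flat-list data layout, the EMA decay `alpha`, and in particular the linear ramp used for the "dynamic" threshold are assumptions made only for illustration, since the abstract does not specify the schedule.

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher update: teacher <- alpha * teacher + (1 - alpha) * student.

    Both arguments are flat lists of parameter values (an assumed layout).
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def dynamic_threshold(step, total_steps, start=0.70, end=0.95):
    """Hypothetical schedule: ramp the confidence threshold linearly.

    The start/end values and the linear shape are assumptions, not
    values taken from the paper.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def pseudo_labels(probs, threshold):
    """Turn teacher foreground probabilities into labels plus a mask.

    labels[i] is the hard prediction; mask[i] is 1 only where the
    confidence max(p, 1 - p) reaches the threshold, so low-confidence
    pixels can be excluded from the unsupervised loss.
    """
    labels = [1 if p >= 0.5 else 0 for p in probs]
    mask = [1 if max(p, 1.0 - p) >= threshold else 0 for p in probs]
    return labels, mask


def dice(pred, target, eps=1e-8):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for binary masks."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)
```

In a training loop of this kind, `pseudo_labels` would typically be applied to the teacher's output on an unlabeled (and possibly region-cropped) image, the masked labels would supervise the student, and `ema_update` would then refresh the teacher after each student step.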