Automatic modulation recognition (AMR) is essential for ensuring physical-layer security in Internet of Things (IoT) networks. Despite advances in deep learning, most current AMR methods rely on large numbers of labeled samples to achieve high recognition accuracy. However, acquiring labeled samples is often costly and impractical in real-world scenarios because of privacy concerns and economic constraints, whereas unlabeled data is typically abundant and readily available. This paper presents a novel semi-supervised AMR framework that addresses label scarcity through semantic mask contrastive learning (SMCL). Using a self-supervised modulation semantic mask contrastive prediction task on the in-phase/quadrature (IQ) sequence, our method learns subtle modulation features directly from unlabeled radio signals. Notably, SMCL requires neither data augmentation nor representation domain transformation. Extensive experiments on public datasets demonstrate that our method outperforms existing semi-supervised and supervised methods when using the same number of labeled samples. SMCL thus enables effective representation learning from unlabeled radio signals, overcoming the limitations imposed by insufficient labeled data and providing a solid technical foundation for the development of signal-based IoT large language models (IoT-LLMs).