The Industrial Internet of Things (IIoT) underpins Industry 4.0 by enabling distributed intelligent services that adapt to dynamic, real-time industrial environments. Digital twin (DT), an emerging technology that mirrors physical space in virtual replicas, has been widely used in IIoT scenarios. However, communication overhead and device heterogeneity remain key issues limiting the development of DT modeling in IIoT. In this paper, we propose a semi-federated learning (semi-FL) based DT framework for heterogeneous IIoT scenarios with large-scale devices. The framework includes a federated K-means clustering algorithm that improves communication efficiency and the ability to handle heterogeneous devices through user clustering and in-cluster learning. Experimental results show that, on non-IID data in large-scale heterogeneous IIoT scenarios, the semi-FL based DT framework achieves higher learning accuracy and lower model loss than the existing FL-based DT framework.