Follicular thyroid carcinoma (FTC) is the second most common thyroid cancer. Preoperative differentiation between benign and malignant follicular tumors remains challenging with ultrasound and fine-needle aspiration biopsy (FNAB). Radiomics evaluates disease quantitatively by extracting and analyzing features from medical images. This study aimed to assess the diagnostic value of ultrasound radiomics in distinguishing FTC from follicular thyroid adenoma (FTA) among TI-RADS 4a nodules. A retrospective analysis was conducted on ultrasound images from 144 patients with TI-RADS 4a follicular thyroid neoplasms who underwent their first surgery in our hospital between January 2018 and June 2024. First, ultrasonographic characteristics (US) were collected from the ultrasound images and diagnostic reports to build a US model. Second, ultrasound radiomics features were extracted from the ultrasound images using 3D Slicer. According to the postoperative pathological results, the patients were divided into an FTC group and an FTA group. Patients were randomly allocated to a training group (n = 86) and a validation group (n = 58) at a ratio of 6:4. The ultrasound radiomics features were selected with the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm to build a radiomics model. Finally, a combined model integrating ultrasonographic characteristics and radiomics features (combined-model) was developed. All three models (US model, radiomics model and combined-model) were built using multivariable logistic regression to differentiate FTC from FTA. The receiver operating characteristic (ROC) curve, precision, recall and F1-score were used to evaluate model performance. Based on the postoperative pathological results, the 144 patients comprised an FTC group (41 cases) and an FTA group (103 cases).
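The 6:4 random allocation described above can be illustrated with a minimal sketch. It assumes a simple shuffle of case indices with a fixed seed. The abstract does not state whether the split was stratified by pathology or how the training-set size was rounded, so the function name `random_split`, the seed, and the rounding-down of 86.4 to 86 are illustrative assumptions. Only the cohort counts (144 patients, 41 FTC, 103 FTA) come from the text.

```python
import random


def random_split(n_cases, train_fraction=0.6, seed=0):
    """Randomly split case indices into training and validation sets.

    Hypothetical helper: a plain shuffle, not stratified by diagnosis.
    """
    indices = list(range(n_cases))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(indices)
    n_train = int(n_cases * train_fraction)  # 144 * 0.6 = 86.4, truncated to 86
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# Cohort sizes from the abstract: 41 FTC (label 1) and 103 FTA (label 0).
labels = [1] * 41 + [0] * 103

train_idx, val_idx = random_split(len(labels))
print(len(train_idx), len(val_idx))  # 86 58
```

With only 41 malignant cases, a stratified split would keep the FTC proportion similar in both groups. Whether the authors did this is not stated.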
A total of 858 ultrasound radiomics features were extracted from the ultrasound images. After LASSO screening, six optimal radiomics features were retained. Among the three models, the combined-model demonstrated the best performance in differentiating FTC from FTA, with an area under the curve (AUC) of 0.839 (95% CI: 0.663–1.000) in the validation group. Its F1-score reflected a balance between precision and recall, and its overall performance was the best of the three models. A combined model of ultrasonographic characteristics and radiomics features may be useful for distinguishing FTC from FTA.
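For readers unfamiliar with the reported metrics, the following sketch shows how AUC, precision, recall and F1-score are computed from a model's predicted probabilities. The labels, scores and the 0.5 decision threshold are made-up toy values for illustration only; they are not study data, and the function names are hypothetical. The AUC uses the rank-based definition: the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case, with ties counted as one half.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1-score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean
    return precision, recall, f1


def auc(y_true, scores):
    """Rank-based AUC: P(score of a positive > score of a negative)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (NOT study data): 1 = FTC, 0 = FTA.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc_value = auc(y_true, scores)
print(round(precision, 3), round(recall, 3), round(f1, 3), round(auc_value, 3))
```

AUC is threshold-independent, whereas precision, recall and F1-score depend on the chosen cut-off, which is why they are reported side by side.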