Steganalysis
Computer science
Cyberspace
Artificial intelligence
Natural language processing
Steganography
Image (mathematics)
Internet
World Wide Web
Authors
H Wang,Zhongliang Yang,Jinshuai Yang,Cheng Chen,Yudong Huang
Identifier
DOI:10.1109/tifs.2023.3298210
Abstract
Due to the widespread use of text in cyberspace, linguistic steganography, which hides secret information in normal texts, has developed quickly in recent years. While linguistic steganography protects users’ privacy, it also risks being abused to endanger network security. Therefore, its corresponding detection technology, namely linguistic steganalysis, has attracted more and more researchers’ attention in the past several years. However, most current linguistic steganalysis methods rely heavily on large numbers of labeled samples, which presents a significant gap from real-world scenarios where labeled steganographic samples are difficult to obtain. In this paper, we propose the Pre-trained Language model with Self-training for Few-shot Linguistic Steganalysis (LSFLS) method, which effectively copes with few-shot linguistic steganalysis using a small number of labeled samples and some auxiliary unlabeled samples. Extensive experiments show that the proposed method achieves high detection accuracy for linguistic steganalysis when only a few labeled samples are provided (even fewer than 10), significantly improving the detection ability of existing methods in the few-shot scenario. Furthermore, the experimental results demonstrate that the proposed method maintains good detection capability under data-source mismatch and label imbalance. We believe that our work will greatly advance the practical application of linguistic steganalysis techniques.
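The abstract describes a self-training scheme: a classifier trained on a few labeled samples pseudo-labels confident auxiliary unlabeled samples, then retrains on the enlarged set. The sketch below illustrates that loop structure only; it is a minimal illustration, not the paper's method, and a toy nearest-centroid classifier stands in for the pre-trained language model (the function names and the confidence threshold are assumptions for illustration).

```python
# Minimal self-training sketch for few-shot binary classification.
# A toy nearest-centroid classifier stands in for the pre-trained
# language model; only the generic loop (pseudo-label confident
# unlabeled samples, retrain, repeat) mirrors the idea in the abstract.

import math

def centroid(points):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit(labeled):
    """Compute class centroids from (vector, label) pairs; labels are 0/1."""
    return {c: centroid([v for v, y in labeled if y == c]) for c in (0, 1)}

def predict(model, v):
    """Return (label, confidence); confidence is the distance margin."""
    d0, d1 = dist(v, model[0]), dist(v, model[1])
    return (0, d1 - d0) if d0 < d1 else (1, d0 - d1)

def self_train(labeled, unlabeled, threshold=0.5, rounds=3):
    """Iteratively adopt confident pseudo-labels from the unlabeled pool."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = fit(labeled)
        keep = []
        for v in pool:
            y, conf = predict(model, v)
            if conf >= threshold:
                labeled.append((v, y))  # confident: adopt pseudo-label
            else:
                keep.append(v)          # uncertain: leave in the pool
        pool = keep
    return fit(labeled)
```

For example, starting from one labeled sample per class plus a handful of unlabeled vectors clustered near them, `self_train` absorbs the unlabeled points and yields sharper class centroids than the labeled pair alone.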