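Naturalness
Computer science
Source code
Source lines of code
Coding (set theory)
Codebase
Programming language
Software
Physics
Particle physics
Set (abstract data type)
Authors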
Yanjie Jiang, Hui Liu, Yuxia Zhang, Weixing Ji, Hao Zhong, Lu Zhang
Identifier
DOI:10.1145/3540250.3549149
Abstract
Texts in natural languages are highly repetitive and predictable because of the naturalness of natural languages. Recent research validated that source code in programming languages is also repetitive and predictable, and that naturalness is an inherent property of source code. It was also reported that buggy code is significantly less natural than bug-free code, and that bug fixing substantially improves the naturalness of the involved source code. In this paper, we revisit the naturalness of buggy code and investigate the effect of bug fixing on the naturalness of source code. Unlike existing investigations, we leverage two large-scale and high-quality bug repositories where bug-irrelevant changes in bug-fixing commits have been explicitly excluded. Our evaluation results confirm that buggy lines are often less natural than bug-free ones. However, fixing bugs does not significantly improve the naturalness of the involved code lines: fixed lines are, on average, as unnatural as buggy ones. Consequently, bugs are not the root cause of the unnaturalness of source code, and it could be inaccurate to identify buggy code lines solely by the naturalness of source code. Our evaluation results suggest that naturalness-based buggy line detection results in extremely low precision (less than one percent).
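
For readers unfamiliar with the term, naturalness is commonly quantified as the cross-entropy of code under a statistical language model: the lower the entropy of a line, the more "natural" (predictable) it is. The sketch below is only an illustration of that idea, not the authors' setup. The abstract does not state which model or tokenizer was used, so the add-one-smoothed bigram model, the whitespace tokenizer, and the names `BigramModel`, `tokenize`, and `line_entropy` are all assumptions made here.

```python
import math
from collections import Counter


def tokenize(line):
    # Assumption: whitespace tokenization. A real study would use a
    # language-aware lexer for the programming language in question.
    return line.split()


class BigramModel:
    """Hypothetical bigram language model with add-one smoothing."""

    def __init__(self, lines):
        self.context_counts = Counter()  # how often a token appears as a context
        self.bigram_counts = Counter()   # how often (previous, current) occurs
        self.vocab = set()
        for line in lines:
            tokens = ["<s>"] + tokenize(line)  # "<s>" marks the line start
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def prob(self, prev, cur):
        # Add-one smoothing; the extra +1 in the vocabulary size reserves
        # probability mass for tokens never seen during training.
        v = len(self.vocab) + 1
        return (self.bigram_counts[(prev, cur)] + 1) / (self.context_counts[prev] + v)

    def line_entropy(self, line):
        # Average negative log2 probability per token (cross-entropy in bits).
        tokens = ["<s>"] + tokenize(line)
        if len(tokens) < 2:
            return 0.0
        total = -sum(math.log2(self.prob(p, c)) for p, c in zip(tokens, tokens[1:]))
        return total / (len(tokens) - 1)


# Toy training corpus (made up for illustration only).
corpus = [
    "for i in range ( n ) :",
    "for j in range ( n ) :",
    "for k in range ( m ) :",
]
model = BigramModel(corpus)

# A naturalness-based detector would rank lines by entropy, highest first,
# and flag the top of the list as suspicious.
candidates = ["for i in range ( n ) :", "range for ) i ( : n in"]
ranked = sorted(candidates, key=model.line_entropy, reverse=True)
```

Under this kind of ranking, a line is flagged simply because it is unusual. The abstract's finding that fixed lines remain as unnatural as buggy ones is consistent with such a detector having low precision, since high entropy alone does not distinguish a bug from a merely uncommon line.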