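Computer science
Serialization
Encoding (set theory)
Representation (politics)
Binary number
Binary code
Leverage (statistics)
Perspective (graphical)
Artificial neural network
Theoretical computer science
Artificial intelligence
Programming language
Mathematics
Arithmetic
Set (abstract data type)
Politics
Political science
Law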
Authors
Guoqiang Chen,Han Gao,Jie Zhang,Ying He,Shiduan Cheng,Weiming Zhang
Identifier
DOI:10.1109/pst58708.2023.10320193
Abstract
Building a model that reassigns descriptive names to binary functions is of considerable assistance to reverse engineering. Existing methods for this problem are based on low-level representations of binary code (e.g., assembly code), and recent approaches in particular apply neural models to instruction sequences. However, their performance is still unsatisfactory. Meanwhile, modern decompilers provide lifted representations of binary code, whose effectiveness has not been adequately studied. This paper further explores function name reassignment from the perspective of binary code representation. Specifically, we present NER, a general and flexible NEural-based function name Reassignment framework, which leverages a decompiler to obtain a specific representation and applies the corresponding serialization strategy to it. NER then uses an alternative neural network to make predictions. Three levels of representation are investigated: assembly code, Intermediate Representation (IR), and pseudo-code. We observe that the binary code representation has a significant effect on the final performance, and that pseudo-code is the most effective of the three. Based on these findings, we use the framework to implement a reassignment model, NER-pc, which achieves F1 score improvements of 25% and 10% over the state-of-the-art methods. In addition, further experiments are conducted to verify the design of NER and the effectiveness of NER-pc.