Interprocedural analysis of low-level code

Author

  • Andrea Flexeder
Abstract

Static analysis of machine code is employed for reverse engineering, automatic detection of low-level errors such as memory violations, malware detection, and many other application areas. Only at the level of executables can all errors introduced by programmers or even by compilers be identified. Analysis of machine code comes at a price: high-level language features such as local variables and procedures are no longer visible and need to be recovered. In particular, it is necessary to first reconstruct the control flow graph (CFG). This thesis tackles these challenges by presenting a sound static analysis of executables, expressed in the abstract interpretation framework. Our aim is to present reasonably fast and sound analyses that scale well for industrial-sized applications. The two key contributions are as follows. First, we introduce a fully automatic and sound interprocedural analysis framework for executables. To this end, we argue for an analysis that intertwines disassembling and abstract interpretation-based analysis to provide a sound overapproximation of the CFG and additionally handles procedure calls precisely. In order to handle indirect jumps it is essential to reason about data. Hence, the location and size of variables in memory have to be inferred. Therefore we propose an analysis of differences and equalities between register contents which allows inference of potential local and global variables. In order to discharge certain assumptions made during control flow reconstruction, we add a side-effect analysis that reasons about the modifying potential of procedures. Second, we present two novel domains: the domain of fast linear two-variable equalities and simplices, a special case of convex polyhedra. While the former domain allows a precise analysis of non-optimised assembly by inferring equalities between registers and memory locations, the latter infers precise information about memory accesses by providing linear inequality relations between loop iteration variables and memory accesses. We have implemented these analyses, and our experimental evaluation shows that the techniques scale well for industrial-sized executables and provide very good results concerning control flow reconstruction, inference of local variables, and alignment information for improving the worst-case execution time (WCET) estimation.
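The idea of intertwining disassembly with a data analysis can be illustrated with a small sketch. It is not the thesis's framework: the toy instruction set (`mov`, `jmp`, `br`, `jmp_reg`, `halt`), the function `reconstruct_cfg`, the bound `MAX_SET`, and the simple domain of bounded constant sets are all assumptions made here for illustration. Instructions are decoded only when the analysis reaches them, and an indirect jump is resolved from the set of values its register may hold.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Abstract state: register -> set of possible constants; an absent register is unknown.
State = Dict[str, FrozenSet[int]]
MAX_SET = 4  # hypothetical bound that keeps value sets finite


def join(a: State, b: State) -> State:
    """Merge two states: union the value sets, drop registers that grow too large."""
    out: State = {}
    for reg in a.keys() & b.keys():
        merged = a[reg] | b[reg]
        if len(merged) <= MAX_SET:
            out[reg] = merged
    return out


def reconstruct_cfg(program: Dict[int, tuple], entry: int = 0):
    """Worklist analysis over a toy program given as address -> instruction tuple."""
    states: Dict[int, State] = {entry: {}}
    edges: Set[Tuple[int, int]] = set()
    unresolved: Set[int] = set()  # addresses where the sketch gives up
    worklist = [entry]
    while worklist:
        pc = worklist.pop()
        if pc not in program:  # jump target outside the toy program
            unresolved.add(pc)
            continue
        insn = program[pc]  # "disassemble" on demand
        op = insn[0]
        out = dict(states[pc])
        if op == "mov":  # ("mov", reg, constant)
            out[insn[1]] = frozenset({insn[2]})
            succs = [pc + 1]
        elif op == "jmp":  # ("jmp", target)
            succs = [insn[1]]
        elif op == "br":  # ("br", target): taken or fall through
            succs = [insn[1], pc + 1]
        elif op == "jmp_reg":  # ("jmp_reg", reg): indirect jump
            targets = out.get(insn[1])
            if targets is None:
                unresolved.add(pc)  # register value unknown
                succs = []
            else:
                succs = sorted(targets)
        else:  # ("halt",)
            succs = []
        for s in succs:
            edges.add((pc, s))
            new = out if s not in states else join(states[s], out)
            if s not in states or new != states[s]:
                states[s] = new
                worklist.append(s)
    return edges, unresolved, set(states)


# Hypothetical example: both branches set r0 before an indirect jump at address 4.
PROGRAM = {
    0: ("br", 3),
    1: ("mov", "r0", 5),
    2: ("jmp", 4),
    3: ("mov", "r0", 6),
    4: ("jmp_reg", "r0"),
    5: ("halt",),
    6: ("halt",),
}
edges, unresolved, decoded = reconstruct_cfg(PROGRAM)
```

In this example the indirect jump at address 4 receives edges to both 5 and 6, which over-approximates the targets that any single execution can take. A sound analysis of real executables needs far richer domains and a treatment of calls and memory, which this sketch leaves out.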


Similar resources

An Implementation of Interprocedural Bounded Regular Section Analysis

Optimizing compilers should produce efficient code even in the presence of high-level language constructs. However, current programming support systems are significantly lacking in their ability to analyze procedure calls. This deficiency complicates parallel programming, because loops with calls can be a significant source of parallelism. We describe an implementation of regular section analysis, w...


Interprocedural Control Flow Reconstruction

In this paper we provide an interprocedural algorithm for reconstructing the control flow of assembly code in the presence of indirect jumps, call instructions and returns. When the underlying assembly code is the output of a compiler, indirect jumps primarily originate from high-level switch statements. For these, our methods succeed in resolving indirect jumps with high accuracy. We show ...


PIPS Is not (just) Polyhedral Software Adding GPU Code Generation in PIPS

Parallel and heterogeneous computing are growing in audience thanks to the increased performance brought by ubiquitous manycores and GPUs. However, available programming models, like OPENCL or CUDA, are far from being straightforward to use. As a consequence, several automated or semi-automated approaches have been proposed to automatically generate hardware-level codes from high-level sequenti...


A Compiler Infrastructure for High-Performance Java

This paper describes the zJava compiler infrastructure, a high-level framework for the analysis and transformation of Java programs. This framework provides a robust system, guaranteeing under transformations both the consistency of its internal structure and the syntactic correctness of the represented code. We address several challenges unique to Java, which have not been addressed by earlier...


The GRIN Project: A Highly Optimising Back End for Lazy Functional Languages

Low level optimisations from conventional compiler technology often give very poor results when applied to code from lazy functional languages, mainly because of the completely different structure of the code, unknown control flow, etc. A novel approach to compiling laziness is needed. We describe a complete back end for lazy functional languages, which uses various interprocedural optimisations t...




Publication date: 2011