Dagstuhl Seminar 14241: Challenges in Analysing Executables: Scalability, Self-Modifying Code and Synergy

( Jun 09 – Jun 13, 2014 )

Permalink

Please use the following short url to reference this page: https://www.dagstuhl.de/14241

Organizers

Roberto Giacobazzi (University of Verona, IT)
Axel Simon (TU München, DE)
Sarah Zennou (Airbus Group - Suresnes, FR)

Contact

Susanne Bach-Bernhard (for administrative matters)

Publications

Challenges in Analysing Executables: Scalability, Self-Modifying Code and Synergy (Dagstuhl Seminar 14241). Roberto Giacobazzi, Axel Simon, and Sarah Zennou. In Dagstuhl Reports, Volume 4, Issue 6, pp. 48-63, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2014)

Motivation

The seminar "Challenges in Analysing Executables: Scalability, Self-Modifying Code and Synergy" addresses the analysis of executable code and brings together people from a range of backgrounds, such as auditing, verification, transformation and malware detection. The analysis of executables is becoming increasingly popular, as it poses new challenges to the academic world and addresses a pressing need in industry. The seminar is motivated by the earlier Dagstuhl Seminar 12051 and addresses three major challenges: the scalability of analyses, the ability to handle self-modifying code, and the creation of synergy between different communities by combining their analyses into more powerful tools.

Scalability

Translating the byte sequences that represent the code of a program into instruction semantics poses scalability issues beyond those of analyzing high-level source code. A single line of source code translates to several assembler instructions. Each instruction, in turn, is translated to a semantic description. This description is usually expressed in a small intermediate language (IL), in which the effects of a single assembler instruction have to be expressed using several IL statements. Overall, a single line of high-level code may turn into tens of IL statements that have to be analyzed. Other, more subtle performance issues exist as well. Identifying and addressing these issues is the "Scalability" challenge.
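
The blow-up described above can be illustrated with a minimal sketch. It assumes a made-up toy IL in which each statement is a textual assignment; the names ILStmt, lift_mov and lift_add, the temporaries t0 and t1, and the three-instruction compilation of `x = a + b` are all hypothetical and do not correspond to the IL of any particular tool. The flag formulas follow the usual x86-style semantics of an addition, with the auxiliary and parity flags left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ILStmt:
    """One statement of a toy IL: target := expr (expr kept as text)."""
    target: str
    expr: str


def lift_mov(dst: str, src: str) -> list[ILStmt]:
    # A plain move needs a single IL statement and touches no flags.
    return [ILStmt(dst, src)]


def lift_add(dst: str, src: str, width: int = 32) -> list[ILStmt]:
    # An x86-style add also updates status flags, so one instruction
    # expands into several IL statements (AF and PF are omitted here).
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    return [
        ILStmt("t0", dst),                               # old destination value
        ILStmt("t1", f"(t0 + {src}) & {mask:#x}"),       # wrapped result
        ILStmt("CF", "t1 < t0"),                         # unsigned carry
        ILStmt("ZF", "t1 == 0"),                         # zero
        ILStmt("SF", f"(t1 & {sign:#x}) != 0"),          # sign
        ILStmt("OF", f"((t0 ^ t1) & ({src} ^ t1) & {sign:#x}) != 0"),
        ILStmt(dst, "t1"),                               # write back
    ]


# Hypothetical compilation of the source line `x = a + b`.
il = lift_mov("eax", "[a]") + lift_add("eax", "[b]") + lift_mov("[x]", "eax")
for stmt in il:
    print(f"{stmt.target} := {stmt.expr}")
```

Even in this simplified setting, one source line becomes three instructions and nine IL statements, most of which only maintain flags that the surrounding code may never read.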

Self-Modifying Code

In contrast to source code analysis, elementary program concepts such as the control flow graph, loops, local variables and the stack frames of functions are not available and therefore have to be recovered from the code. For instance, the reconstruction of the control flow graph (CFG) is non-trivial, as instructions may change during execution, and jumps and calls to computed addresses can only be resolved by estimating which values the computation may yield. The latter requires both a model of code that may change its shape and structure at run-time and a value analysis, which is commonly expressed as a fixpoint computation on the control flow graph. This chicken-and-egg problem has been addressed by several authors, but further challenges remain unresolved: many programs, especially malicious ones, themselves contain interpreters, JIT compilers and more generic forms of code generation that blur the concept of the code that is to be analyzed. We call this challenge "Self-Modifying Code".
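
The interplay between CFG reconstruction and value analysis can be sketched as a single worklist loop. This is a minimal illustration, not the method of any cited tool. It assumes a hypothetical four-opcode instruction set (mov, jmp, br, jmp_reg) plus halt, one-address-wide instructions, and a value domain in which each register maps to a finite set of constants. It does not model code that changes at run-time, and it is deliberately unsound for registers whose value is unknown.

```python
def join(old, new):
    """Merge two register states; a register missing from a state is unknown."""
    if old is None:
        return dict(new)
    return {reg: old[reg] | new[reg] for reg in old if reg in new}


def reconstruct_cfg(program, entry):
    """Discover CFG edges while running a set-of-constants value analysis."""
    states = {entry: {}}   # address -> {register: frozenset of constants}
    edges = set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        insn = program.get(addr)
        if insn is None:
            continue       # target lies outside the known code
        state = dict(states[addr])
        op = insn[0]
        if op == "mov":    # ("mov", reg, constant)
            state[insn[1]] = frozenset({insn[2]})
            succs = [addr + 1]
        elif op == "jmp":  # ("jmp", target)
            succs = [insn[1]]
        elif op == "br":   # ("br", target): taken or fall through
            succs = [insn[1], addr + 1]
        elif op == "jmp_reg":
            # Targets come from the value analysis. An unknown register
            # yields no successors here, which a real analysis must not do.
            succs = sorted(state.get(insn[1], frozenset()))
        else:              # ("halt",)
            succs = []
        for succ in succs:
            edges.add((addr, succ))
            merged = join(states.get(succ), state)
            if merged != states.get(succ):
                states[succ] = merged
                worklist.append(succ)  # state grew: re-analyze successor
    return edges


# Hypothetical program: the indirect jump at address 3 can reach 4 or 6.
program = {
    0: ("mov", "r0", 4),
    1: ("br", 3),
    2: ("mov", "r0", 6),
    3: ("jmp_reg", "r0"),
    4: ("halt",),
    5: ("halt",),          # never reached, so never added to the CFG
    6: ("halt",),
}
print(sorted(reconstruct_cfg(program, 0)))
```

The edge from address 3 to address 6 only appears after the state at address 3 has been revisited, which is the chicken-and-egg dependency in miniature: new values reveal new edges, and new edges contribute new values.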

Synergy

The previous Dagstuhl Seminar 12051 on the analysis of executable code brought together researchers and practitioners working on executable programs. Many participants were surprised by the diversity of tasks that can only or best be addressed at the binary level. These tasks included: the verification of worst-case execution time, proving the absence of run-time errors, reverse engineering of legacy software for code re-use, identifying which security issues are addressed by a software update by performing graph matching on the CFGs of the previous and the new version, summarizing sequences of basic blocks for better analysis speed and precision, and devising techniques to manage integer overflows. Indeed, the seminar brought together several research and industrial communities that face common problems. One vision of this new seminar is to create "Synergy" and collaboration between these communities by asking how a tool of one community can be used in the context of another community.

Summary

As a follow-up to the previous Dagstuhl Seminar 12051 on the analysis of binaries, the interest in attending this new seminar was very high. In the end, fewer than half of the people that we considered inviting could attend, namely 44 people. In contrast to the previous seminar, which ran for five days, this seminar ran for four days due to a bank holiday on the Monday. With the talks arranged by topic, these four days split into two days on the analysis of binaries and (nearly) two days on obfuscation techniques.

The challenges in the realm of general binary analysis have not changed considerably since the last gathering. However, new analysis ideas and new technologies (e.g. SMT solving) continuously advance the state of the art, and the presentations were a reflection thereon. With an even greater participation of people from industry, the participants could enjoy a broader view of the problems and opportunities that occur in practice. Given the tight focus on binary code (rather than e.g. Java byte code), a more detailed and informed discussion ensued. Indeed, the different groups seemed to focus less on promoting their own tools than on seeking collaboration and an exchange of experiences and approaches. In this light, the seminar met its ambition on synergy. It became clear that creating synergy by combining various tools is not something that can be achieved within a Dagstuhl Seminar itself. However, the collaborative mood and the interaction between various groups give hope that this will be a follow-on effect.

The second strand that crystallized during the seminar was the practical and theoretical interest in code obfuscation. Here, malware creators and analysts play an ongoing cat-and-mouse game. A theoretical understanding of why the analysts cannot win this game outright helps the search for analyses that are effective on present-day obfuscations. In practice, a full understanding of some obfuscated code may be unobtainable, but a classification is still possible and useful. The variety of possible obfuscations creates many orthogonal directions of research. Indeed, it was suggested to hold a Dagstuhl Seminar on the sole topic of obfuscation.

One tangible outcome of the previous Dagstuhl Seminar is our GDSL toolkit, which was presented by Julian Kranz. We believe that other collaborations will ensue from this Dagstuhl Seminar, as the feedback was again very positive and many long discussions were held in the beautiful surroundings of the Dagstuhl grounds. The following abstracts therefore do not reflect the community feeling that this seminar created. Please note that not all people who presented have submitted their abstracts, due to the sensitive nature of the content and/or the organization that the participants work for.

Creative Commons BY 3.0 Unported license

Roberto Giacobazzi, Axel Simon, and Sarah Zennou

Participants

Davide Balzarotti (EURECOM - Biot, FR) [dblp]
Sébastien Bardin (CEA LIST, FR) [dblp]
Frédéric Besson (IRISA - Rennes, FR) [dblp]
Sandrine Blazy (IRISA - Rennes, FR) [dblp]
Juan Caballero (IMDEA Software - Madrid, ES) [dblp]
Lorenzo Cavallaro (RHUL - London, GB) [dblp]
Aziem Chawdhary (University of Kent, GB) [dblp]
Cory Cohen (Software Engineering Institute - Pittsburgh, US) [dblp]
Mila Dalla Preda (University of Verona, IT) [dblp]
Bjorn De Sutter (Ghent University, BE) [dblp]
Saumya K. Debray (University of Arizona - Tucson, US) [dblp]
David Delmas (Airbus S.A.S. - Toulouse, FR) [dblp]
Thomas Dullien (Google Switzerland, CH) [dblp]
Emmanuel Fleury (University of Bordeaux, FR) [dblp]
Anthony Fox (University of Cambridge, GB) [dblp]
Roberto Giacobazzi (University of Verona, IT) [dblp]
Kathryn E. Gray (University of Cambridge, GB) [dblp]
Paul Irofti (Bucharest, RO) [dblp]
Yan Ivnitskiy (Trail of Bits Inc. - New York, US) [dblp]
Andy M. King (University of Kent, GB) [dblp]
Tim Kornau-von Bock und Polach (Google Switzerland, CH) [dblp]
Julian Kranz (TU München, DE) [dblp]
Colas Le Guernic (Direction Generale de l'Armement, FR) [dblp]
Junghee Lim (GrammaTech Inc. - Ithaca, US) [dblp]
Alexey Loginov (GrammaTech Inc. - Ithaca, US) [dblp]
Federico Maggi (Polytechnic University of Milan, IT) [dblp]
Jean-Yves Marion (LORIA - Nancy, FR) [dblp]
Florian Martin (AbsInt - Saarbrücken, DE) [dblp]
Isabella Mastroeni (University of Verona, IT) [dblp]
Bogdan Mihaila (TU München, DE) [dblp]
Magnus Myreen (University of Cambridge, GB) [dblp]
Gerald Point (University of Bordeaux, FR) [dblp]
Edward Robbins (University of Kent, GB) [dblp]
Bastian Schlich (ABB AG Forschungszentrum Deutschland - Ladenburg, DE) [dblp]
Alexander Sepp (TU München, DE) [dblp]
Axel Simon (TU München, DE) [dblp]
Aditya Thakur (University of Wisconsin - Madison, US) [dblp]
Axel Tillequin (Airbus Group - Suresnes, FR)
Franck Védrine (CEA - Gif sur Yvette, FR) [dblp]
Aymeric Vincent (University of Bordeaux, FR) [dblp]
Xueguang Wu (TU München, DE) [dblp]
Brecht Wyseur (NAGRA Kudelski Group SA - Cheseaux, CH) [dblp]
Stefano Zanero (Polytechnic University of Milan, IT) [dblp]
Sarah Zennou (Airbus Group - Suresnes, FR) [dblp]

Related Seminars

Dagstuhl Seminar 12051: Analysis of Executables: Benefits and Challenges (2012-01-29 - 2012-02-03)
Dagstuhl Seminar 17281: Malware Analysis: From Large-Scale Data Triage to Targeted Attack Recognition (2017-07-09 - 2017-07-14)

Classification

programming languages / compiler
semantics / formal methods
verification / logic

Keywords

executable analysis
reverse engineering
self-modifying code
malware analysis
