27C3 - Version 1.6.3

27th Chaos Communication Congress
We come in peace

Day Day 4 - 2010-12-30
Room Saal 2
Start time 13:45
Duration 00:30
ID 3983
Event type Lecture
Track Hacking
Language used for presentation English

Hackers and Computer Science

What hacker research taught me

Although most academics and industry practitioners regard "hacking" as mostly ad hoc, a loose collection of useful tricks essentially random in nature, I will argue that hacking has in fact become a "distinct research and engineering discipline" with deep underlying engineering ideas and insights. Although not yet formally defined as such, it is these ideas and insights that drive the great contributions hacking has been making to our understanding of computing, including the challenges of handling complexity, composition, and security in complex systems. I will argue that hacking uncovers and helps us understand (and teach) fundamental issues that go to the heart of Computer Science as we know it, and I will try to formulate several such fundamental principles that I have learned from hacker research.

At some point I realized that I was learning more about what really matters in computer science from hacker conventions, Phrack, Uninformed, and other hacker sources than from any academic source. Moreover, it wasn't just about exploits and vulnerabilities; it was about how systems were really designed, as opposed to how developers thought and students were taught they were. Then I realized that the reasons for the vulnerabilities that kept on giving were quite deeply theoretical, involving, e.g., the theory of computation and information theory. Very little of this was cited or understood in academic publications.

In this talk I will give a retrospective of hacker research that I believe has priority in developing or rediscovering important scientific ideas that define the discipline of information security. I will also argue that "hacking" has de-facto become a distinct engineering discipline with its own special methods and principles.

  1. Hackers invented "cross-layer analysis" methodology.

Hackers deal with trust properties of software, hardware, and human-computer systems. Other disciplines claim to study these properties, but the hacker approach is radically different: while others focus on layered designs (like the OSI networking model, or the application / library / system call / kernel / driver stack), hacking is essentially cross-layer.

In a word, hackers essentially invented security analysis that focuses on trust assumptions in interactions between layers in multi-layer composed systems. It is not formalized to academic standards, but it's there and it delivers.

In retrospect, the emergence of an engineering discipline that analyzed the trust effects of the prevailing practical method of developing complex computer systems - composing them out of mostly independently developed and tested components - was to be expected. Such analysis was sorely needed and would arise even if these issues were continually ignored by the recognized experts and authorities. As it stands, hacker research arose to fill this gap.

Whereas for a regular developer the layers below or above are implicitly trusted to behave as specified (or at least as described in tutorials), hackers focus on how the layers actually interact, and how data and control actually flow through the layers.

That is why in-depth rootkits (like Phrack 59:5, palmers@team-teso, and similar) are the best possible reading for understanding OS structure; I assign them in all my OS-related courses. That is why session hijacking through packet injection and other "Black Ops" à la Ptacek, Newsham, and Kaminsky are best for understanding how TCP/IP stacks work: they show exactly what data structures kernels keep to present the illusion of a session over stateless IP, and how those structures can be controlled.
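As a toy illustration of that last point, here is a minimal sketch of the per-connection state a kernel keeps to fake a "session" over stateless IP, and why controlling that state is enough to inject data. All names and numbers are hypothetical; this is not any real stack's internals.

```python
class ToyTcpEndpoint:
    """Minimal established-state TCB: next expected sequence number
    plus a receive window. A real TCB holds much more, but this is
    the part a session hijacker must control."""

    def __init__(self, rcv_nxt, rcv_wnd=4096):
        self.rcv_nxt = rcv_nxt      # next in-order byte we expect
        self.rcv_wnd = rcv_wnd      # how far ahead we will accept
        self.stream = b""

    def deliver(self, seq, payload):
        """Accept a segment purely on its sequence number: the kernel
        has no other way to tell the 'real' peer from an injector.
        (Out-of-order in-window segments are dropped for simplicity.)"""
        if self.rcv_nxt <= seq < self.rcv_nxt + self.rcv_wnd:
            if seq == self.rcv_nxt:            # in-order: consume it
                self.stream += payload
                self.rcv_nxt += len(payload)
                return True
        return False

# The legitimate peer and an attacker are indistinguishable:
# whoever supplies the right sequence number wins.
ep = ToyTcpEndpoint(rcv_nxt=1000)
ep.deliver(1000, b"GET / ")          # real peer
ep.deliver(1006, b"evil-injected")   # attacker who guessed seq=1006
print(ep.stream)                     # → b'GET / evil-injected'
```

The "session" exists only in this bookkeeping; once the attacker knows (or can guess) `rcv_nxt`, the illusion works for them too.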

  2. "Security is not Composable"

Many of the best hacks are based on composition: relatively secure parts are combined, and the combination exhibits new properties that make the composed system insecure. To a large extent, hacking as a research discipline is the study of this phenomenon (which I will illustrate with examples from networks and systems).

From the theory standpoint, for a system that is a composition of several sufficiently complex parts, there is no general algorithm or formal method to deduce properties of the composed system even if the properties of the parts are known. (This follows from the Halting Problem or, equivalently, from Rice's Theorem.)

I will give examples of insecure system designs based on simple assumptions about compositional properties that cannot actually be verified in any algorithmic way -- and, as a result, these systems remain badly broken despite efforts to fix them. They will remain untrustworthy and keep yielding 0-days, because their security rests on an undecidable problem that no amount of programming can solve.

A prime example - taken to the next level of rigor in a major breakthrough by Len Sassaman and Meredith Patterson announced this year - is the case of X.509 certificates. The system depends on the computations involved in parsing the certificate signing request (by the CA) and the signed certificate (by the browser) being equivalent, i.e., yielding the same result. However, the design of the X.509 protocol makes it computationally impossible to check this equivalence - and so the certificate system will keep on giving. Most likely, there is no "good enough" solution unless a well-defined intermediate class of data structures is adopted.
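A deliberately simplified sketch of such a parser differential: two components parse "the same" name field under slightly different rules, and an input is crafted so they disagree. The field format here is invented for illustration; it is not real ASN.1/X.509 encoding.

```python
def ca_parse_common_name(field: bytes) -> str:
    """The 'CA': treats the field as a length-prefixed string,
    where the first byte gives the length."""
    n = field[0]
    return field[1:1 + n].decode("ascii")

def browser_parse_common_name(field: bytes) -> str:
    """The 'browser': treats the same bytes as NUL-terminated."""
    return field[1:field.index(b"\x00", 1)].decode("ascii")

# Crafted field: the declared length covers the whole string, but a
# NUL is embedded where the attacker wants the browser to stop.
field = bytes([28]) + b"paypal.com\x00.attacker.example"

print(ca_parse_common_name(field))       # what the CA sees and signs
print(browser_parse_common_name(field))  # → 'paypal.com'
```

The CA signs a name containing the attacker's domain; the browser sees only "paypal.com". The two parses are inequivalent, and nothing in the format lets either side detect that in general.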

Another example of the same phenomenon is NIDS traffic re-assembly. The efficacy of a NIDS depends on the re-assembly computation giving the same result on the NIDS and on the target - and this, in general, is also computationally impossible (undecidable), with no "good enough" solution either.
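The classic ambiguity can be shown with a toy reassembler whose overlap policy is a parameter (a simplified sketch; real stacks differ in more ways than this):

```python
def reassemble(segments, favor_old: bool) -> bytes:
    """segments: list of (offset, payload) pairs. favor_old=True keeps
    the first-arrived data on overlap; favor_old=False lets later
    data overwrite it. Real TCP stacks disagree on exactly this."""
    buf = {}
    for off, data in segments:
        for i, b in enumerate(data):
            pos = off + i
            if favor_old and pos in buf:
                continue            # keep the byte we already have
            buf[pos] = b            # otherwise overwrite
    return bytes(buf[i] for i in sorted(buf))

# Two overlapping segments; which request did the target see?
segments = [(0, b"GET /inde"), (5, b"xxxxx.html")]
print(reassemble(segments, favor_old=True))   # → b'GET /index.html'
print(reassemble(segments, favor_old=False))  # → b'GET /xxxxx.html'
```

A NIDS that resolves the overlap one way while the target resolves it the other way inspects a byte stream the target never sees - which is precisely the evasion Ptacek and Newsham described.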

  3. Hackers re-define what "computation" means.

The theory of computation concerns itself with proving what computing environments can and cannot do - e.g., that regular expressions cannot parse recursive structures (not that this stops people from trying: the last high-profile example is IE's regexp-based anti-XSS rewriting feature, which gave rise to a whole new class of exploits precisely because of this inability).
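A minimal sketch of why: a regex has no counter, so any fixed pattern that "handles nesting" silently caps the depth, while a trivial counting check gets every depth right. The pattern below is an invented example of such a depth-capped regex, not IE's actual filter.

```python
import re

# A regex that handles up to two levels of nested parentheses - a
# typical "good enough" pattern that quietly caps the depth.
two_levels = re.compile(r"^\((?:[^()]|\([^()]*\))*\)$")

def balanced(s: str) -> bool:
    """The real answer, via counting - i.e., the pushdown capability
    that regular expressions lack."""
    depth = 0
    for c in s:
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

print(bool(two_levels.match("(())")))    # → True  (depth 2: fine)
print(bool(two_levels.match("((()))")))  # → False (depth 3: missed)
print(balanced("((()))"))                # → True
```

Whatever depth the pattern is extended to handle, depth+1 defeats it; the recursive structure is simply outside the regular languages.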

Designers and developers base security of their systems on similar trust assumptions of what the systems can and cannot do. The problem is, they do not understand what computations their systems are actually capable of.

From the early days of Aleph1's stack smashing, hacking has a great history of exposing extra computational power in systems and of demonstrating "weird" computations that were thought to be impossible. Hackers have been exposing unexpected "weird machines" actually contained in traditional computing environments - machines that support unexpected programming models, with exploits as their "programs".

Exploit development as a discipline is about recognizing a "weird machine" inherent in the target (which is usually Turing-complete or close) and writing code for it (in its "weird" instruction set, which may include calls to library functions, system calls, or just reachable pieces of code anywhere from the application to the firmware).

It's all about programming "platforms within platforms within platforms".
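The exploit-as-program idea can be sketched as a toy "weird machine": the attacker supplies no native code, only a list of addresses (a fake stack); each address points at a borrowed "gadget" that does one tiny operation and then "returns", i.e., pops the next address. All gadget names and addresses here are invented for illustration.

```python
# Borrowed "gadgets": code that already exists in the target.
def gadget_push5(st): st.append(5)
def gadget_push7(st): st.append(7)
def gadget_add(st):   st.append(st.pop() + st.pop())
def gadget_mul(st):   st.append(st.pop() * st.pop())

MEMORY = {                      # "addresses" of reachable code
    0x1000: gadget_push5,
    0x1004: gadget_push7,
    0x1008: gadget_add,
    0x100c: gadget_mul,
}

def run(fake_stack):
    """The 'exploit' is pure data: a sequence of return addresses.
    Each simulated 'ret' fetches and runs the next gadget."""
    data = []
    for ret_addr in fake_stack:
        MEMORY[ret_addr](data)
    return data[-1]

# (5 + 7) * 5 computed with zero attacker-supplied instructions:
payload = [0x1000, 0x1004, 0x1008, 0x1000, 0x100c]
print(run(payload))   # → 60
```

The payload contains no executable bytes at all, only addresses - which is exactly why "malicious computation", not "malicious code", is the thing to look for.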

A prime example is the story of the "return-to-known-code" idea, from the original "return-to-libc" exploits first published by Solar Designer in 1997, through the detailed technique explanations in Phrack in 2000-2001 (58:4, 59:5), to the recent fully featured compiler for kernel rootkits by Hund, Holz, and Freiling. The idea of such compilers, however, had been circulating in the hacker community since at least 2001, as I will show.

It took about ten years for academia to recognize the power of this idea, which got named "Return-oriented Programming" or "ROP" in 2008.

The idea that an exploit - that is, an actual program in its own right, executing on a "weird machine" - could contain *no* native executable code, and thus instead of looking for "malicious code" one had to watch out for "malicious computation" (this term itself coined in 2008) - would have been impossible without hacker research.

In short, hacker research had re-defined the very idea of computation.

  4. DoD "Orange Book" ideas re-born in hacker OS hardening patches.

I will show how the classic 1970s ideas about building secure systems, such as those of the DoD "Rainbow Series" and tagged architectures, while long ignored by commodity computing vendors like Intel, have been re-born in hacker hardening patches such as the OpenWall, PaX, and grsecurity patches, and in other creative uses of the x86 segmentation system, extra page table entry bits, and the split TLB.

I will give a historical perspective of these advances, from OpenWall to ShadowWalker and beyond.

I will argue that it was hacker research and hacker proof-of-concepts that finally caused the industry to recognize the value of, and implement, hardware NX protection, and to introduce NX-based features like DEP into mainstream OSes (I am indebted to FX for major parts of this argument).

  5. The hacker development of the debugger into a Turing-complete environment.

It is a fact that hackers (and, in particular, the vuln-dev and RE communities) have been the leading producers of ever more powerful debuggers, and have in fact changed the very idea of the conventional debugger by making it into a Turing-complete environment.

This research involved a deeper understanding of how to use hardware trapping - including how to trap complex events such as "an instruction is fetched from a page that a user process has recently written to" (OllyBone).
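The trap condition itself reduces to simple bookkeeping, which can be sketched as a toy model (the real OllyBone implements this with the x86 split instruction/data TLB; here it is just per-page write tracking, with invented names):

```python
class PageMonitor:
    """Toy model of 'break on execute-from-recently-written-page',
    the classic policy for catching unpackers at the moment they
    jump into freshly decoded code."""

    PAGE_SIZE = 0x1000

    def __init__(self):
        self.written = set()          # pages dirtied by the process

    def on_write(self, addr):
        self.written.add(addr // self.PAGE_SIZE)

    def on_fetch(self, addr):
        """True when the trap fires: an instruction is being fetched
        from a page the process has written to."""
        return addr // self.PAGE_SIZE in self.written

mon = PageMonitor()
mon.on_write(0x401234)         # unpacker writes decoded code...
print(mon.on_fetch(0x401500))  # → True: jump into that page traps
print(mon.on_fetch(0x700000))  # → False: ordinary code page
```

The interesting part is that no single hardware event expresses this condition; composing it out of write tracking and fetch trapping is exactly the kind of complex-event trapping described above.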

It opened new directions for security policy mechanisms, reference monitor design, etc., in both academia and industry (of which I will give examples).

  6. Trust relationships as first-class networking objects.

I will describe how hacker research into network deceptions and trust relationship mapping in networks created the methodology and the industry of network security assessment.