Automatic post-fuzzing triage and automation using REVEN


Mar 08, 2022
by Louis, Quentin and Mickaël
Categories: Technical
Tags: Reverse Engineering - REVEN - Fuzzing - AFL - Workflow API




As those of you lucky enough to attend the great OffensiveCon 2022 might be aware, we are developing a Fuzzing & Triage platform based on REVEN Enterprise Edition.

The platform aims to automate root cause analysis by deriving advanced information about the crashes found by a fuzzer: the origin of the crash (which bytes of the input file directly cause it), the underlying vulnerability (buffer overflow, use-after-free…), and coverage information for the crashing execution.

The platform comes with a visualizer to display a report for each crash, with the ability to open the corresponding scenario for analysis with Axion (REVEN’s GUI).

The visualizer

How it works

The way the platform works can be summarized with the following diagram:

Fuzzing platform step diagram

The platform can watch any directory for crash files. For each crash file appearing in the watched directory, the executor component of the platform will launch several steps:

  1. Record a fresh REVEN scenario from the test harness + the input file causing the crash. Recording uses the workflow API available in REVEN Enterprise Edition to start automatically at the beginning of the binary’s execution and to stop when it completes (or crashes, as is generally the case with inputs coming from a fuzzer).
  2. Replay the recorded scenario, still using REVEN’s workflow Python API.
  3. Analyze the replayed scenario, using REVEN’s analysis Python API:
    1. Find the crash point by using the call search to look for ntdll!KiUserExceptionDispatcher (see the sketch after this list). We also extract interesting information about the crash, such as its kind (division by zero, page fault, etc.).
    2. Find the origin of the data causing the crash, using semantic tainting in the backward direction. What is semantic tainting? It is a layer built on top of REVEN’s taint that recovers higher-level information about the data flow. More about semantic tainting in the next section.
    3. Further reduce the number of “unique crashes” compared to what AFL initially reports, by deriving a hash from the backtrace at the crash point and from the backtraces of the origin points (a minimal sketch of this deduplication follows below).
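
To make step 3.1 concrete, here is a minimal sketch of what the crash point search can look like with the reven2 analysis Python API. Take it as an illustration rather than the platform’s actual code: the API names are written from memory and should be checked against your REVEN version’s documentation, and the port number is a placeholder for the replayed scenario’s server.

    # Minimal sketch of the crash point search (step 3.1), assuming the reven2
    # analysis API. The port is a placeholder for the replayed scenario's server.
    import reven2

    server = reven2.RevenServer("localhost", 13370)
    trace = server.trace

    # Resolve the user-mode exception dispatcher in ntdll via the OSSI layer.
    ntdll = next(server.ossi.executed_binaries("ntdll"))
    dispatcher = next(ntdll.symbols("KiUserExceptionDispatcher"))

    # The first context entering the dispatcher marks the crash: the faulting
    # transition happened shortly before, when the CPU raised the exception.
    crash_context = next(trace.search.symbol(dispatcher))
    print(f"Exception dispatched at {crash_context}")

From that context, the analysis can walk back to the faulting transition itself and classify the exception (division by zero, page fault, …), which is what ends up in the “Type” and “Transition” fields of the reports below.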

The current status and the final report for each crash file can be monitored live in the visualizer view, a web page locally served by the platform.
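
Before moving on, here is a minimal sketch of the deduplication idea from step 3.3, in plain Python. It only illustrates the principle: the frame representation and the exact hashing scheme used by the platform may differ.

    # Sketch of backtrace-based crash deduplication (step 3.3). The frame tuples
    # are a simplified representation: in practice they are extracted from the
    # REVEN scenario at the crash point and at each origin point.
    import hashlib

    def backtrace_hash(frames):
        """frames: iterable of (module, symbol) pairs, innermost frame first."""
        normalized = "\n".join(f"{module}!{symbol}" for module, symbol in frames)
        return hashlib.sha256(normalized.encode()).hexdigest()

    def crash_uid(crash_frames, origin_backtraces):
        """Combine the crash backtrace and the origin backtraces into one UID."""
        digest = hashlib.sha256()
        digest.update(backtrace_hash(crash_frames).encode())
        for frames in origin_backtraces:
            digest.update(backtrace_hash(frames).encode())
        return digest.hexdigest()

Two crash files that AFL counts as distinct but that end up with the same UID are triaged as duplicates of the same bug.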


Semantic tainting

The taint engine of REVEN is pretty unique in that it can follow the dataflow in the backward direction, essentially answering the question: “Where is this data coming from?”. A detailed introduction to REVEN’s taint can be found in this previous article.

However, the core taint engine focuses on low-level information: which registers and memory areas are tainted, and which instructions the data flow goes through.

To provide a more meaningful origin for crashes, such as the bytes in the input file responsible for the crash, or the allocated buffer causing the crash, we need to augment this low-level information with semantic information.

To achieve this goal, we look for specific, known functions in the backtrace while iterating over the tainted instructions. When a known function is hit, we check whether its return value or arguments are currently tainted. If they are, we can ascribe a meaning to that event, depending on which function it is and on what exactly is tainted.
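
As an illustration of that matching step, here is a heavily simplified sketch that already encodes the two examples given just below. The TaintedCall structure and its fields are made up for the example; in the platform, this information comes from walking REVEN’s backward taint results together with the OSSI (symbol) layer.

    # Heavily simplified sketch of the semantic layer's matching step.
    # TaintedCall is a made-up structure standing in for what the platform
    # derives from REVEN's backward taint and OSSI (symbol) information.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class TaintedCall:
        symbol: str                       # e.g. "ucrtbase!malloc"
        return_value_tainted: bool = False
        tainted_arguments: List[int] = field(default_factory=list)

    def classify_origin(call: TaintedCall) -> Optional[str]:
        """Ascribe a semantic meaning to a known function hit by the taint."""
        if call.symbol.endswith("!malloc") and call.return_value_tainted:
            return "allocation"   # data originates in an allocated buffer
        if call.symbol.endswith("!NtReadFile") and 5 in call.tainted_arguments:
            return "file"         # Buffer is NtReadFile's 6th argument (index 5)
        return None               # not a known origin: keep following the taint

Whenever classify_origin recognizes something, the platform records an origin event (an allocation, a file read, …), which is what fills the “Faulted data origin(s)” field of the reports below.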

Here are a few examples:

  • The return value of malloc is tainted. This means that the tainted data originates in an allocated buffer, whose size is given by malloc’s parameter (see the sketch after this list).
  • Part of the Buffer argument to NtReadFile is tainted. This means that the tainted data originates in the file referenced by the FileHandle argument. The precise offsets in the file can be recovered with some black magic: reading the kernel structures that describe the handle.
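
For the malloc case, extracting the size and the returned pointer boils down to the Windows x64 calling convention: the first integer argument is in rcx and the return value in rax. A minimal sketch, assuming the reven2 API and two contexts (call_context at malloc’s entry, return_context right after it returns) that the analysis would provide:

    # Sketch of producing the allocation description shown in the reports below
    # ("malloc(116) = 0x..."), assuming the reven2 API and the Windows x64
    # calling convention. call_context and return_context are assumed to be
    # the contexts at malloc's entry and right after its return.
    import reven2

    def describe_malloc(call_context, return_context):
        size = call_context.read(reven2.arch.x64.rcx)        # 1st integer argument
        pointer = return_context.read(reven2.arch.x64.rax)   # return value
        return f"malloc({size}) = {pointer:#x}"

This is exactly the kind of string that appears as an allocation origin in the toy library report below.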

We applied semantic tainting to a toy library with known vulnerabilities that we used as a fuzzing target, as well as to several crash files for vulnerable versions of clang-format.

Experimenting with semantic tainting on a toy library

Here are some results from these tests:

  • On the toy library, we found a divide by zero error:

    UID
      d60b2c50563601432424993b1a139546130b209825dc2e27ddc46a09ce975ab6
    
    Type
      DIVIDE_ERROR
    
    Desc
      Division by '0' ('rax')
    
    Transition
      #2094654 divide error while executing div rax
    
    Faulted data origin(s)
      [File at #53733] 'C:\reven\id_00000,sig_08,src_000000,time_0,execs_29,op_havoc,rep_4' at offset(s): [20:20]
    

    Semantic tainting traces the divide error back to the byte at offset 20 of the input file, which is the source of the crash.

  • Another interesting case is this data pagefault:

    UID
      44f18979b27dcf4ecf54234da716026aafb7d2ddcf58926653284e9c3d0af81e
    
    Type
      DATA_PAGEFAULT
    
    Desc
      Pagefault when accessing 'ds:0x16922987000' ('[rax]')
    
    Transition
      #4154895 page fault while executing movzx edx, byte ptr ds:[rax]
    
    Faulted data origin(s)
      1. [File at #3321421] 'C:\reven\id_000006,sig_08,src_000000,time_31,execs_75,op_havoc,rep 8' at offset(s): [0:7]
      2. [Allocation at #3320437] 'malloc(116) = 0x16922981490'
    

    We can see that the value comes both from bytes 0 to 7 of the input file, and from an allocation of size 116 that returned the pointer 0x16922981490. The pagefault was caused by accessing 0x16922987000, which is out of the bounds of the allocated object.
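
    A quick arithmetic check with the addresses from the report confirms how far outside the allocation the faulting access lands:

        # Distance between the faulting access and the allocation base
        # (both values taken from the report above).
        base = 0x16922981490    # pointer returned by malloc(116)
        fault = 0x16922987000   # faulting address
        print(hex(fault - base), fault - base)  # 0x5b70 -> 23408 bytes past the
                                                # base of a 116-byte allocation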

Applying semantic tainting to clang-format crashes from OSS-Fuzz

  • Our first clang-format crash was the following:
    UID
      None
    
    Type
      DATA_PAGEFAULT
    
    Desc
      DATA PAGEFAULT at #284863280
    
    Transition
      #284863280 page fault while executing call 0x7ffac63ad4f8
    
    Faulted data origin(s)
      Unknown
    

    We can see that the crash occurs because of a data page fault raised while executing a call instruction: pushing the return address faulted, which points towards a stack overflow.

  • The second clang-format crash was from OSS-Fuzz too, and generated the following report:

    UID
      5bf7818a528b310fd9066d22ef1cc1e538b5d43599131a2506d84ece60cdf10f
    
    Type
      DATA_PAGEFAULT
    
    Desc
      Pagefault when accessing 'ds:0x38' ('[rcx+0x38]')
    
    Transition
      #12390412 page fault while executing mov eax, dword ptr [rcx+0x38]
    
    Faulted data origin(s)
      [Code at #12369993] 'mov qword ptr [rax+0xc0], 0x0'
    

    This time the analysis traces the data page fault back to an instruction where the immediate 0 is written to memory. This means that the page fault is caused by dereferencing a (quasi-)null pointer: rcx is 0, so the access lands at address 0x38. The fact that the written value is an immediate tells us that the user cannot control it.

Conclusion

The Fuzzing & Triage platform currently in development at Tetrane brings both integration with fuzzing toolchains and advanced, automated crash root cause analysis, thanks to a semantic layer on top of REVEN’s taint engine. We have tested it both on a purpose-built toy library and on recent, real crashes detected by OSS-Fuzz on clang-format.

The first results look promising, and this semantic layer is an exciting development that we will keep building on in the future!

Coming Soon: Join the beta!

Our current goal is to release a first iteration of REVEN’s Fuzzing & Triage platform in the next version of REVEN Enterprise Edition: hardening the platform by testing it on more targets, making it more flexible (for example, letting users plug in their own analyses and generate customized reports), displaying more information… A beta program is open to current Enterprise customers. Get in touch for early access.

Inspired by what you saw? Feel free to ask any question or give feedback on our Community Github, our Discord channel or Twitter!
