Automatic post-fuzzing triage and automation using REVEN
As those of you lucky enough to attend the great OffensiveCon 2022 might be aware, we are developing a Fuzzing & Triage platform based on REVEN Enterprise Edition.
The platform aims at automating root cause analysis by deriving advanced information about crashes found by a fuzzer, such as the origin of the crash (which bytes from the input files are directly causing the crash), the underlying vulnerability (buffer overflow, use-after-free…) and the coverage information for the crash.
The platform comes with a visualizer to display a report for each crash, with the ability to open the corresponding scenario for analysis with Axion (REVEN’s GUI).
How it works
The way the platform works can be summarized with the following diagram:
The platform can watch any directory for crash files. For each crash file appearing in the watched directory, the executor component of the platform will launch several steps:
- Record a fresh REVEN scenario from the test harness and the input file causing the crash. The recording uses the workflow API that’s available in the Enterprise Edition of REVEN to automatically start recording at the beginning of the binary’s execution, and stop when it completes (or crashes, as is generally the case when using inputs from a fuzzer).
- Replay the recorded scenario, still using REVEN’s workflow Python API.
- Analyze the replayed scenario, using REVEN’s analysis Python API:
  - Find the crash point by searching for ntdll!KiUserExceptionDispatch with the call search. We also extract interesting information about the crash, such as the kind of crash (division by zero, page fault, etc.).
  - Find the origin of the data causing the crash, using semantic tainting in the backward direction. What is semantic tainting? It is a layer built on top of REVEN’s taint that makes it possible to find higher-level information about the data flow. More about semantic tainting in the next section.
  - Further reduce the number of “unique crashes” from the initial number reported by AFL, by deriving a hash from the backtrace at the crash point and the backtraces of the origin points.
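The deduplication step can be illustrated with a minimal sketch. It assumes that backtraces are available as lists of symbol strings. The names crash_uid and deduplicate, as well as the choice of SHA-256, are illustrative assumptions and not a description of the platform's actual scheme.

```python
import hashlib
from typing import Dict, Iterable, Sequence, Tuple


def crash_uid(crash_backtrace: Sequence[str],
              origin_backtraces: Iterable[Sequence[str]]) -> str:
    """Derive a stable identifier from the crash and origin backtraces."""
    h = hashlib.sha256()
    h.update("\n".join(crash_backtrace).encode("utf-8"))
    # Sort the origin backtraces so that their discovery order does not matter.
    for bt in sorted(tuple(bt) for bt in origin_backtraces):
        h.update(b"\x00")  # separator between backtraces
        h.update("\n".join(bt).encode("utf-8"))
    return h.hexdigest()


def deduplicate(crashes: Dict[str, Tuple[Sequence[str], Iterable[Sequence[str]]]]
                ) -> Dict[str, str]:
    """Map each UID to the first crash file that produced it.

    `crashes` maps a crash file name to (crash backtrace, origin backtraces).
    """
    unique: Dict[str, str] = {}
    for name, (crash_bt, origin_bts) in crashes.items():
        unique.setdefault(crash_uid(crash_bt, origin_bts), name)
    return unique
```

Two crash files that fault at the same place and whose data comes from the same origins collapse to one entry, even if the fuzzer counted them as distinct.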
The current status and the final report for each crash file can be monitored live in the visualizer view, a web page locally served by the platform.
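The watch-and-triage loop described in this section can be sketched as follows. This is a simplified illustration using only the Python standard library. The steps passed to poll_once are hypothetical placeholders for the record, replay and analyze stages that the platform runs through REVEN's APIs, and polling the directory is an assumption, not a description of the actual executor.

```python
from pathlib import Path
from typing import Callable, Dict, Sequence, Set, Tuple

# A step is a (name, callable) pair; the callable receives the crash file path.
Step = Tuple[str, Callable[[Path], None]]


def poll_once(watched: Path, seen: Set[Path], steps: Sequence[Step],
              status: Dict[str, str]) -> None:
    """Run every step, in order, on crash files that appeared since the last poll."""
    for crash_file in sorted(watched.iterdir()):
        if crash_file in seen or not crash_file.is_file():
            continue
        seen.add(crash_file)
        name = "start"
        try:
            for name, step in steps:
                status[crash_file.name] = name  # what a status page could display
                step(crash_file)
            status[crash_file.name] = "done"
        except Exception as exc:  # keep triaging the other crash files
            status[crash_file.name] = f"failed during {name}: {exc}"
```

Calling poll_once periodically with steps named "record", "replay" and "analyze" gives a per-file status dictionary that a front end could render while the analysis is in progress.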
The taint engine of REVEN is pretty unique in that it can follow the dataflow in the backward direction, essentially answering the question: “Where is this data coming from?”. A detailed introduction to REVEN’s taint can be found in this previous article.
However, the core taint engine focuses on giving the low-level information of which registers/memory areas are tainted, and which instructions the dataflow goes through.
To provide a more meaningful origin for crashes, such as the bytes in the input file responsible for the crash, or the allocated buffer causing the crash, we need to augment this low-level information with semantic information.
To achieve this goal, we look for specific, known functions in the backtrace while iterating over the tainted instructions. When a known function is hit, we check whether its return value or arguments are currently tainted. If so, we can ascribe a meaning to that event, depending on what is tainted and on the function.
Here are a few examples:
- The return value of malloc is tainted. This means that the tainted data originates with an allocated buffer, whose size is given by the argument passed to malloc.
- Part of the buffer filled by NtReadFile is tainted. This means that the tainted data originates with the file pointed to by the FileHandle argument. The precise offsets in the file can be found with some black magic involving reading the kernel structures to retrieve information about the handle.
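To make the idea concrete, here is a minimal sketch of such a rule table. It does not use REVEN's API: TaintedCall, describe_origin and their fields are hypothetical stand-ins for whatever the analysis extracts when a known function appears in the backtrace of a tainted instruction. The output strings only loosely imitate the reports shown later in this article.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class TaintedCall:
    """A known function found in the backtrace, with what is currently tainted."""
    symbol: str                      # e.g. "malloc" or "NtReadFile"
    transition: int                  # where in the trace the call happens
    return_tainted: bool = False
    tainted_args: Set[str] = field(default_factory=set)
    args: Dict[str, object] = field(default_factory=dict)


def describe_origin(call: TaintedCall) -> Optional[str]:
    """Ascribe a meaning to a tainted call, or return None if no rule matches."""
    if call.symbol == "malloc" and call.return_tainted:
        # Tainted return value: the data originates with the allocated buffer.
        return f"[Allocation at #{call.transition}] malloc({call.args.get('size')})"
    if call.symbol == "NtReadFile" and "Buffer" in call.tainted_args:
        # Tainted output buffer: the data originates with the file being read.
        return f"[File at #{call.transition}] handle {call.args.get('FileHandle')}"
    return None
```

Adding support for another source of data then amounts to adding one more rule, which keeps the low-level taint engine unchanged.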
We applied semantic tainting to a toy library with known vulnerabilities that we used as a fuzzing target, as well as to several crash files for vulnerable versions of clang-format.
Experimenting with Semantic tainting on a toy library
Here are some results from these tests:
On the toy library, we found a divide by zero error:
UID: d60b2c50563601432424993b1a139546130b209825dc2e27ddc46a09ce975ab6
Type: DIVIDE_ERROR
Desc: Division by '0' ('rax')
Transition: #2094654 divide error while executing div rax
Faulted data origin(s): [File at #53733] 'C:\reven\id_00000,sig_08,src_000000,time_0,execs_29,op_havoc, rep_4' at offset(s): [20:20]
Semantic tainting traces the divide error back to the byte at offset 20 of the input file, which is the source of the crash.
Another interesting case is this data pagefault:
UID: 44f18979b27dcf4ecf54234da716026aafb7d2ddcf58926653284e9c3d0af81e
Type: DATA_PAGEFAULT
Desc: Pagefault when accessing 'ds:0x16922987000' ('[rax]')
Transition: #4154895 page fault while executing movzx edx, byte ptr ds:[rax]
Faulted data origin(s):
1. [File at #3321421] 'C:\reven\id_000006,sig_08,src_000000,time_31,execs_75,op_havoc,rep 8' at offset(s): [0:7]
2. [Allocation at #3320437] 'malloc(116) = 0x16922981490'
We can see that the value comes both from bytes 0 to 7 of the input file, and from an allocation of size 116 that returned the pointer 0x16922981490. The pagefault was caused by accessing 0x16922987000, which is out of the bounds of the allocated object.
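The out-of-bounds conclusion follows from simple arithmetic on the values in the report, which the short check below reproduces. The helper name offset_in_allocation is ours, for illustration only.

```python
from typing import Tuple


def offset_in_allocation(base: int, size: int, accessed: int) -> Tuple[int, bool]:
    """Return (offset from base, whether the access falls inside the allocation)."""
    offset = accessed - base
    return offset, 0 <= offset < size


# Values taken from the report: malloc(116) = 0x16922981490, access at 0x16922987000.
offset, in_bounds = offset_in_allocation(0x16922981490, 116, 0x16922987000)
print(hex(offset), in_bounds)  # 0x5b70 False
```

The faulting address lies 0x5b70 (23408) bytes past the start of a 116-byte buffer, far beyond its end.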
Applying Semantic tainting on clang-format crashes out of OSS-Fuzz
- Our first clang-format crash was the following:
  - source: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=44956
  - repro on Linux: Debian clang-format version 13.0.1-3 from Debian Bullseye (stable)
  - repro on Windows: clang version 12.0.0 from https://llvm.org/builds/
  - testcase: https://oss-fuzz.com/download?testcase_id=4821442364047360
  - crash type: stack-overflow
  - date: Tue, Feb 22, 2022

Running the analysis generated the following report:
UID: None
Type: DATA_PAGEFAULT
Desc: DATA PAGEFAULT at #284863280
Transition: #284863280 page fault while executing call 0x7ffac63ad4f8
Faulted data origin(s): Unknown
We can see that the crash occurs because of a data pagefault while executing a call, pointing towards a stack overflow.
- Our second clang-format crash was from OSS-Fuzz too, and generated the following report:
UID: 5bf7818a528b310fd9066d22ef1cc1e538b5d43599131a2506d84ece60cdf10f
Type: DATA_PAGEFAULT
Desc: Pagefault when accessing 'ds:0x38' ('[rcx+0x38]')
Transition: #12390412 page fault while executing mov eax, dword ptr [rcx+0x38]
Faulted data origin(s): [Code at #12369993] 'mov qword ptr [rax+0xc0], 0x0'
This time the data page fault is traced back by the analysis to an instruction where the immediate 0 is written to memory. This means that the page fault is caused by dereferencing a (quasi-)null pointer. The fact that it is an immediate tells us that the user cannot control this value.
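The "(quasi-)null pointer" reading can be expressed as a simple heuristic: a faulting address that falls within the first page of the address space is almost certainly a small offset applied to a null base. The sketch below assumes a 4 KiB page. The function name and the threshold are illustrative choices, not part of the platform.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB page


def is_near_null(fault_address: int, limit: int = PAGE_SIZE) -> bool:
    """True when the faulting address is a small offset from a null base pointer."""
    return 0 <= fault_address < limit


print(is_near_null(0x38))           # True: [rcx+0x38] with rcx == 0
print(is_near_null(0x16922987000))  # False: the toy-library out-of-bounds read
```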
The Fuzzing & Triage platform currently in development at Tetrane brings both integration with fuzzing tool chains and advanced automated crash root cause analysis capabilities thanks to a semantic layer on top of REVEN’s taint engine. We have tested it both on a specifically crafted toy library and on recent actual crashes detected by OSS-Fuzz on clang-format.
The first results look promising and adding this semantic layer is an exciting development that we will carry on in the future!
Coming Soon: Join the beta!
Our current goal is to release a first iteration of REVEN’s Fuzzing & Triage platform in the next version of REVEN Enterprise Edition. To get there, we are hardening the platform by testing it on more targets, making it more flexible (for example by allowing users to plug in their own analyses and generate customized reports), and displaying more information. A beta program is open for current Enterprise customers. Get in touch for early access.