Announcing REVEN version 2.11
Discover Timeless Analysis Live.
Tetrane is pleased to announce the release of REVEN Enterprise, REVEN Professional and REVEN Free 2.11.
REVEN is a Timeless Debugging and Analysis (TDnA) Platform designed to go x10 faster & x10 deeper while reverse engineering. Technically, REVEN captures a time slice of a full system execution (CPU, Memory, Hardware events) to provide unique analysis features, such as Memory History or forward/backward data flow Taint, that speed up and scale the reverse engineering process.
REVEN version 2.11 is without contest one of the biggest releases of REVEN, filled to the brim with improvements to the entire workflow, from the recording process to the Analysis Python API.
This article covers these new features and tools and the other important changes introduced in this new release.
Contents
- Analyze, triage and minimize the output of a fuzzer with the Fuzzing & Triage platform
- Use debugger-assisted recording to produce short and focused scenarios
- Perform richer analyses with the Python API
- And More
Analyze, triage and minimize the output of a fuzzer with the Fuzzing & Triage platform
Fuzzing is a very central part of today’s security landscape, as it is a very efficient way to discover crashes that can lead to denial of service, and often to exploitable vulnerabilities.
With its advanced analysis capabilities, REVEN can help with fuzzing too!
This release includes a first iteration of the vision we recently presented. In REVEN Enterprise, the platform allows monitoring the crashes produced by a fuzzer. Each such crash is then automatically recorded, replayed and analyzed, ultimately performing some initial triage of the crashes.
For this first iteration, analysis is supported for Windows scenarios. Depending on the crash, it can automatically find its root cause and link to relevant portions of the input file leading to it.
In all editions, you can use the platform’s analyzer against any existing replayed scenario that was created manually.
Use debugger-assisted recording to produce short and focused scenarios
When working with REVEN, it is important to keep records short and to the point, to avoid increasing the generation time and disk use of replay data. Precise control over when to start and stop recording a scenario greatly helps in that regard.
While the Enterprise edition has provided automatic recording facilities since REVEN 2.2, REVEN 2.11 adds a new integration with WinDbg at record time, available in all editions, that enables what we call debugger-assisted recording.
With debugger-assisted recording, you use the VM as usual to record your scenario. You can then break the execution of the VM at any time using WinDbg, and use breakpoints on function names, among other WinDbg features. For example, you can set a breakpoint, start the record once it is hit, set a second breakpoint, and stop the record when the second breakpoint is hit.
We plan to release a dedicated article on the topic of debugger-assisted recording soon.
Perform richer analyses with the Python API
Since its release in REVEN 2.2, the Python API has strived to provide reversers with efficient and useful tools for analysis. REVEN 2.11 is an important milestone, with many improvements and some major new features. The following sections present a few highlights.
A convenient API for implementing semantic tainting
Providing a taint API that’s easy to use is a hard problem, in no small part because taint itself is not a simple concept.
Our own implementation of semantic tainting in the Fuzzing Platform and our vulnerability detection scripts inspired improvements to the API that cover common use cases we encountered time and time again:
- Easier way to check if data is tainted. For such a common operation, you can now use the TaintState.is_tainted method to query whether some pieces of data are tainted.
>>> state = taint.state_at(trace.context_before(1000))
>>> # Checking if all of a register is tainted
>>> state.is_tainted(reven2.arch.x64.rax).full
False
>>> # Checking if some of a register is tainted.
>>> not state.is_tainted(reven2.arch.x64.rax).empty
True
>>> # Checking if all the requested data is tainted.
>>> state.is_tainted((reven2.arch.x64.rax, reven2.arch.x64.rbx)).full
False
>>> # Checking if some of the requested data is tainted.
>>> not state.is_tainted((reven2.arch.x64.rax, reven2.arch.x64.rbx)).empty
True
- Remove/Add tainted data and then restart the taint! The TaintState class now provides add/remove methods that allow modifying the state. You can then use the Tainter.taint_from_state method to restart a new taint on the modified state.
>>> state = taint.state_at(return_from_malloc_ctx)
>>> # Check if rax is tainted, and remove it from the taint in one swoop
>>> if state.remove(reven2.arch.x64.rax).empty:
...     # rax wasn't tainted, so the currently tainted data does not depend on this allocation
...     return
... else:
...     # rax tainted, restart the taint with the modified state
...     taint = tainter.taint_from_state(state)
In semantic tainting, one uses these two facilities to check e.g. that rax is tainted at the end of malloc, and if so restart the taint after removing it (so that the internals of the allocator don’t get tainted).
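To make the check-then-restart pattern concrete outside of REVEN, here is a minimal, self-contained sketch that models a taint state as a plain set of location names. `ToyTaintState` and `state_after_malloc` are hypothetical names invented for this illustration and are not part of the reven2 API; only the overall flow mirrors the description above.

```python
from typing import FrozenSet, Iterable, Optional


class ToyTaintState:
    """Toy stand-in for a taint state: a set of tainted location names.

    Hypothetical class for illustration only, not the reven2 TaintState.
    """

    def __init__(self, tainted: Iterable[str]) -> None:
        self._tainted = set(tainted)

    def is_tainted(self, location: str) -> bool:
        return location in self._tainted

    def remove(self, location: str) -> bool:
        """Remove a location and report whether it was tainted."""
        was_tainted = location in self._tainted
        self._tainted.discard(location)
        return was_tainted

    def locations(self) -> FrozenSet[str]:
        return frozenset(self._tainted)


def state_after_malloc(state: ToyTaintState) -> Optional[ToyTaintState]:
    """Decide how to continue a taint at the return of an allocator.

    Returns None when rax was not tainted (the tainted data does not
    depend on this allocation), otherwise the state with rax removed,
    from which a new taint could be restarted.
    """
    if not state.remove("rax"):
        return None
    return state
```

In this toy model, the returned state plays the role of the modified state that would be handed back to the tainter to restart the propagation.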
Introducing: data symbols!
Previously, Binary.symbols only allowed retrieving the function symbols available in a scenario/binary. In REVEN 2.11, this method now also returns data symbols. This makes it easy to, for instance, locate a module’s global variable and read it.
You can use the new function_symbols/data_symbols sets of methods to filter according to the symbol’s nature.
Directly read structure in memory or in registers
How did we use to read filenames in a call to NtCreateFile before REVEN 2.11?
>>> import reven2
>>> from reven2 import types
>>> from reven2.address import LogicalAddress
>>> server = reven2.RevenServer("localhost", 13370)
>>> ntoskrnl = next(server.ossi.executed_binaries("ntoskrnl"))
>>> nt_create_file = next(ntoskrnl.symbols("^NtCreateFile$"))
>>> # Finding all files that are created in a call to NtCreateFile
>>> def read_filename(ctx):
...     # filename is stored in a UNICODE_STRING structure,
...     # which is stored inside of an object_attribute structure,
...     # a pointer to which is stored as third argument (r8) to the call
...     object_attribute_addr = ctx.read(reven2.arch.x64.r8, types.USize)
...     # the pointer to the unicode string is stored as third member at offset 0x10 of object_attribute
...     punicode_addr = object_attribute_addr + 0x10
...     unicode_addr = ctx.read(LogicalAddress(punicode_addr), types.USize)
...     # the length is stored as first member of UNICODE_STRING, at offset 0x0
...     unicode_length = ctx.read(LogicalAddress(unicode_addr) + 0, types.U16)
...     # the buffer is stored as third member of UNICODE_STRING, at offset 0x8
...     buffer_addr = ctx.read(LogicalAddress(unicode_addr) + 8, types.USize)
...     filename = ctx.read(LogicalAddress(buffer_addr),
...                         types.CString(encoding=types.Encoding.Utf16, max_size=unicode_length))
...     return filename
...
>>> for (index, ctx) in enumerate(server.trace.search.symbol(nt_create_file)):
...     if index > 5:
...         break
...     print("{}: {}".format(ctx, read_filename(ctx)))
...
Context before #14771105: \??\C:\Windows\SystemApps\ShellExperienceHost_cw5n1h2txyewy\resources.pri
Context before #14816618: \??\PhysicalDrive0
Context before #16353064: \??\C:\Users\reven\AppData\Local\...\AC\Microsoft
Context before #16446049: \??\C:\Users\reven\AppData\Local\...\AC\Microsoft\Windows
Context before #16698900: \??\C:\Windows\rescache\_merged\2428212390\2218571205.pri
Context before #26715236: \??\C:\Windows\system32\dps.dll
As you can see, it required quite a few comments, as the structure traversal was not very obvious from the code.
In REVEN 2.11, we can make the intent of the code clearer with the newest additions to the types API:
>>> import reven2
>>> from reven2 import types
>>> server = reven2.RevenServer("localhost", 13370)
>>> ntoskrnl = next(server.ossi.executed_binaries("ntoskrnl"))
>>> nt_create_file = next(ntoskrnl.symbols("^NtCreateFile$"))
>>> object_attributes_ty = ntoskrnl.exact_type("_OBJECT_ATTRIBUTES")
>>> # Finding all files that are created in a call to NtCreateFile
>>> def read_filename(ctx):
...     # filename is stored in a UNICODE_STRING structure,
...     # which is stored inside of an object_attribute structure,
...     # a pointer to which is stored as third argument (r8) to the call
...     object_attributes = ctx.read(reven2.arch.x64.r8, types.Pointer(object_attributes_ty))
...     unicode_string = object_attributes.field("ObjectName").deref()
...     length = unicode_string.field("Length").read()
...     return unicode_string.field("Buffer").deref_str(
...         types.CString(encoding=types.Encoding.Utf16, max_size=length)
...     )
You can now directly obtain the instance of the UNICODE_STRING in the OBJECT_ATTRIBUTES structure by dereferencing its ObjectName field. Similarly, you can then obtain the filename buffer and its length by reading the appropriate unicode_string fields, so that you can interpret the buffer as a UTF-16 string of characters.
The code is also less “fragile”, as it can now transparently resist a change of layout of the structs it manipulates.
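The benefit of named fields over hard-coded offsets can be illustrated without a REVEN server. The following sketch uses only Python's standard `ctypes` module on a fake memory buffer; `UnicodeString`, `read_unicode_string` and `build_fake_memory` are hypothetical helpers written for this example, and the addresses 0x20 and 0x40 are arbitrary. It assumes the x64 layout of UNICODE_STRING described in the comments of the earlier example (Length at offset 0x0, Buffer at offset 0x8).

```python
import ctypes


class UnicodeString(ctypes.Structure):
    """x64 layout of UNICODE_STRING, with its padding made explicit."""

    _fields_ = [
        ("Length", ctypes.c_uint16),         # size of the string, in bytes
        ("MaximumLength", ctypes.c_uint16),
        ("_padding", ctypes.c_uint32),       # aligns Buffer at offset 0x8
        ("Buffer", ctypes.c_uint64),         # "address" of the UTF-16 data
    ]


def read_unicode_string(memory: bytes, address: int) -> str:
    """Read a UNICODE_STRING by field name instead of by raw offset."""
    header = UnicodeString.from_buffer_copy(memory, address)
    raw = memory[header.Buffer:header.Buffer + header.Length]
    return raw.decode("utf-16-le")


def build_fake_memory(text: str) -> bytes:
    """Build a buffer with a UNICODE_STRING at 0x20 and its data at 0x40."""
    encoded = text.encode("utf-16-le")
    header = UnicodeString(len(encoded), len(encoded), 0, 0x40)
    memory = bytearray(0x40 + len(encoded))
    memory[0x20:0x20 + ctypes.sizeof(header)] = bytes(header)
    memory[0x40:] = encoded
    return bytes(memory)
```

If the layout of the structure changed, only the `_fields_` declaration would need updating; the reading code, which refers to `Length` and `Buffer` by name, would stay the same. REVEN goes one step further by taking the layout from the debug information of the binary.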
Windows Handle inspection
The new reven2.preview.windows package contains many utilities to help you analyze scenarios targeting Windows. The most significant of these is the ability to inspect the handles manipulated by Windows in the execution trace, as well as the system objects they point to.
To do so, you’ll start by wrapping any reven2.trace.Context object inside of a reven2.preview.windows.Context object.
>>> from reven2.preview.windows import Context as WindowsContext
>>> ctx = WindowsContext(ctx)
The newly created ctx behaves like a reven2.trace.Context, but provides additional capabilities that are specific to Windows scenarios. For example, you can use the handles method to list the file handles owned by the current process:
>>> # Focus on process handles, ignore the rest
>>> for handle in ctx.handles(kernel_handles = False, special_handles = False):
...     try:
...         # Request FileObject handles only, otherwise raise exception
...         obj = handle.object(reven2.preview.windows.FileObject)
...     except ValueError:
...         continue
...     print(f"{handle}: {obj.filename_with_device}")
Handle 0x4 (object: lin:0xffffc8016fc2c760): \Device\ConDrv\Reference
Handle 0x44 (object: lin:0xffffc8016fa92810): \Device\HarddiskVolume2\reven
Handle 0x48 (object: lin:0xffffc8016fa94a70): \Device\ConDrv\Connect
Handle 0x50 (object: lin:0xffffc8016fa921d0): \Device\ConDrv\Input
Handle 0x54 (object: lin:0xffffc8016fa91eb0): \Device\ConDrv\Output
Handle 0x58 (object: lin:0xffffc8016fa91eb0): \Device\ConDrv\Output
Handle 0xa4 (object: lin:0xffffc8016f8e3b00): \Device\HarddiskVolume2\reven\output
If you found a handle as an argument to some Windows function (such as NtReadFile), you can also easily find the object it refers to:
>>> handle = ctx.handle(0xa4)
>>> print(handle.object())
File object "\Device\HarddiskVolume2\reven\output" (lin:0xffffc8016f8e3b00)
Analyzing handles makes reversing easier by getting a finer understanding of the data manipulated by the Operating System.
Write more robust scripts with gradual typing
Python has long supported optional type annotations that can be checked by external tools and IDEs so as to provide more robust type inference and autocompletion, for the benefit of users.
While some typing has been present in the API for several releases now, REVEN 2.11 reaches a new milestone in typing, with almost 75% of the API now typed (according to mypy). The remaining untyped parts are the most dynamically typed methods, such as reven2.Context.read, tests and implementation details, and a few minor packages (such as the bookmark module). We hope to complete the coverage in a future version of REVEN.
Here’s the read_filename function from above, with the addition of type annotations:
>>> import reven2
>>> from reven2 import types
>>> server : reven2.RevenServer = reven2.RevenServer("localhost", 13370)
>>> ntoskrnl : reven2.ossi.Binary = next(server.ossi.executed_binaries("ntoskrnl"))
>>> nt_create_file : reven2.ossi.FunctionSymbol = next(ntoskrnl.symbols("^NtCreateFile$"))
>>> object_attributes_ty : types.Struct = ntoskrnl.exact_type("_OBJECT_ATTRIBUTES")
>>> # Finding all files that are created in a call to NtCreateFile
>>> def read_filename(ctx: reven2.trace.Context) -> str:
...     # filename is stored in a UNICODE_STRING structure,
...     # which is stored inside of an object_attribute structure,
...     # a pointer to which is stored as third argument (r8) to the call
...     object_attributes : types.StructInstance = ctx.read(reven2.arch.x64.r8, types.Pointer(object_attributes_ty))
...     unicode_string : types.StructInstance = object_attributes.field("ObjectName").deref()
...     length : int = unicode_string.field("Length").read()
...     return unicode_string.field("Buffer").deref_str(
...         types.CString(encoding=types.Encoding.Utf16, max_size=length)
...     )
While these annotations are useful for making your script more robust with tooling, the return type of some expressions, such as ntoskrnl.exact_type("_OBJECT_ATTRIBUTES") or object_attributes.field("ObjectName").deref(), is unknown at type checking time, because they rely on reading data in debug objects that are only present at runtime.
To alleviate this issue, the types.StructInstance class provides convenience methods to assert the type of the read or dereferenced value:
>>> import reven2
>>> from reven2 import types
>>> server : reven2.RevenServer = reven2.RevenServer("localhost", 13370)
>>> ntoskrnl : reven2.ossi.Binary = next(server.ossi.executed_binaries("ntoskrnl"))
>>> nt_create_file : reven2.ossi.FunctionSymbol = next(ntoskrnl.symbols("^NtCreateFile$"))
>>> object_attributes_ty : types.Struct = ntoskrnl.exact_type("_OBJECT_ATTRIBUTES")
>>> # Finding all files that are created in a call to NtCreateFile
>>> def read_filename(ctx: reven2.trace.Context) -> str:
...     # filename is stored in a UNICODE_STRING structure,
...     # which is stored inside of an object_attribute structure,
...     # a pointer to which is stored as third argument (r8) to the call
...     object_attributes : types.StructInstance = ctx.read(reven2.arch.x64.r8, types.Pointer(object_attributes_ty))
...     unicode_string = object_attributes.field("ObjectName").deref_struct() # <- deref_struct indicates we deref to a struct
...     length = unicode_string.field("Length").read_int() # no need to annotate length with int for mypy or the IDE
...     return unicode_string.field("Buffer").deref_str(
...         types.CString(encoding=types.Encoding.Utf16, max_size=length)
...     )
Using these methods, you don’t need to annotate the result of reading or dereferencing from a struct. These methods are also checked at runtime, meaning that if you ever get it wrong, you will get a clean error at the point of the mistake rather than a confusing one further down the line.
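The idea of an accessor that is both statically typed and checked at runtime can be sketched in a few lines of plain Python. `ToyField` below is a hypothetical class written for this illustration, not the reven2 implementation: its `read` method returns `Any`, which a type checker cannot reason about, while `read_int` promises an `int` and enforces that promise when called.

```python
from typing import Any


class ToyField:
    """Hypothetical field holding a value whose type is only known at runtime."""

    def __init__(self, value: Any) -> None:
        self._value = value

    def read(self) -> Any:
        # Untyped access: the caller has to annotate the result manually.
        return self._value

    def read_int(self) -> int:
        # Typed access: the return annotation helps mypy and the IDE,
        # and the check below fails early if the assumption is wrong.
        value = self._value
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"expected an int, got {type(value).__name__}")
        return value
```

With this design, a wrong assumption about a field surfaces as a `TypeError` at the accessor call itself, instead of propagating a value of the wrong type into later code.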
More generally, we encourage you to use as much typing as possible in your scripts, and to check them with type checkers like mypy, as this can save a lot of debugging time for bigger scripts.
And More
- New scripts in the library: search the trace for the symbols accessing a given memory range, or conversely for the memory accesses made during the execution of a function symbol.
- New “vocabulary” types such as MemoryRange and RegisterSlice.
- Made it possible to record interactions with some USB devices configured as USB passthrough.
- Continued performance improvements with the Memory History replay time reduced by 10%.
- The hovered transition is now displayed in the timeline.
The full list of improvements and fixes is available in the release notes.
Want to try REVEN? The full power of REVEN is available to test in the Free Edition. For a more guided experience, we also provide an extensive set of learning scenarios online, with tutorials available in most demos.
Interested in REVEN? Compare the features of REVEN Free, Professional and REVEN Enterprise.
Discuss on GitHub!