REVEN 2.2: Python API, Automatic Recording, and more
by Louis
Categories: Technical
Tags: REVEN, Releases, Announcement, Automation, Analysis API, Workflow API
Tetrane is happy to announce the recent release of REVEN 2.2.
REVEN is an automated Reverse Engineering Platform designed to go x10 faster & x10 deeper using Timeless Analysis. Technically, REVEN captures a time slice of a full system execution (CPU, Memory, Hardware events) to provide unique analysis features that speed up and scale your reverse engineering process.
This article covers the most important changes introduced in the REVEN 2.2 release.
For the 2.2 line of REVEN, the keyword is Automation, that is, ways to work with REVEN more productively and more in-depth by automating various tasks.
In detail, this release is the first version to include the new Analysis Python API, various facilities for automatic scenario recording, and the Workflow Python API.
Analysis Python API
The Analysis API enables you to query a scenario’s execution trace from a REVEN server by using Python scripts.
To demonstrate how you can use the Analysis API to fetch information from the trace, here’s an example script (adapted from the quick start guide) that gets the names of all files opened in the trace by searching for calls to NtCreateFile:
>>> from reven2 import address, types, arch, RevenServer
>>> # Connecting to the REVEN server on port 13370
>>> server = RevenServer("localhost", 13370)
>>> # Checking the number of instructions in the trace
>>> trace = server.trace
>>> print("{} instructions in this scenario!".format(trace.transition_count))
2159367608 instructions in this scenario!
>>> # Finding all files that are created in a call to NtCreateFile:
>>> # 1. Function to find the filename from a context at the start of NtCreateFile
>>> def read_filename(ctx):
...     # filename is stored in a UNICODE_STRING structure,
...     # which is stored inside of an object_attribute structure,
...     # a pointer to which is stored as third argument (r8) to the call
...     object_attribute_addr = ctx.read(arch.x64.r8, types.USize)
...     # the pointer to the unicode string is stored as third member at offset 0x10 of object_attribute
...     punicode_addr = object_attribute_addr + 0x10
...     unicode_addr = ctx.read(address.LogicalAddress(punicode_addr), types.USize)
...     # the length is stored as first member of UNICODE_STRING, at offset 0x0
...     unicode_length = ctx.read(address.LogicalAddress(unicode_addr) + 0, types.U16)
...     # the buffer is stored as third member of UNICODE_STRING, at offset 0x8
...     buffer_addr = ctx.read(address.LogicalAddress(unicode_addr) + 8, types.USize)
...     filename = ctx.read(address.LogicalAddress(buffer_addr),
...                         types.CString(encoding=types.Encoding.Utf16, max_size=unicode_length))
...     return filename
...
>>> # 2. Find all calls to NtCreateFile in the trace
>>> # Get a handle to ntoskrnl!NtCreateFile symbol, so that we can search for calls in the trace
>>> nt_create_file = next(server.ossi.symbols("^NtCreateFile$", binary_hint="ntoskrnl"))
>>> for ctx in trace.search.symbol(nt_create_file):
...     print("{}: {}".format(ctx, read_filename(ctx)))
...
Running this script gives us the output below. Note that #n refers to the nth instruction executed in the trace. For each of these instructions, REVEN provides access to the machine context (CPU, memory, …) before executing the instruction, as well as the context after executing the instruction. You can correlate an instruction number with its contexts using the trace view of the REVEN GUI.
# Abridged output, some lines omitted
Context before #30318: \??\PhysicalDrive0
Context before #68939301: \??\C:\Users\reven\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\System Tools\Command Prompt.lnk
Context before #93015089: \DEVICE\NETBT_TCPIP_{9EEA2520-DBB0-11E7-B281-806E6F6E6963}
Context before #280484374: \Connect
Context before #282462242: \??\C:\Windows\AppPatch\AppPatch64\sysmain.sdb
Context before #282691186: \??\D:\named_pipe_helloworld_client.exe
Context before #288711211: \??\pipe\Pipe
Context before #296196812: \??\C:\Windows\Prefetch\NAMED_PIPE_HELLOWORLD_CLIENT.-86896CDB.pf
Context before #297700382: \??\C:\Windows\Prefetch\NAMED_PIPE_HELLOWORLD_CLIENT.-86896CDB.pf
Context before #297742002: \??\C:\Windows\Prefetch\NAMED_PIPE_HELLOWORLD_CLIENT.-86896CDB.pf
Context before #303014620: \??\PhysicalDrive0
Context before #310123576: \??\PhysicalDrive0
Context before #311761916: \??\C:\Users\reven\AppData\Roaming\Microsoft\Windows\Themes\CachedFiles\CachedImage_1024_768_POS4.jpg
Context before #312000488: \??\C:\Users\reven\AppData\Roaming\Microsoft\Windows\Themes\CachedFiles\CachedImage_1024_768_POS4.jpg
This script was run on our “named pipe” toy scenario, in which two programs communicate through a named pipe.
In the list of filenames, we can see the opening of the example binary named_pipe_helloworld_client.exe as well as a prefetch file NAMED_PIPE_...pf. Running this script takes around 10 seconds for this 2-billion-instruction trace.
For this release, supported features of the API include: reading from a Context or a Transition, OSSI, memory history, search, backtrace, strings, forward and backward tainting.
Note that the REVEN v2 Python API can be imported from IDA, allowing you to combine information from the IDA Python API and the REVEN v2 Python API.
More information on the Python API is available in the quick start guide and in the Python API reference.
Automatic scenario recording
The automatic scenario recording feature enables you to record scenarios without manual intervention. This feature uses the Workflow API. Two main workflows are supported today: automatic binary recording and automatic recording using ASM stubs.
Let’s dive a bit into each of these workflows.
Automatic binary recording
Automatic binary recording automates recording during the execution of a binary. This feature automatically uploads a local binary to the virtual machine, starts recording the full system immediately after executing this binary, and stops recording automatically when the binary exits or crashes. This produces shorter scenario traces that are quick to replay and easy to analyze.
Note that REVEN will also stop recording on a Blue Screen of Death.
The following script performs an automatic recording of a hello_world.exe binary already present on the guest VM:
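>>> import reven2.preview.project_manager as project_manager
>>> # Connect to the Project Manager
>>> pm = project_manager.ProjectManager("http://localhost:8880")
>>> # Create the scenario that will hold the record
>>> scenario = pm.create_scenario("otto_record", 2, description="Automatic binary recording example")
>>> # Start a QEMU session from a live snapshot
>>> response = pm.start_qemu_snapshot_session(2, live_snapshot="hello_world-2048M-nonet-nokvm")
>>> session_id = response['session']['id']
>>> # Path of the binary inside the guest
>>> binary_name = r"C:\Users\reven\hello_world.exe"
>>> params = {
...     'qemu_session_id': session_id, 'scenario_id': scenario['id'],
...     'binary_name': binary_name, 'autorun_binary': binary_name}
>>> # Run the binary and record until it exits or crashes
>>> task = pm.auto_record_binary(**params)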
After executing this script, the Project Manager sports a freshly created otto_record scenario, already recorded and ready to be replayed (the Workflow Python API provides methods to automatically replay a recorded scenario).
After replay, opening the trace in Axion, we can check that the record only covers the execution of the binary. It starts with an instruction inside of hello_world.exe, and ends with a call to PspExitProcess after only 2,039,324 instructions.
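To record several binaries in a row, the steps of the script above could be wrapped in a small helper. The following is a sketch only: auto_record is a hypothetical name, it uses nothing but the three Project Manager calls shown in this article, and it assumes that the literal 2 passed to create_scenario and start_qemu_snapshot_session is the same identifier in both calls (called snapshot_id here).

```python
def auto_record(pm, snapshot_id, live_snapshot, scenario_name, binary_path,
                description=""):
    """Create a scenario, start a session from a live snapshot, record one binary.

    pm is expected to behave like the ProjectManager object used in this
    article. snapshot_id is an assumed name for the numeric argument shared
    by create_scenario and start_qemu_snapshot_session.
    """
    # Create the scenario that will hold the record
    scenario = pm.create_scenario(scenario_name, snapshot_id,
                                  description=description)
    # Start a QEMU session from the requested live snapshot
    response = pm.start_qemu_snapshot_session(snapshot_id,
                                              live_snapshot=live_snapshot)
    params = {
        "qemu_session_id": response["session"]["id"],
        "scenario_id": scenario["id"],
        "binary_name": binary_path,
        "autorun_binary": binary_path,
    }
    # Launch the recording task for this binary
    task = pm.auto_record_binary(**params)
    return scenario, task
```

With a connected Project Manager, the example from this section would then read auto_record(pm, 2, "hello_world-2048M-nonet-nokvm", "otto_record", r"C:\Users\reven\hello_world.exe").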
Automatic recording using ASM stubs
The second form of automatic recording uses “magic” ASM instruction sequences to start and stop the record at any time from within the guest VM!
This form of automatic recording is very flexible, as you can decide to start and stop recording depending on the internal state of the guest VM, such as when executing a particular function with specific input values in some process, receiving a certain network packet, and so on…
Being more general, this method is also a bit more involved to set up. From the host side, you will use an autorecord script similar to the script for automatic binary recording, but using a different API method. From the guest side, you will need to execute specific ASM instructions to start the record, and other ASM instructions to stop the record. To simplify this process, we also provide a C library that encapsulates these ASM sequences in standard C functions.
For more information on automatic scenario recording, please refer to the automatic recording cookbooks.
Workflow Python API
The Workflow API enables you to automate the workflow of the Project Manager by using Python.
As a Workflow API primer, here is an example script that simply retrieves the list of scenarios and the port of the associated REVEN server if one is started:
>>> import reven2.preview.project_manager as project_manager
>>> # Connect to the project manager
>>> pm = project_manager.ProjectManager("http://localhost:8880")
>>> # Get the 50 first scenarios and print their name and REVEN server if any
>>> # 1. Print a table header
>>> print("{:20}| {}".format("Scenario name", "REVEN server port"))
>>> print("{:-<20}|--------------------".format(""))
>>> # 2. Get the 50 first scenarios
>>> for scenario in pm.get_scenarios_list(limit=50, offset=0)['results']:
...     scenario = pm.get_scenario(scenario['id'])
...     # 3. Retrieve REVEN server associated to scenario if any
...     session = scenario['active_reven_session']
...     if session is None:
...         port = None
...     else:
...         port = pm.get_session(session['id'])['port']
...     # 4. Print scenario name and port
...     print("{:20}| {!r}".format(scenario['name'], port))
...
Running this script gives us the following output:
# abridged output, some lines omitted
Scenario name       | REVEN server port
--------------------|--------------------
pipe                | 45645
notepad             | None
tokio               | 34083
sed2                | None
cve-2016-7255       | None
ddref               | 39311
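The loop above mixes data retrieval and printing. As a sketch, the same calls can be split into a function that returns the data and a function that formats it, which makes the result reusable from other scripts. The names scenario_ports and format_table are hypothetical, and the dictionary keys are the ones used in the script above.

```python
def scenario_ports(pm, limit=50, offset=0):
    """Map each scenario name to its REVEN server port, or None if not started.

    pm is expected to behave like the ProjectManager object used above.
    """
    ports = {}
    for entry in pm.get_scenarios_list(limit=limit, offset=offset)["results"]:
        scenario = pm.get_scenario(entry["id"])
        session = scenario["active_reven_session"]
        if session is None:
            ports[scenario["name"]] = None
        else:
            ports[scenario["name"]] = pm.get_session(session["id"])["port"]
    return ports


def format_table(ports):
    """Render the mapping with the same layout as the script above."""
    lines = [
        "{:20}| {}".format("Scenario name", "REVEN server port"),
        "{:-<20}|--------------------".format(""),
    ]
    for name, port in ports.items():
        lines.append("{:20}| {!r}".format(name, port))
    return "\n".join(lines)
```

A port taken from this mapping is the value one would then pass to RevenServer("localhost", port) to start querying the corresponding trace with the Analysis API.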
Other improvements
Other improvements include a new Downloads page in the Project Manager that lets you download various REVEN tools and examples, performance improvements to the replay generation time, and the possibility to use the taint feature from a remote Axion client.
The full list of improvements and fixes is available in the release notes included with all REVEN distributions.