
Interactive write-ups with REVEN and Jupyter

Sep 09, 2020
by Louis
Categories: Tutorial
Tags: Reverse Engineering - Analysis API - REVEN

OK, so you just recorded this nice trace of the latest malware-of-the-week, great! You went around the timeline in Axion, did some symbol search, it all looks very promising. Time to dive into the depths of the analysis…

What was the name of that function again? The one that’s probably an entry point?

Err, you should have written that down. Mechanically, you fire up your favorite editor (some vim or VSCode, maybe). Ah, you had that function’s name bookmarked in Axion, OK, let’s write it in the editor.

As time passes, you realize that you’re gonna need to check if the malware opens the registry, and what it writes there. OK, time for some Python: you fire up another editor, and write down some calls to the API! Hmmm, where should you put the results of this script? Add it to your notes? How can you reference that script that you just wrote? At this point, you start feeling dizzy and go for a drink. Is it already this late? Maybe you should call it a day, after all, you’ve made nice progress on this trace already. So uh, where should you save your script and notes?

Sounds familiar? This might have been your workflow when working with REVEN up to 2.4. It is, of course, a fine workflow. However, it requires you to switch between Axion, Python, and your notes, which involves copying and pasting a lot of transition numbers into Axion. Also, in this workflow, you have to make sure that you save all of these in a location related to your scenario.

A new workflow

Starting with REVEN 2.5, you have another possibility: you can use the Jupyter notebook server that is integrated with REVEN to host your write-ups, be it your scripts or notes. REVEN lets you create dedicated notebooks for each scenario, and the notebooks are even exported with the scenario!

In case you don’t know it, Jupyter Notebook is a web interface that lets you, among other things, execute Python code and prepare Markdown write-ups from inside your browser.

To start using Jupyter for your scenario, go to the “Analyze” page of your scenario in the Project Manager, and click “Open Python”. This will open the UserData directory of your project in the Jupyter web view.

Note that, this time, the role of the malware will be filled by notepad.exe. Many thanks to notepad.exe!

The UserData directory is the place for putting any files related to your scenario. There is one such directory per REVEN scenario, and it is exported with your scenario, so if you share or archive your scenario, its UserData directory will follow along. Note that, by default, the UserData directory already contains a bookmarks.sqlite file. Those are your bookmarks for this scenario, so don’t remove that file!

From the Jupyter interface, you can either upload existing files or create new ones, including Jupyter notebooks. When creating a notebook, you will have to choose which version of Python to use (also known as a “Jupyter kernel”). For the Analysis API to work, make sure to choose a version with “reven” in the name, such as “reven-2.5.0-python3”.

In your brand new notebook, you can start by writing text, which will be interpreted as Markdown.

But the real power of Jupyter notebooks stems from the fact that they can also execute Python code! By default, if you enter Python code in a new cell (one with the In [ ]: prompt), running the cell will run the corresponding Python code. The global state is also kept between executions of the various cells, and the content of the cells is saved in the notebook, providing the perfect blend between an interactive interpreter and a script.

What’s of interest to us, though, is that you can use the REVEN Python API from the notebook: it is only a single import reven2 away. We also provide a means of easily connecting to your REVEN server by just copy-pasting a snippet available on the “Analyze” page of your scenario, once the REVEN server has been started.

Users of the Enterprise edition can also use the Project Manager API to connect to their scenario directly from its name rather than by using the port. This has the advantage that you won’t have to change that cell when you restart a REVEN server on that scenario later.

# connect to a scenario from its name
from reven2.preview.project_manager import ProjectManager
pm = ProjectManager("http://localhost:8880")  # URL to the REVEN Project Manager, should not change very often
connection = None
connection = pm.connect("notepad")  # No need to specify the port of your scenario
server = connection.server
server

Using the API in a Jupyter notebook

Once you’re connected, you can use the Analysis API as per usual. For example, if you’d like to see what registry keys were accessed during the trace by analyzing calls to RegOpenKeyExW, you can run the following:

import reven2.arch.x64 as regs

# get which keys of the Windows registry were opened during the trace
from reven2 import types
for symbol in list(server.ossi.symbols(binary_hint="kernelbase",
                                       pattern="^RegOpenKeyExW$")):
    print("Opened registry keys")
    for ctx in server.trace.search.symbol(symbol):
        tr = ctx.transition_before()
        h_key = ctx.read(regs.rcx)
        subkey_address = ctx.read(regs.rdx,
                                  types.Pointer(types.USize))
        subkey = "null"
        process = ctx.ossi.process()
        if (subkey_address.offset != 0):
            subkey = ctx.read(subkey_address,
                              types.CString(encoding=types.Encoding.Utf16,
                                            max_character_count=100))
        print(tr, process, h_key, subkey)
        print()

# Note: In actual reversing situations, you'd probably want to build the
#       entire requested key by tracking the handles returned by
#       RegOpenKeyExW and the associated subkeys. This simple version only
#       serves for illustrative purposes.
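The note above mentions rebuilding the full key by tracking handles. A minimal sketch of that bookkeeping follows, in plain Python and independent of the REVEN API. It assumes you have already collected one (parent handle, subkey, returned handle) record per successful call. That record layout and the name resolve_key_paths are hypothetical. The parent handle is the h_key value read above, while the returned handle is written through the function's phkResult output parameter and has to be read once the call has returned, which is not shown here. The root values are the predefined HKEY constants from the Windows SDK.

```python
# Predefined root keys from the Windows SDK. In a 64-bit process they show up
# sign-extended in registers, so only the low 32 bits are compared.
PREDEFINED_KEYS = {
    0x80000000: "HKEY_CLASSES_ROOT",
    0x80000001: "HKEY_CURRENT_USER",
    0x80000002: "HKEY_LOCAL_MACHINE",
    0x80000003: "HKEY_USERS",
}


def resolve_key_paths(open_calls):
    """
    Rebuild full registry paths from successive RegOpenKeyExW calls.

    open_calls: iterable of (parent_handle, subkey, returned_handle) tuples
                in trace order (hypothetical layout); subkey may be None.
    Returns one full path per call, in the same order.
    """
    known = {}  # handle value -> full path of the key it designates
    paths = []
    for parent, subkey, returned in open_calls:
        # Prefer a handle opened earlier in the trace, then a predefined root
        base = known.get(parent)
        if base is None:
            base = PREDEFINED_KEYS.get(parent & 0xFFFFFFFF,
                                       "<unknown handle {:#x}>".format(parent))
        path = base if not subkey else base + "\\" + subkey
        known[returned] = path  # a reused handle value simply gets overwritten
        paths.append(path)
    return paths
```

This sketch ignores RegCloseKey and handles opened through other functions (RegCreateKeyExW, for instance), so a handle that was opened before the trace started is reported as unknown.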

Which, when run in the notebook, will produce the following:

Notice how the result is now integrated into your write-up and can be commented using Markdown in the surrounding cells. Also, this output can be easily improved to take advantage of a REVEN feature: Jupyter-Axion synchronization.

Jupyter-Axion synchronization

To start using Jupyter-Axion synchronization, you naturally need an instance of Axion started on your scenario. You can start Axion from the “Analyze” page of your scenario, by clicking the “Open Axion” button. Once Axion is started, you also need to enable the synchronization by selecting a session in the dropdown list in the menu bar. For now, “Default session” will do fine.

Once synchronization is enabled, you can communicate with Axion from any Python client. For example, execute the following code to tell Axion to go to transition 42:

# Go to transition 42
server.sessions.publish_transition(server.trace.transition(42))

In the context of Jupyter notebooks, you can do even better: replace print with display in your registry code:

# code to list the registry keys opened during the trace, using display
for symbol in list(server.ossi.symbols(binary_hint="kernelbase",
                                       pattern="^RegOpenKeyExW$")):
    print("Opened registry keys")
    for ctx in server.trace.search.symbol(symbol):
        tr = ctx.transition_before()
        h_key = ctx.read(regs.rcx)
        subkey_address = ctx.read(regs.rdx, types.Pointer(types.USize))
        subkey = "null"
        process = ctx.ossi.process()
        if (subkey_address.offset != 0):
            subkey = ctx.read(subkey_address,
                              types.CString(encoding=types.Encoding.Utf16,
                                            max_character_count=100))
        display(tr, process, h_key, subkey)

That’s right, you can click the transition number and Axion will follow immediately.
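The difference between print and display comes from IPython's rich display protocol: display looks for methods such as _repr_html_ on the object and, when one is present, renders the HTML it returns instead of the plain-text representation. Here is a minimal sketch of that protocol with a hypothetical PointOfInterest class. The class is not part of the REVEN API, and how REVEN builds its clickable links is not shown here.

```python
import html


class PointOfInterest:
    """Hypothetical record type, used only to illustrate rich display."""

    def __init__(self, label, transition_id):
        self.label = label
        self.transition_id = transition_id

    def __repr__(self):
        # Plain-text form: this is what print() shows
        return "PointOfInterest({!r}, #{})".format(self.label,
                                                   self.transition_id)

    def _repr_html_(self):
        # HTML form: this is what display() prefers in a notebook front end
        return "<b>{}</b> at transition <code>#{}</code>".format(
            html.escape(self.label), self.transition_id)


poi = PointOfInterest("RegOpenKeyExW <entry>", 42)
```

In a notebook, display(poi), or simply leaving poi as the last expression of a cell, renders the HTML form, while print(poi) falls back to the text form.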

Improving the output

You can further improve the output by using the HTML rendering capabilities of Jupyter. For instance, it would be nice to present the output as a table.

Let’s add some support functions in a new cell:

# helper functions to output html tables
from IPython.display import HTML, display


def table_line(cells):
    """
    Build an HTML line of the table from an iterable
    """
    line = ""
    for cell in cells:
        line += "<td>{}</td>".format(cell)
    return "<tr>{}</tr>".format(line)


def html_table(title, headers, html_lines):
    """
    Build an html table

    title: h2 title displayed above the table
    headers : iterable that constitutes the header of the table
    html_lines: HTML lines obtained by calling table_line
    """
    header_line = ""
    for header in headers:
        header_line += "<th>{}</th>".format(header)
    header_line = "<tr>{}</tr>".format(header_line)
    return HTML("""<h2>{}</h2><table>{} {}</table>""".format(title, header_line, html_lines))
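One caveat with these helpers is that cell values are inserted verbatim, so a string read from the trace that contains characters such as < or & would be interpreted as markup. A possible variant, sketched under the assumption that rich objects expose _repr_html_ as in IPython's display protocol, escapes plain values and keeps the HTML of rich objects. The name safe_table_line is hypothetical.

```python
import html


def safe_table_line(cells):
    """
    Variant of table_line: escape plain values, keep rich objects' own HTML.
    """
    line = ""
    for cell in cells:
        # Use the object's HTML representation when it provides one
        rich = getattr(cell, "_repr_html_", None)
        content = rich() if callable(rich) else None
        if content is None:
            # Plain value: convert to text and neutralize markup characters
            content = html.escape(str(cell))
        line += "<td>{}</td>".format(content)
    return "<tr>{}</tr>".format(line)
```

With this variant you would pass the transition object itself rather than the string returned by its _repr_html_ method. Any hand-written markup passed as a plain string, such as an italic NULL placeholder, is escaped like every other string.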

And then rewrite our registry code to output a table:

# code to list the registry keys opened during the trace, outputs an html table
for symbol in list(server.ossi.symbols(binary_hint="kernelbase", pattern="^RegOpenKeyExW$")):
    lines = ""
    for ctx in server.trace.search.symbol(symbol):
        tr = ctx.transition_before()
        h_key = ctx.read(regs.rcx)
        subkey_address = ctx.read(regs.rdx, types.Pointer(types.USize))
        subkey = "<i>NULL</i>"
        process = ctx.ossi.process()
        if (subkey_address.offset != 0):
            subkey = ctx.read(subkey_address, types.CString(encoding=types.Encoding.Utf16, max_character_count=100))
        lines += table_line((tr._repr_html_(), process, h_key, subkey))

    display(html_table("Opened registry keys", ("Transition", "Process", "Key Handle", "Subkey"), lines))

Here’s the result:

The very useful ability to click notebook links to control Axion allows you to easily work interactively between Python and the GUI, and also to create automated analysis scripts that generate a report of points of interest (PoI), such as the following tokio-chat example:

When you’re satisfied with the output, you can even export the notebook to various formats: HTML, TeX or PDF (if you have LaTeX installed).

To go further

Notebooks offer a lot of possibilities, even allowing you to build interactive views, not dissimilar to Axion’s own widgets. In the linked example, we use the value of a slider widget as input to a Python function that outputs an HTML table describing the current state of allocations in a GC block in Internet Explorer. If you want to see and interact with the widget, you can try the corresponding online demo.

Synchronization has more than one trick up its sleeve, too. As you saw, we only used the “Default session” in this article, but when selecting the session in Axion, you can actually choose to enter a new session name, and then configure your scripts to synchronize with one or multiple sessions. With the Enterprise edition, this allows multiple users to work on the same trace and to use synchronization without accidentally selecting transitions in each other’s Axion instances.

In the future we might extend synchronization to allow opening hexdumps or to synchronize with different sources like WinDbg… what would you like to see?

If you want to get an idea of how working with REVEN and Jupyter feels, you can try example notebooks interactively online. If you already have REVEN installed, you can also grab the notebook from our GitHub.

Being able to use Python directly for each project and synchronize it with Axion may feel like a mere detail, but it is actually paramount to providing the cohesive end-to-end experience that Tetrane is committed to delivering with REVEN.
