During the examination of malicious files, you might encounter shellcode that will be critical to your understanding of the adversary's intentions or capabilities. One way to examine this malicious code is to execute it using a debugger after setting up the runtime environment to allow the shellcode to achieve its full potential. In such circumstances, it's helpful to take control of the instruction pointer to direct the debugger towards the code you wish to examine.
The modern computer has been designed to make life easy for the standard user. It is actually quite difficult to say to the computer "Hey, I've found some shellcode embedded in a file, could you run it for me?", and for good reason! If you don't get it exactly right, the chances are you're going to end up crashing something.
Scenario walkthrough - Analysing embedded shellcode
I have devised a simplified scenario which will allow us to consider how to analyse shellcode embedded within malicious files. We will focus on the code analysis methodology rather than the vulnerability exploitation required to get it to execute in the first place. In Figure 1 below, we have an apparently innocuous text file which we are reviewing. At first glance there doesn't appear to be anything strange about this file, but looks can be deceiving!
Figure 1 - Innocuous looking text file
As we scroll down within the text editor we start to see some strange ASCII characters as shown in Figure 2. As this doesn't make contextual sense we will delve a little deeper to find out what is going on. A collection of nonsense characters could indicate shellcode which the editor is attempting to interpret as text and display for us.
Figure 2 - Discovery of strange characters
Having loaded the text file into a hex editor, we can now start to see a major indicator that exploit shellcode may be present. The repetitive 0x90 bytes look very much like a NOP sled. We can also see a number of other bytes in the mix whose meaning is less obvious. Highlighted in Figure 3 below we see '33 C0 40 3D EF BE AD DE 75 F8', which in this case is the demo shellcode we want to examine further.
At this point we have two options. We can carve the shellcode out and use a tool to convert it into something easier to review; however, as detailed in the article 'Analyzing Shellcode Extracted from Malicious RTF Documents', there are potential issues with analysing shellcode outside the environment it was originally embedded within. The other option is to take control of the instruction pointer, point it at the first byte of the embedded shellcode and begin executing at our convenience.
Figure 3 - View from a hex editor
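The visual check for a NOP sled can also be done programmatically. The following is a minimal sketch, not part of jmp2it: the names `ByteRun` and `find_nop_sleds` and the minimum run length are illustrative assumptions. It scans a buffer for runs of 0x90 bytes and reports where each run starts and how long it is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record describing one run of 0x90 bytes.
struct ByteRun {
    std::size_t offset;  // position of the first 0x90 in the run
    std::size_t length;  // number of consecutive 0x90 bytes
};

// Returns every run of 0x90 bytes that is at least min_length long.
std::vector<ByteRun> find_nop_sleds(const std::vector<std::uint8_t>& data,
                                    std::size_t min_length) {
    assert(min_length > 0);
    std::vector<ByteRun> runs;
    std::size_t i = 0;
    while (i < data.size()) {
        if (data[i] != 0x90) {
            ++i;
            continue;
        }
        const std::size_t start = i;
        while (i < data.size() && data[i] == 0x90) {
            ++i;
        }
        if (i - start >= min_length) {
            runs.push_back({start, i - start});
        }
    }
    return runs;
}
```

The byte that follows a reported run (`offset + length`) is a reasonable first candidate for the start of the shellcode, although a run of 0x90 bytes on its own does not prove that code follows.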
Unlike with an executable file, the operating system will not, by design, transfer control of the instruction pointer to a location within a file like this (why would it? There shouldn't be any executable code in there!).
I have developed an open source tool named 'jmp2it' which allows you to open any file and transfer control to a specified offset. It does this by mapping the contents of the file into memory and calculating the address of the mapped image base + offset, before creating and calling a function pointer which represents the address you are interested in.
[View C++ Source Code]
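The base + offset calculation and the function pointer can be sketched in portable C++. This is a simplified illustration rather than the tool's actual source: `parse_offset`, `resolve_entry`, `as_function` and `ShellcodeEntry` are hypothetical names, and the sketch deliberately stops short of mapping the file or calling the pointer, because that only works when the bytes sit in executable memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse an offset such as "0x15C" (hex) or "348" (decimal).
std::size_t parse_offset(const std::string& text) {
    std::size_t consumed = 0;
    // Base 0 lets std::stoull detect the 0x prefix automatically.
    const unsigned long long value = std::stoull(text, &consumed, 0);
    if (consumed != text.size()) {
        throw std::invalid_argument("trailing characters in offset");
    }
    return static_cast<std::size_t>(value);
}

// Address of the byte at `offset` inside an image of `size` bytes at `base`.
const std::uint8_t* resolve_entry(const std::uint8_t* base, std::size_t size,
                                  std::size_t offset) {
    if (base == nullptr || offset >= size) {
        throw std::out_of_range("offset lies outside the mapped image");
    }
    return base + offset;
}

// Signature assumed for the shellcode entry point: no arguments, no result.
using ShellcodeEntry = void (*)();

// Turns the resolved address into a callable pointer. Calling it is only
// meaningful if the underlying memory is executable, so this sketch never does.
ShellcodeEntry as_function(const std::uint8_t* entry) {
    assert(entry != nullptr);
    return reinterpret_cast<ShellcodeEntry>(
        reinterpret_cast<std::uintptr_t>(entry));
}
```

With a mapped view at `base`, `as_function(resolve_entry(base, size, parse_offset("0x15C")))` would yield the pointer that ends up being called.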
In Figure 4 we can see the tool in action. Having been given the required parameters, the tool has mapped the file containing the shellcode into memory, injected an effective pause (as inspired by Malhost-Setup) and is now displaying instructions on how to load this into a debugger and begin the analysis.
Figure 4 - jmp2it - post execution
Having followed the instructions above, we now find ourselves with the shellcode loaded into a debugger and we can now step through it to allow for an in-depth analysis of the code. We find ourselves in the NOP sled, and several manual steps will take us down to the demo shellcode, to which we can apply a variety of techniques.
Figure 5 - shellcode as loaded into a debugger
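For reference, the ten demo bytes '33 C0 40 3D EF BE AD DE 75 F8' read as 32-bit x86 decode to a simple counting loop. The sketch below lists that reading in comments and mirrors the logic in C++; `count_loop_iterations` is a hypothetical name, and the target is a parameter so the loop can be tried with small values instead of the full 0xDEADBEEF count.

```cpp
#include <cassert>
#include <cstdint>

// Offset  Bytes            Instruction (32-bit x86)
// 0x00    33 C0            xor eax, eax          ; eax = 0
// 0x02    40               inc eax               ; eax = eax + 1
// 0x03    3D EF BE AD DE   cmp eax, 0xDEADBEEF   ; compare with the target
// 0x08    75 F8            jne 0x02              ; loop back to inc eax

// Mirrors the loop above and returns how many times `inc eax` executes
// before eax equals `target`.
std::uint64_t count_loop_iterations(std::uint32_t target) {
    std::uint32_t eax = 0;          // xor eax, eax
    std::uint64_t iterations = 0;
    do {
        ++eax;                      // inc eax (wraps at 2^32, like the register)
        ++iterations;
    } while (eax != target);        // cmp eax, target / jne
    // For any non-zero target the loop runs exactly `target` times.
    assert(target == 0 || iterations == target);
    return iterations;
}
```

Stepping through the real bytes in a debugger should show the same pattern: EAX cleared once, then incremented on every pass until the comparison succeeds.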
I have no doubt there are a number of other excellent tools which assist with this process (including Malhost-Setup, part of the OfficeMalScanner suite). However, I could not find one with the additional functionality I required, which prompted me to write this tool and release it as open source so others may tweak it if required. I have included a copy of the help file below which details current functionality.
jmp2it.exe usage & functionality
Microsoft Windows [Version 6.3.9600]
(c) 2013 Microsoft Corporation. All rights reserved.
** JMP2IT v1.4 - Created by Adam Kramer  - Inspired by Malhost-Setup **
This will allow you to transfer EIP control to a specified offset within a file
containing shellcode and then pause to support a malware analysis investigation
The file will be mapped to memory and maintain a handle, allowing shellcode
to egghunt for second stage payload as would have happened in original loader
Patches / self modifications are dynamically written to jmp2it-flypaper.out
Example: jmp2it.exe malware.doc 0x15C
Explaination: The file will be mapped and code at 0x15C will immediately run
Example: jmp2it.exe malware.doc 0x15C pause
Explaination: As above, with JMP SHORT 0xFE inserted pre-offset causing loop
Example: jmp2it.exe malware.doc 0x15C addhandle another.doc pause
Explaination: As above, but will create additional handle to specified file
Optional extras (to be added after first two parameters):
addhandle - Create an arbatory handle to a specified file
Only one of the following two may be used:
pause - Inserts JMP SHORT 0xFE just before offset causing infinite loop
pause_int3 - Inserts INT3 just before offset
Note: In these cases, you will be presented with step by step instructions
on what you need to do inside a debugger to resume the analysis
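The two pause options rely on well-known x86 encodings: `EB FE` is a short jump to itself (an infinite loop) and `CC` is INT3 (a breakpoint). The following sketch shows one way such bytes could be placed just before an offset. It is an assumption about the general technique, not a copy of how jmp2it does it; `PauseMode` and `insert_pause` are hypothetical names, and overwriting the bytes immediately preceding the offset is one interpretation of "just before".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class PauseMode { JmpShort, Int3 };

// Overwrites the bytes immediately before `offset` with the pause sequence and
// returns the offset at which execution should begin so the pause is hit first.
std::size_t insert_pause(std::vector<std::uint8_t>& image, std::size_t offset,
                         PauseMode mode) {
    const std::vector<std::uint8_t> patch =
        mode == PauseMode::JmpShort
            ? std::vector<std::uint8_t>{0xEB, 0xFE}  // jmp short to itself
            : std::vector<std::uint8_t>{0xCC};       // int3
    if (offset < patch.size() || offset > image.size()) {
        throw std::out_of_range("no room for the pause bytes before offset");
    }
    const std::size_t start = offset - patch.size();
    assert(start + patch.size() <= image.size());
    std::copy(patch.begin(), patch.end(),
              image.begin() + static_cast<std::ptrdiff_t>(start));
    return start;
}
```

With the loop variant, execution spins on the `EB FE` bytes until a debugger is attached and the analyst moves the instruction pointer forward to the original offset; with the INT3 variant, an attached debugger breaks one byte before the offset.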
A feature of particular note is that any in-memory modification of the image is dynamically written to an output file on disk. This will hopefully assist in the analysis of self-modifying and unpacking code, and will retain a copy of any patching done by the analyst.
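When reviewing such an output file, it can help to list exactly which bytes differ from the original. The sketch below compares two equally sized images byte by byte. It is an illustrative helper under stated assumptions, not the mechanism jmp2it uses to write its output; `ByteChange` and `diff_images` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one modified byte.
struct ByteChange {
    std::size_t offset;   // position within the image
    std::uint8_t before;  // value in the original file
    std::uint8_t after;   // value after the code has run or been patched
};

// Lists every position where `current` differs from `original`.
// Both images are assumed to be the same size.
std::vector<ByteChange> diff_images(const std::vector<std::uint8_t>& original,
                                    const std::vector<std::uint8_t>& current) {
    assert(original.size() == current.size());
    std::vector<ByteChange> changes;
    for (std::size_t i = 0; i < original.size(); ++i) {
        if (original[i] != current[i]) {
            changes.push_back({i, original[i], current[i]});
        }
    }
    return changes;
}
```

A cluster of changes inside the shellcode region would point towards self-modifying or unpacking behaviour, while isolated single-byte changes are more likely to be analyst patches.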
I would very much welcome feedback!
— Adam (Twitter: @CyberKramer)