SANS Digital Forensics and Incident Response Blog | Four Focus Areas of Malware Analysis

An analyst is in the middle of a case and finds an executable artifact. Searching for the hash online, running a local copy of jsunpack, and checking VirusTotal do not produce any results. How does one examine the exe to determine whether it is evidence in the case? To do this, the analyst looks at four areas: behavior, memory, code, and intelligence analysis.
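
Searching for a hash online starts with computing it. The following is a minimal sketch using Python's standard hashlib module; the function name file_hashes is hypothetical and chosen only for illustration.

```python
import hashlib

def file_hashes(path, chunk_size=65536):
    """Return MD5, SHA-1, and SHA-256 hex digests of the file at `path`."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so that large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Calling file_hashes on the suspect exe returns a dictionary keyed by algorithm name, and any of the values can be pasted into an online search. Work on a copy of the artifact so the original evidence is not touched.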

Behavior

Behavior analysis means running the malware in a lab to see which services it installs, which registry keys it updates, which sockets it talks on, and which DLLs it loads to run. Various tools do this, such as Capture-BAT and Regshot. Watching the network traffic with Wireshark and FakeDNS answers the question, "Does the software make a connection?" Standard tools, like Process Explorer, will reveal strings (look for strings such as www and http, which suggest call outs to the net). Process Explorer will also verify whether an executable is digitally signed by a legitimate software company (keep in mind that malware writers digitally sign their executables as well).
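
The string check can also be scripted outside of Process Explorer. This is a rough sketch under stated assumptions: it extracts only printable ASCII runs (Windows binaries often store UTF-16 strings too, which this misses), and the function names extract_strings and network_indicators are hypothetical.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return printable ASCII runs of at least `min_len` characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]

def network_indicators(data: bytes):
    """Return extracted strings containing 'http' or 'www' (any case)."""
    return [
        text for text in extract_strings(data)
        if re.search(r"http|www", text, re.IGNORECASE)
    ]
```

Reading the exe with open(path, "rb").read() and passing the bytes to network_indicators gives a short list of candidate call outs to compare against what Wireshark and FakeDNS record.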

Memory

The executable can then be examined while it is running in live memory. This is called memory analysis. There are various tools, such as Mandiant's Memoryze, whose purpose is to rate anomalies in an exe. There is also a great Python script that Harlan Carvey blogged about, called pescanner.py, that reveals some helpful insights about the executable.
A running process can hide by circumventing the linked list of active processes: the links of the entries before and after the malicious process are changed to point past it, so tools that walk the list never see it. More details of this process can be found in the book "Rootkits: Subverting the Windows Kernel." The Volatility framework has a tool, called psscan2, to find hidden processes, along with some other excellent malware plugins.
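
The unlinking trick is easier to see in a toy model. The sketch below is not Windows kernel code and not how Volatility is implemented; it is a plain Python doubly linked list, with hypothetical names throughout, showing why a list walk misses an unlinked process while a scan over every allocated object still finds it.

```python
class Process:
    """Toy stand-in for a kernel process object with forward/back links."""
    def __init__(self, name):
        self.name = name
        self.flink = self  # forward link
        self.blink = self  # backward link

def link_after(node, new):
    """Insert `new` into the circular list right after `node`."""
    new.flink, new.blink = node.flink, node
    node.flink.blink = new
    node.flink = new

def unlink(node):
    """Hide `node`: its neighbours are rewired to point around it."""
    node.blink.flink = node.flink
    node.flink.blink = node.blink

def walk(head):
    """List process names the way a list-walking tool would."""
    names, current = [], head.flink
    while current is not head:
        names.append(current.name)
        current = current.flink
    return names

def build_demo():
    """Return (list head, pool of every allocated process object)."""
    head, pool = Process("<list head>"), []
    tail = head
    for name in ("System", "explorer.exe", "evil.exe", "notepad.exe"):
        proc = Process(name)
        link_after(tail, proc)
        pool.append(proc)
        tail = proc
    return head, pool

def hidden_processes(head, pool):
    """Objects that exist in memory but are missing from the list walk."""
    listed = set(walk(head))
    return [proc.name for proc in pool if proc.name not in listed]
```

After unlink is called on the evil.exe object, walk no longer reports it, but hidden_processes does, because the object itself is still in memory. Comparing a list walk with a memory scan is the same idea behind looking for hidden processes with a scanning plugin.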

Code

At this point, if one does not see much malicious behavior and the newly generated artifacts do not match what has been found in the case, one takes a look at code analysis. The risk is lower, but one would like to dig a little deeper.
The first question to ask is whether the exe is packed. To evade anti-virus (A/V), malware writers will typically compress the code with a packer, such as UPX. To determine whether an exe is packed, run it through LordPE or submit it to Norman's sandbox service; either will reveal such information.
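
When neither tool is at hand, a rough heuristic can hint at packing. This sketch is a different technique from what LordPE or the Norman service does: it measures byte entropy (compressed data looks close to random) and looks for the section names and magic string that unmodified UPX output normally carries. The 7.0 threshold and the function names are assumptions for illustration, and a positive result is only a hint, not proof.

```python
import math
from collections import Counter

# Markers normally present in files packed with an unmodified UPX.
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX!")

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Heuristic: high overall entropy or a UPX marker suggests packing."""
    has_marker = any(marker in data for marker in UPX_MARKERS)
    return has_marker or shannon_entropy(data) >= threshold
```

Whole-file entropy is a blunt measure; a small packed stub inside a large file can slip under the threshold, so a "not packed" answer from this check should be treated with less confidence than a "packed" one.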

An example report for an exe that has been run through Norman Sandbox:

DansProg.exe : Not detected by Sandbox (Signature: NO_VIRUS)

[ DetectionInfo ]
* Filename: C:\analyzer\scan\DansProg.exe.
* Sandbox name: NO_MALWARE
* Signature name: NO_VIRUS.
* Compressed: NO.
* TLS hooks: NO.
* Executable type: Application.
* Executable file structure: OK.
* Filetype: PE_I386.

[ General information ]
* File length: 28160 bytes.
* MD5 hash:
* SHA1 hash:

In the lab, the exe can be examined with a disassembler, like IDA Pro, or a debugger, like OllyDbg. Looking at the assembly code will take time. Some items to look for are risky functions and API calls, like CreateMutex (often used to make sure only one copy of the malware runs), ConnectNamedPipe (used to create malware backdoors), ShellExecute (used to execute an additional program), InternetOpen (used to initialize an internet connection), InternetOpenUrl (used to open an HTTP or FTP session), and InternetReadFile (used to read data from a downloaded file). These can be listed in IDA Pro, or by running the program in OllyDbg and using the Ctrl+N shortcut. One then asks, "Are the API calls doing what the program should do, or is there additional functionality to investigate?" A final check is for Thread Local Storage (TLS) callbacks, which malware uses as a defense against analysis because they run before the program's entry point.
Once the artifacts are collected, it is a good idea to preserve any findings. A good suggestion is the OpenIOC Framework, which feeds the next component, intelligence.
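
Before opening a disassembler, a quick triage pass can flag which of the API names above appear in the file at all. This is a minimal sketch that searches the raw bytes for the names; it does not parse the import table, so it misses imports that are packed, obfuscated, or resolved at run time. The names RISKY_APIS and find_risky_apis are hypothetical.

```python
import re

# API names from the discussion above, with the reason each is of interest.
RISKY_APIS = {
    "CreateMutex": "single-instance check",
    "ConnectNamedPipe": "possible backdoor",
    "ShellExecute": "executes an additional program",
    "InternetOpen": "initializes an internet connection",
    "InternetOpenUrl": "opens an HTTP or FTP session",
    "InternetReadFile": "reads downloaded data",
}

def find_risky_apis(data: bytes):
    """Return {api_name: reason} for each listed API name found in `data`."""
    found = {}
    for name, reason in RISKY_APIS.items():
        # Allow the usual Ex/A/W suffixes, but require the name to end there
        # so that InternetOpenUrlA is not also reported as InternetOpen.
        pattern = re.escape(name.encode("ascii")) + rb"(?:Ex)?[AW]?(?![A-Za-z0-9_])"
        if re.search(pattern, data):
            found[name] = reason
    return found
```

A hit is only a starting point for the question of whether the call fits what the program claims to do; the disassembler or debugger is still needed to see how the call is used.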

Intelligence

Intelligence, as it relates to the environment, is important to the malware analysis process. One needs to know what the infrastructure is made of and where the sensitive data is in order to apply the intelligence gathered. For example, where sensitive data resides on the network and what security controls surround that data are both important for putting malware analysis into perspective. If an environment is not running SCADA devices, then intelligence gathered about SCADA vulnerabilities is moot. However, intel about a particular piece of malware, as it relates to existing security controls in the environment, is important to the overall risk to the enterprise. Open intel sources can be of value in determining the weight of the malware risks. An example would be searching http://www.exploit-db.com/ for exploits against a service or network device running in that particular environment, and then assessing the risk exposure in said environment.
Intelligence can be classified as fact, information, and direct or indirect information. With intelligence, some general questions to ask are, "Is it true?", "Is it the whole truth?", "Is it nothing but the truth?" (Clark 121), "What stage of its life cycle is the malware code in?", and "Is it closed source, an open proof of concept, or weaponized in an exploit kit?" It is important to keep in mind that most malware is easily coded to be "fully undetected" (FUD) by A/V. Whether examining an executable during a case, weighing the risks of using a free VPN program, or evaluating a new malware cleaner, it is worth taking a few minutes to also examine the artifacts and code of the executable.

For more details on many of these processes, take a look at the class SANS 610 and the new material in class SANS 508.

Clark, Robert. Intelligence Analysis: A Target-Centric Approach, 2nd ed. CQ Press, 2007.