This blog post is the second in a series of three blog posts dedicated to quick and useful techniques for analyzing binaries. In my first post, I talked about how penetration testers and other analysts can find and isolate network traffic generated by a binary. This time we'll look at pillaging the various data files that binaries and applications leave lying around. Our focus will again be on the Windows side of things, as that's where we often find the juiciest applications to analyze, including client-side software and internal tools.
In Windows, application data files are often placed in the AppData folder. This folder, specific to each user, can be a gold mine of information about installed applications. AppData (Application Data on Windows XP) lives in each user's home directory and can be reached through the %AppData% environment variable, as in "dir /a %AppData%". On Windows Vista and later, %AppData% resolves to the AppData\Roaming subfolder, while %LocalAppData% points to AppData\Local, which is worth checking as well. (I put in the /a after the dir command to show items regardless of their attributes; some items in the AppData directory are hidden, and dir /a makes sure we see all of them.)
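For anyone who would rather script the directory listing than eyeball it, here is a minimal Python sketch of the same idea. The function names iter_data_files and default_roots are my own for illustration, not part of any tool mentioned in this post, and reading the APPDATA and LOCALAPPDATA environment variables assumes a Windows host. os.walk does not filter on the hidden attribute, so hidden items show up much as they do with dir /a.

```python
import os


def iter_data_files(root):
    """Yield (path, size_in_bytes) for every file below root.

    os.walk does not skip hidden files or folders, so the result is
    comparable to what "dir /a /s" would show.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                continue  # file is locked or disappeared; skip it
            yield full, size


def default_roots():
    """Return the per-user AppData roots that are set in the environment.

    Assumption: APPDATA and LOCALAPPDATA only exist on Windows, so this
    returns an empty list elsewhere.
    """
    return [os.environ[v] for v in ("APPDATA", "LOCALAPPDATA") if v in os.environ]


if __name__ == "__main__":
    for root in default_roots():
        for path, size in iter_data_files(root):
            print(f"{size:>12}  {path}")
```

Sorting the output by size or modification time is an easy next step when a target application has scattered hundreds of files around.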
Here are some useful files to keep an eye out for:
To analyze these application files, as well as any files found in the application's main directory, we can employ a number of different techniques. First and foremost, our good friend, the strings command, proves to be just as useful on data files as it is on binaries. Be on the lookout for the same type of data we talked about in the first blog post, URLs and IP addresses especially. Strings is freely downloadable for Windows from Microsoft Sysinternals. It's also worth noting that the Sysinternals strings looks for both ASCII and Unicode strings by default, returning sequences of three or more characters.
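If strings isn't available on the box, the core idea is easy to approximate. The following is a rough Python sketch, not a reimplementation of the Sysinternals tool: it pulls out runs of printable ASCII and of UTF-16LE-encoded printable ASCII (a simplification of "Unicode"), using the same three-character minimum, and then filters for URLs and IPv4-looking values. The names extract_strings and find_indicators and both regular expressions are my own assumptions; the IP pattern is deliberately loose and will also match things like four-part version numbers.

```python
import re

# Loose patterns for the indicators we care about (illustrative only).
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data, min_len=3):
    """Return printable ASCII and UTF-16LE strings of at least min_len chars."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # UTF-16LE text in the ASCII range is each printable byte followed by a NUL.
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


def find_indicators(strings):
    """Return (sorted URLs, sorted IPv4-looking values) seen in the strings."""
    urls, ips = set(), set()
    for s in strings:
        urls.update(URL_RE.findall(s))
        ips.update(IP_RE.findall(s))
    return sorted(urls), sorted(ips)


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        urls, ips = find_indicators(extract_strings(fh.read()))
    print("URLs:", *urls, sep="\n  ")
    print("IPs:", *ips, sep="\n  ")
```

Reading the whole file into memory is fine for the small data files applications usually leave behind; for multi-gigabyte files a chunked read would be the safer choice.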
Once we have some files on our hands and have done some cursory analysis with strings, we've likely found either a cleartext ASCII file (XML is especially common!) or a binary data file. In the binary case, we have two easy-to-use sources of information to help us identify the type of file: the file extension itself (such as .asc, .xml, or .docx) and the Linux "file" command. The file command can also be downloaded for Windows from GnuWin32. We can use it as follows:
$ file data.ext
data.ext; gzip compressed data, from Unix, last modified: Mon Mar 11 14:35:58 2013
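Under the hood, the file command leans heavily on "magic" bytes at the start of a file. As a rough illustration (nowhere near as thorough as the real tool), the Python sketch below checks a handful of well-known signatures and, for a gzip file like the one above, peeks at the decompressed content so we can identify what is inside. The MAGIC table is a tiny hand-picked subset, and the names guess_type, identify, and peek_gzip are hypothetical helpers of my own.

```python
import gzip

# A tiny, hand-picked subset of well-known file signatures.
MAGIC = [
    (b"\x1f\x8b", "gzip compressed data"),
    (b"PK\x03\x04", "Zip archive data (also .docx, .jar)"),
    (b"SQLite format 3\x00", "SQLite 3.x database"),
    (b"<?xml", "XML document"),
]


def guess_type(header):
    """Return a description for the first signature that matches header."""
    for signature, description in MAGIC:
        if header.startswith(signature):
            return description
    return "unknown"


def identify(path):
    """Guess a file's type from its first 16 bytes."""
    with open(path, "rb") as fh:
        return guess_type(fh.read(16))


def peek_gzip(path, count=256):
    """Return the first count bytes of the decompressed gzip stream."""
    with gzip.open(path, "rb") as fh:
        return fh.read(count)


if __name__ == "__main__":
    import sys

    kind = identify(sys.argv[1])
    print(f"{sys.argv[1]}: {kind}")
    if kind == "gzip compressed data":
        # Compressed files often wrap something more interesting.
        print("  contains:", guess_type(peek_gzip(sys.argv[1])))
```

Running the identification again on the decompressed bytes is the useful part: a gzip wrapper around XML or a SQLite database is a common pattern in application data.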
Let's take a look at some of the more common data file formats we run into and how to extract data from them.
If the file format isn't in the above list, Google is a great resource for finding tools to analyze specific file types. If all else fails, see if a local installation of the application that created the file can be used to read it and extract its data. A great example of this would be copying a Firefox profile locally and importing it into a local Firefox installation to view cookies and history.
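When the data file turns out to be a SQLite database, which is how Firefox stores cookies and history, there is an alternative to loading the profile into the application: query a copy of the file directly. This is a different technique from the import approach just described, sketched here with Python's built-in sqlite3 module. The helper names list_tables and dump_table are my own, and the cookies.sqlite file name and moz_cookies table in the usage note are assumptions about the Firefox profile layout that may vary by version. Always work on a copy so the original evidence is not modified.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        con.close()


def dump_table(db_path, table, limit=20):
    """Return (column_names, rows) for up to limit rows of the given table."""
    con = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cur = con.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
        columns = [d[0] for d in cur.description]
        return columns, cur.fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    import sys

    db = sys.argv[1]  # path to a COPY of the database file
    for table in list_tables(db):
        columns, rows = dump_table(db, table, limit=5)
        print(table, columns)
        for row in rows:
            print("   ", row)
```

Pointed at a copied cookies.sqlite, for example, this would list the tables and let you call dump_table on moz_cookies, assuming that table exists in the profile you grabbed.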
In our next (and last) installment of this blog post series, we will look at some simple techniques for decompiling executables and DLLs to give us that last bit of insight we might need. After all, what could be more juicy when analyzing a binary than being able to peer right at the source? Until then... signing off —