A couple of weeks ago, Quinn Shamblin posted his article on recovering MP3 data from unallocated space. This set me to thinking. The methods he described seemed generally applicable to other types of multimedia content, but I'm not an expert on those file formats, so I went looking. A few comments back and forth later (thanks, drpaha!), I had a new tool to try out: Defraser. From the SourceForge project page:
"Defraser is a forensic analysis application that can be used to detect full and partial multimedia files in datastreams. It is typically used to find (and restore) complete or partial audio/video files in datastreams (for instance, unallocated diskspace)"
I downloaded and installed the tool (Late note: it's under active development. As I'm writing this, they just released a new version), then exported all of the unallocated space (15GB of it) from a disk image I had handy, created a new project inside Defraser, and added the unallocated space for analysis. After the processing completed (a few hours later), I selected the root of the displayed multimedia tree, right-clicked, and chose 'Save Selection as Separate Files'. Defraser then dumped all of the media streams it had detected, about 600 individual files, into the folder I specified. All was not beer & pretzels, however. Going through these to categorize them was quite tedious: many were corrupted (at least as far as VLC Media Player, which I was using to play them, was concerned), and only a very few had any playable content.
Note, by the way, that while I'm primarily concerned with video content here, Defraser also extracts audio-only content.
What I really needed at this point was a way to dump out some reference frames from the video content so that I could easily identify those clips that contained responsive content. Google is your friend! A short bout of Internet research later, I'd discovered FFmpeg. From its home page:
"FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video. It includes libavcodec - the leading audio/video codec library."
I compiled and installed this tool under Cygwin using the following commands:
./configure
make
make install
Unofficial precompiled Windows builds are also available from http://www.ffmpeg.org/
Then I simply changed directories into the folder where I'd saved the files extracted by Defraser, and ran the tool with the appropriate options on each of those files:
for I in *; do ffmpeg -i "$I" -y -ss 5 -an -sameq -r 1/5 "$I%03d.jpg"; done
The result was a set of .jpg files with the same initial names as the files from which they were derived, but with a three-digit number and .jpg appended. To my delighted surprise, many of the 'corrupted' files that I had been unable to play with VLC were apparently parseable by ffmpeg, so I found myself in possession of reference frames for them as well. In the above command line, -ss 5 sets the initial offset before the first reference frame to five seconds, and -r 1/5 sets the output rate to one frame every five seconds. In addition, the ffplay utility that comes with ffmpeg is sufficiently versatile that I'm now considering using it to replace VLC as my primary multimedia viewer. It will play many of the 'corrupted' files that Defraser extracts.
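For readers who want to repeat the frame extraction and then see which extracted files yielded no frames at all, here is a minimal sketch. It is not the exact command used above: the function names extract_frames and list_without_frames are hypothetical, and the ffmpeg options are an assumption for newer builds, where (to my knowledge) -sameq is no longer accepted, so -q:v 2 and the fps filter stand in for -sameq and -r 1/5. It also assumes the extracted file names contain no % characters, since ffmpeg treats % in the output name as part of the numbering pattern.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the ffmpeg options are an
# assumed modern equivalent of the command line shown in the article.

# Extract one JPEG every 5 seconds, starting 5 seconds in, from every
# non-JPEG file in the given directory. Requires ffmpeg on the PATH.
extract_frames() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case "$f" in *.jpg) continue ;; esac   # skip frames from earlier runs
        # -vf fps=1/5 emits one frame per five seconds of video;
        # -q:v 2 requests high JPEG quality (stand-in for the older -sameq).
        ffmpeg -nostdin -loglevel error -i "$f" -y -ss 5 -an \
            -vf fps=1/5 -q:v 2 "${f}%03d.jpg" || true
    done
}

# Print every source file for which no "<name>NNN.jpg" frame exists.
list_without_frames() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case "$f" in *.jpg) continue ;; esac
        found=no
        for frame in "$f"[0-9][0-9][0-9].jpg; do
            if [ -e "$frame" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running extract_frames in the export folder and then list_without_frames on the same folder gives a short list of files worth trying by hand in ffplay, since a clip can fail frame extraction and still be playable.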
I tried this again on the same unallocated space file after the most recent update to Defraser, and it found a substantially larger number (over 142,000) of video clips, probably due to the new MPEG-4 detector. The ffmpeg reference frame extraction technique was really valuable here, as it only extracted frames from the valid clips, and there weren't very many more of those. However, I did note, in reviewing a few of the larger clips manually, that it's apparently possible for a clip to be damaged in such a way that reference frame extraction will fail, but the clip will still be playable with ffplay. I also tried the new Defraser version on the unallocated space from another image, only 11GB this time. It proceeded to run my x64 system with 8GB of RAM out of memory, and ran overnight before I finally killed it. I then broke the file down into 2GB chunks and was able to analyze it, though the application did throw an odd error on several of the chunks (I filed a bug report). Deselecting the 3GPP/QT/MP4 Container Detector allowed me to successfully complete processing of those chunks. None of the chunks caused particularly high memory utilization by the application.
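Breaking a large unallocated-space export into fixed-size pieces can be done with the standard split utility. The sketch below is one possible way to do it, not necessarily how the chunks above were produced; split_chunks is a hypothetical helper name, and the default of 2048m is simply 2GB expressed with a suffix that POSIX split accepts. One caveat to keep in mind: a clip that straddles a chunk boundary will be cut in two, so results near the boundaries may differ from a scan of the whole file.

```shell
#!/bin/sh
# Sketch only: split_chunks is a hypothetical helper name.
# Usage: split_chunks <source file> [chunk size] [output prefix]
# Chunks are written as <prefix>aa, <prefix>ab, ... in the original byte order.
split_chunks() {
    src=$1
    size=${2:-2048m}            # default: 2GB per chunk
    prefix=${3:-"$src.part."}
    split -b "$size" "$src" "$prefix"
}
```

For example, split_chunks unallocated.bin would leave unallocated.bin.part.aa, unallocated.bin.part.ab, and so on next to the original (the file name here is made up), and each piece can then be added to a Defraser project on its own.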
It's worth mentioning that the Defraser application itself displays the following notes:
Warning: Known issue: scanning a file using particular combinations of detectors may yield varying results.
Advice: First scan using the Container Detectors, optionally combined with Codec Detectors. To see if anything was missed, do a subsequent scan using only the Codec Detectors.
I'm not sure how comprehensive Defraser really is, but I found a number of interesting unallocated video clips on my test image, and both of these tools are definitely going in the box for future reference. Enjoy!
John McCash, GCFA Silver #2816, is currently a Forensic Investigator employed by a Fortune 500 telecommunications equipment provider.