PDF Malware Overview

Author: Joel Yonts
July 19, 2010

Introduction to PDF Malware

PDF files are so common today it is hard to imagine (or remember) what life was like without them.  Business proposals, product manuals, legal documents, and online game guides are just a sampling of places we see the Portable Document Format.  The utility of this format comes in its ability to deliver "rich" content to a wide array of end users with little regard to the platform or viewer application.  Since its initial release in 1993 (Adobe Systems, Inc.), this format has risen in popularity to become the de facto standard for information sharing.  

The "rich" content elements described by this standard include both static and dynamic elements. Table 1 presents an abbreviated list of common PDF elements. 

Static Elements
  • Text Blocks & Styles
  • Character Encodings & Font Selection
  • Multimedia Support

Dynamic Elements
  • Embedded JavaScript & ActionScript Support
  • Dynamic Action Triggers (e.g. "On Open")
  • Retrieve "Live" Data (Network URL Based)

Table 1: Common Static & Dynamic PDF Features

Combined, these elements can deliver a visually appealing, interactive, and portable document. While we have all benefited from this feature-rich information sharing venue, there is a darker side.  The dynamic PDF capabilities mentioned above can be, and have been, used to house malicious content.  As far back as 2001 (Peachy Worm), cyber criminals have used embedded malicious scripts and other dynamic PDF features to install malware and steal user credentials. While the goals and technical payloads of these PDFs have changed over the years, the pattern for creating a malicious PDF remains largely unchanged.

Pattern of PDF Malicious Behavior

Typically, the malicious behavior in PDF malware is contained within one or more embedded scripts.  These embedded scripts can be written in any of the PDF-supported scripting languages, with JavaScript being the most popular.  In most cases these scripts implement "dropper" functionality, whereby additional OS-based malware is installed on the victim's system.  Figure 1 describes a typical PDF malware infection in greater detail.

Figure 1: Pattern for Malicious PDF Execution

1. User opens a malicious PDF (Document loaded into PDF Viewer)
2. An embedded script is set to execute "On Open"
3. Script extracts and decodes embedded malware
-- and / or --
4. Script downloads malware from an internet site
5. New malware is installed on the victim's system
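
The pattern above suggests a simple triage check: a document that pairs an automatic trigger with a script or launch action deserves a closer look. The following is a minimal sketch in Python, not a complete detector. The keyword list, the function names, and the "trigger plus payload" rule are assumptions chosen for illustration; the only PDF details relied on are that actions are declared with names such as /OpenAction and /JS, and that names may contain #xx hex escapes.

```python
import re
from collections import Counter

# Illustrative (not exhaustive) PDF names associated with dynamic behavior.
TRIGGER_NAMES = (b"/OpenAction", b"/AA")
PAYLOAD_NAMES = (b"/JavaScript", b"/JS", b"/Launch")

_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def normalize_names(data: bytes) -> bytes:
    """Decode #xx escapes so that /J#61vaScript is counted as /JavaScript.

    The substitution is applied to the whole file, which is acceptable for
    counting purposes but would corrupt binary streams if written back out.
    """
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


def count_dynamic_names(data: bytes) -> Counter:
    """Count occurrences of each watched name in the raw file bytes."""
    data = normalize_names(data)
    counts = Counter()
    for name in TRIGGER_NAMES + PAYLOAD_NAMES:
        # Require a non-alphanumeric character (or end of data) after the
        # name so that /JS does not also match a longer name such as /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    return counts


def looks_dynamic(data: bytes) -> bool:
    """Assumed rule: flag a file with both an auto-trigger and a payload name."""
    counts = count_dynamic_names(data)
    has_trigger = any(counts[n.decode()] for n in TRIGGER_NAMES)
    has_payload = any(counts[n.decode()] for n in PAYLOAD_NAMES)
    return has_trigger and has_payload
```

In practice the bytes would come from something like `open(path, "rb").read()`. A match is only a reason to inspect the file further, since legitimate forms also use these features, and names hidden inside compressed object streams will not be seen by a raw byte scan like this one.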

Makers of PDF Viewers quickly realized these dynamic features required additional safeguards to ensure they were not used in this malicious fashion.  As a result, PDF Readers began warning users about actions such as internet access or local filesystem access.

Figure 2: Warning of Potential Malicious Behavior

This posed only a minor challenge to the attacker.  To counter it, the attacker would often use additional social engineering, or make the lure so enticing that the user ignores the warning and dismisses the dialog.

A more sophisticated answer to this attacker dilemma is the use of specially crafted PDF Reader exploits.  In this scenario, attackers embed exploit code within the PDF document that is designed to bypass the Reader's security controls and execute the malicious content without warning the user.

Buffer Overflow & Heap Spraying

The combination of Buffer Overflow (BOF) + Heap Spraying is the most common exploitation technique used by malicious PDFs.  The BOF vulnerability usually attacks one or more of the PDF Reader's parsing engines with the intent of flowing data past the end of a buffer boundary. The attacker ensures this "overflow" data is actually shellcode (a small program written in machine code) that will give the attacker additional control over the system when executed.

The attacker rarely controls where this "overflow" data is written, so they improve the odds of their malicious code executing by writing it into many memory areas.  The technique of writing shellcode to multiple heap memory areas is known as Heap Spraying.
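
Because heap spraying means writing the same data into memory many times, scripts that do it tend to share a few visible traits that an analyst can look for once the JavaScript has been extracted from a PDF. The sketch below is a rough heuristic in Python, not a reliable detector. The four indicators, the regular expressions, and the function names are all assumptions made for illustration, and obfuscated scripts can evade every one of them.

```python
import re

# Assumed indicators of spray-style JavaScript (illustrative only).
_UNESCAPE_CALL = re.compile(r"\bunescape\s*\(")              # decoding packed bytes
_ESCAPE_RUN = re.compile(r"(?:%u[0-9A-Fa-f]{4}){4,}")        # run of 4+ %uXXXX units
_STRING_DOUBLING = re.compile(r"\b(\w+)\s*\+=\s*\1\b")       # e.g. s += s
_LOOP = re.compile(r"\b(?:for|while)\s*\(")                  # repeated allocation


def heap_spray_indicators(script: str) -> dict:
    """Report which of the assumed indicators appear in the script text."""
    return {
        "unescape_call": bool(_UNESCAPE_CALL.search(script)),
        "escape_run": bool(_ESCAPE_RUN.search(script)),
        "string_doubling": bool(_STRING_DOUBLING.search(script)),
        "loop": bool(_LOOP.search(script)),
    }


def heap_spray_score(script: str) -> int:
    """Number of indicators present, from 0 (none) to 4 (all)."""
    return sum(heap_spray_indicators(script).values())
```

A script that decodes a block of `%uXXXX` values, doubles the resulting string in a loop, and then copies it into a large array would score 4, while a script that only shows an alert would score 0. Where to draw the line between the two is a judgment call for the analyst; no threshold is implied by the source.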

This approach to exploitation has the potential to infect a larger user population but has its limitations.  These exploits are specific to a particular PDF Reader and are usually somewhat short-lived (or should be, with a good patch management program).

Figure 3 shows the history of vulnerabilities discovered within the popular Adobe Reader.

Figure 3: A History of Adobe Reader Vulnerabilities (National Vulnerability Database)

Most of the high-severity vulnerabilities noted in Figure 3 were/are candidates for malicious PDFs.  It is easy to see that even if these exploits are short-lived, their escalating rate of occurrence makes them a considerable issue.

PDF Malware Life-Cycle

To date, PDF Malware has fallen purely into the Trojan category of malware.  As with other Trojans, the good news is that your known-good PDFs will not become "infected" after you open a malicious PDF.  Each malicious PDF is custom made and contains no reproductive capabilities.  Once created, these PDFs are primarily delivered via spam email, with web sites hosting malicious PDFs a distant second.  In either case, social engineering is usually used to entice users to open these files and unleash the hidden dynamic content.  The social engineering ranges from messages with poor grammar and spelling to highly sophisticated targeted attacks that have the potential to fool even the most highly trained users.

Defending Against PDF Malware

A good enterprise defense against PDF Malware begins with a strong email and web filter.  The goal of this layer is to greatly reduce the volume of malicious PDFs that make it into the enterprise's backend systems.  The malicious PDFs that make it through the initial filtering layer should be further reduced by passing through layers of IPS, Anti-Virus Scanning, and potentially sandboxing technology.  The small percentage of PDF malware that makes it to the end user is hopefully met by a well-trained and aware user who knows the potential dangers lurking in suspect PDFs.

One very powerful addition to this defensive approach is the implementation of application controls to limit potentially malicious PDF Reader behaviors.  Examples of application controls include:

  1. Disabling JavaScript support within the PDF Reader
  2. Disabling automatic rendering of PDFs in browsers
  3. Blocking PDF Readers from accessing the filesystem and network resources using Host IPS, Process Control, or Process Whitelisting Technology

While application controls can be very effective, they may break some desirable user functionality and may prevent the Reader from patching itself.  Both of these obstacles can be overcome, but care should be taken when imposing these controls.

More on PDF Malware

PDF Malware: Pidief

References

  1. Adobe Acrobat family: Adobe Portable Document Format
  2. PDF Malware Analysis
  3. National Vulnerability Database
  4. How to mitigate Adobe PDF malware attacks