Train with the World's Best Cybersecurity Instructors - Special Offers Available Now - Learn More!

Reading Room

Subscribe to SANS Newsletters

Join the SANS Community to receive the latest curated cyber security news, vulnerabilities and mitigations, training opportunities, and our webcast schedule.






Forensics

Featuring 102 Papers as of March 3, 2021

  • PDF Metadata Extraction with Python by Christopher A. Plaisance - February 5, 2019 

    This paper explores techniques for programmatically extracting metadata from PDF files using Python. It begins by detailing the internal structure of PDF documents, focusing on the internal system of indirect references and objects within the PDF binary, the document information dictionary metadata type, and the XMP metadata type contained in the file’s metadata streams. Next, the paper explores the most common means of accessing PDF metadata with Python, the high-level PyPDF and PyPDF2 libraries. This examination discovers deficiencies in the methodologies used by these modules, making them inappropriate for use in digital forensics investigations. An alternative low-level technique of carving the PDF binary directly with Python, using the re module from the standard library is described, and found to accurately and completely extract all of the pertinent metadata from the PDF file with a degree of completeness suitable for digital forensics use cases. These low-level techniques are built into a stand-alone open source Linux utility, pdf-metadata, which is discussed in the paper’s final section.

  • View All Forensics Papers

Most of the computer security white papers in the Reading Room have been written by students seeking GIAC certification to fulfill part of their certification requirements and are provided by SANS as a resource to benefit the security community at large. SANS attempts to ensure the accuracy of information, but papers are published "as is". Errors or inconsistencies may exist or may be introduced over time as material becomes dated. If you suspect a serious error, please contact webmaster@sans.org.

All papers are copyrighted. No re-posting or distribution of papers is permitted.

SANS.edu Graduate Student Research - This paper was created by a SANS Technology Institute student as part of the graduate program curriculum.