PDF Metadata Extraction with Python

This paper explores techniques for programmatically extracting metadata from PDF files using Python. It begins by detailing the internal structure of PDF documents, focusing on the system of indirect references and objects within the PDF binary, the document information dictionary metadata...
By Christopher A. Plaisance
February 5, 2019

All papers are copyrighted. No re-posting of papers is permitted.
