pdfnaut
Warning
This library is currently in a very early stage of development. Expect bugs or issues.
pdfnaut aims to become a PDF processor for Python capable of reading and writing PDF documents.
pdfnaut provides high-level APIs for performing the following actions:
Reading compressed & encrypted PDF documents (AES/ARC4, see note below).
Inspecting PDF structure.
Viewing and editing document metadata.
Viewing and editing document outlines.
Appending, inserting, and removing pages.
Building PDFs from scratch.
Install
pdfnaut can be installed from PyPI and requires Python 3.10 or later.
python3 -m pip install pdfnaut
python -m pip install pdfnaut
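Because the package needs Python 3.10 or later, checking the interpreter first can save a failed install. The following is a minimal sketch that uses only the standard library; `MIN_VERSION` and `meets_requirement` are illustrative names and are not part of pdfnaut.

```python
import sys

# Minimum interpreter version stated in the install notes above.
MIN_VERSION = (3, 10)


def meets_requirement(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if meets_requirement():
    print("This interpreter is new enough for pdfnaut.")
else:
    print("pdfnaut needs Python 3.10 or later; this interpreter is older.")
```

Run it with the same interpreter you intend to pass to `-m pip install`, since `python` and `python3` may point at different installations.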
Important
To use pdfnaut for reading encrypted or protected documents, you must also install a crypt provider as described in Standard Security Handler.
Examples
pdfnaut exposes its core API through the PdfDocument class, which supports common operations on a PDF. For example, to iterate over the content stream of the first page in the document:
from pdfnaut import PdfDocument
pdf = PdfDocument.from_filename("tests/docs/sample.pdf")
for operator in pdf.pages[0].content_stream:
    print(operator)
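Content streams can be long, so it may be handy to look at only the first few operators. The sketch below is one way to do that with `itertools.islice`. `preview` and `preview_first_page` are hypothetical helpers, not part of pdfnaut, and the only pdfnaut calls used are the ones from the loop above. The list of strings is a stand-in for real operator objects.

```python
from itertools import islice


def preview(items, limit=5):
    """Return at most `limit` items from any iterable as a list."""
    return list(islice(items, limit))


def preview_first_page(path, limit=5):
    """Preview the first operators of page 1 (requires pdfnaut)."""
    # Imported lazily so `preview` can be tried without pdfnaut installed.
    from pdfnaut import PdfDocument

    pdf = PdfDocument.from_filename(path)
    return preview(pdf.pages[0].content_stream, limit)


# Stand-in data for illustration only; real operators come from pdfnaut.
fake_operators = ["BT", "Tf", "Td", "Tj", "ET", "q", "Q"]
print(preview(fake_operators, limit=3))
```

With a real file, `preview_first_page("tests/docs/sample.pdf")` would return the first five entries that the loop above prints.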
Reading document information from a PDF is also simple:
from pdfnaut import PdfDocument
pdf = PdfDocument.from_filename("tests/docs/sample.pdf")
print(pdf.doc_info.title)
print(pdf.doc_info.author)
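Real-world files often lack some metadata, so reading these fields defensively can avoid surprises. The sketch below assumes that `doc_info`, or either field on it, may be `None` when the document does not carry that information. `summarize_info` and `summarize_file` are hypothetical helpers, not part of pdfnaut, and rely only on the `doc_info.title` and `doc_info.author` attributes shown above.

```python
from types import SimpleNamespace


def summarize_info(pdf) -> dict:
    """Return title and author from a document-like object, with fallbacks.

    Only assumes `pdf` exposes `doc_info.title` and `doc_info.author`.
    """
    info = getattr(pdf, "doc_info", None)
    return {
        "title": getattr(info, "title", None) or "(untitled)",
        "author": getattr(info, "author", None) or "(unknown)",
    }


def summarize_file(path: str) -> dict:
    """Summarize a PDF on disk (requires pdfnaut)."""
    # Imported lazily so `summarize_info` can be tried without pdfnaut installed.
    from pdfnaut import PdfDocument

    return summarize_info(PdfDocument.from_filename(path))


# Stand-in objects that mimic the attribute layout, for illustration only.
with_info = SimpleNamespace(doc_info=SimpleNamespace(title="Sample", author=None))
without_info = SimpleNamespace(doc_info=None)

print(summarize_info(with_info))
print(summarize_info(without_info))
```

Returning a plain dictionary keeps the helper independent of pdfnaut's own types, which makes it straightforward to log or serialize the result.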