Objects Reference¶
Alongside the basic object types documented in the COS Objects Reference, pdfnaut implements high-level objects mainly for use with PdfDocument.
Action Objects¶
- class pdfnaut.objects.actions.Action[source]¶
Bases: PdfDictionary
An action instructs the PDF reader to perform an operation, such as opening an application, going to a page in the document, or playing a sound, when an annotation or outline item is activated.
See ISO 32000-2:2020 § 12.6 “Actions” for details.
- __init__(subtype: Literal['GoTo', 'GoToR', 'GoToE', 'GoToDPart', 'Launch', 'Thread', 'URI', 'Sound', 'Movie', 'Hide', 'Named', 'SubmitForm', 'ResetForm', 'ImportData', 'SetOCGState', 'Rendition', 'Trans', 'GoTo3DView', 'JavaScript', 'RichMediaExecute'], next_action: list[Action] | Action | None = None) None[source]¶
- property next_action: list[Action] | Action | None¶
The next action or sequence of actions that shall be performed after this action.
- subtype: Annotated[Literal['GoTo', 'GoToR', 'GoToE', 'GoToDPart', 'Launch', 'Thread', 'URI', 'Sound', 'Movie', 'Hide', 'Named', 'SubmitForm', 'ResetForm', 'ImportData', 'SetOCGState', 'Rendition', 'Trans', 'GoTo3DView', 'JavaScript', 'RichMediaExecute'], 'name']¶
The type of action.
Refer to ISO 32000-2:2020 “Table 201 - Action types” for available types.
- class pdfnaut.objects.actions.GoToAction[source]¶
Bases: Action
A go-to action changes the view to a specified destination.
See ISO 32000-2:2020 § 12.6.4.2 “Go-To actions” for details.
- __init__(destination: PdfName | PdfHexString | bytes | Destination, next_action: list[Action] | Action | None = None) None[source]¶
- property destination: PdfName | PdfHexString | bytes | Destination¶
The destination to jump to.
- subtype: Annotated[Literal['GoTo', 'GoToR', 'GoToE', 'GoToDPart', 'Launch', 'Thread', 'URI', 'Sound', 'Movie', 'Hide', 'Named', 'SubmitForm', 'ResetForm', 'ImportData', 'SetOCGState', 'Rendition', 'Trans', 'GoTo3DView', 'JavaScript', 'RichMediaExecute'], 'name']¶
The type of action.
Refer to ISO 32000-2:2020 “Table 201 - Action types” for available types.
- class pdfnaut.objects.actions.URIAction[source]¶
Bases: Action
A URI action causes a URI (uniform resource identifier) to be resolved.
See ISO 32000-2:2020 § 12.6.4.8 “URI actions” for details.
- __init__(uri: str, is_map: bool = False, next_action: list[Action] | Action | None = None) None[source]¶
- subtype: Annotated[Literal['GoTo', 'GoToR', 'GoToE', 'GoToDPart', 'Launch', 'Thread', 'URI', 'Sound', 'Movie', 'Hide', 'Named', 'SubmitForm', 'ResetForm', 'ImportData', 'SetOCGState', 'Rendition', 'Trans', 'GoTo3DView', 'JavaScript', 'RichMediaExecute'], 'name']¶
The type of action.
Refer to ISO 32000-2:2020 “Table 201 - Action types” for available types.
- pdfnaut.objects.actions.action_into(mapping: PdfDictionary) Action[source]¶
Converts a dictionary mapping into a corresponding Action subclass.
Annotation Objects¶
- class pdfnaut.objects.annotations.Annotation[source]¶
Bases: PdfDictionary
An annotation associates an object such as a note, link, or multimedia element with a location on a page of a PDF document.
See ISO 32000-2:2020 § 12.5 “Annotations” for details.
- __init__(kind: Literal['Text', 'Link', 'FreeText', 'Line', 'Square', 'Circle', 'Polygon', 'PolyLine', 'Highlight', 'Underline', 'Squiggly', 'StrikeOut', 'Caret', 'Stamp', 'Ink', 'Popup', 'FileAttachment', 'Sound', 'Movie', 'Screen', 'Widget', 'PrinterMark', 'TrapNet', 'Watermark', '3D', 'Redact', 'Projection', 'RichMedia'], rect: Iterable[float], contents: str, name: str, *, indirect_ref: PdfReference | None = None) None[source]¶
- color: PdfArray[float] | None¶
An array of 0 to 4 numbers in the range 0.0 to 1.0, representing a color used for the following purposes:
The background of the annotation’s icon when closed.
The title bar of the annotation’s popup window.
The border of a link annotation.
The number of array elements determines the color space in which the color shall be defined: 0 is no color or transparent; 1 is grayscale; 3 is RGB; and 4 is CMYK.
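The rule above, where the length of the color array selects the color space, can be sketched in plain Python. The helper name `color_space_for` is hypothetical and is not part of pdfnaut; it only mirrors the mapping stated for the color attribute.

```python
def color_space_for(components: list[float]) -> str:
    """Name the color space implied by the length of a color array."""
    if any(not 0.0 <= c <= 1.0 for c in components):
        raise ValueError("color components must lie in the range 0.0 to 1.0")

    # Mapping taken from the attribute description: 0, 1, 3, or 4 numbers.
    spaces = {0: "transparent", 1: "grayscale", 3: "RGB", 4: "CMYK"}
    try:
        return spaces[len(components)]
    except KeyError:
        raise ValueError("a color array holds 0, 1, 3, or 4 numbers") from None


print(color_space_for([1.0, 0.0, 0.0]))  # RGB
```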
- contents: str¶
The text contents that shall be displayed when the annotation is open or, if this annotation kind does not display text, an alternate description of the annotation’s contents.
- flags: AnnotationFlags¶
Flags specifying various characteristics of the annotation.
- kind: Literal['Text', 'Link', 'FreeText', 'Line', 'Square', 'Circle', 'Polygon', 'PolyLine', 'Highlight', 'Underline', 'Squiggly', 'StrikeOut', 'Caret', 'Stamp', 'Ink', 'Popup', 'FileAttachment', 'Sound', 'Movie', 'Screen', 'Widget', 'PrinterMark', 'TrapNet', 'Watermark', '3D', 'Redact', 'Projection', 'RichMedia']¶
The kind of annotation. See ISO 32000-2:2020 “Table 171 - Annotation types” for details.
- language: str | None¶
(PDF 2.0) A language identifier specifying the natural language for all text in the annotation except where overridden by other explicit language specifications.
See ISO 32000-2:2020 § 14.9.2 “Natural language specification” for details.
- class pdfnaut.objects.annotations.AnnotationBorderStyle[source]¶
Bases: PdfDictionary
The border style for the outline that surrounds an annotation.
See ISO 32000-2:2020 § 12.5.4 “Border styles” for details.
- __init__(width=1, style='S', dash_pattern=None)¶
- dash_pattern: list[int | float] | None¶
The dash pattern that will be used for the border if the style specified is dashed. The array consists of alternating dashes and gaps. The dash phase is not specified and is assumed to be zero.
- style: Literal['S', 'D', 'B', 'I', 'U']¶
The border style. May be one of the following:
S: A solid rectangle.
D: A dashed rectangle specified by AnnotationBorderStyle.dash_pattern.
B: A simulated embossed (beveled) rectangle.
I: A simulated engraved (inset) rectangle.
U: An underline.
- class pdfnaut.objects.annotations.AnnotationFlags[source]¶
Bases: IntFlag
Flags for a particular annotation.
See ISO 32000-2:2020 § 12.5.3 “Annotation flags” for details.
- HIDDEN = 2¶
Do not render the annotation or allow user interaction with it.
- INVISIBLE = 1¶
If the annotation is non-standard, do not render or print the annotation.
If this flag is clear, the annotation shall be rendered according to its appearance stream.
- LOCKED = 128¶
Do not allow the annotation to be removed or its properties to be modified but still allow its contents to be modified.
- LOCKED_CONTENTS = 512¶
Do not allow the contents of the annotation to be modified.
- NO_ROTATE = 16¶
Do not rotate the annotation to match the page’s rotation.
- NO_VIEW = 32¶
Do not render the annotation or allow user interaction with it, but still allow printing according to the AnnotationFlags.PRINT flag.
- NO_ZOOM = 8¶
Do not scale the annotation’s appearance to the page’s zoom factor.
- NULL = 0¶
A default value meaning that no flags are set.
- PRINT = 4¶
Print the annotation when the page is printed unless AnnotationFlags.HIDDEN is set. If clear, do not print the annotation.
- READ_ONLY = 64¶
Do not allow user interaction with the annotation. This is ignored for Widget annotations.
- TOGGLE_NO_VIEW = 256¶
Toggle the AnnotationFlags.NO_VIEW flag when selecting or hovering over the annotation.
- __new__(value)¶
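Because AnnotationFlags is an IntFlag, its members combine with bitwise operators. The sketch below uses a stand-in class that copies the values documented above rather than importing pdfnaut, so it shows how the flag arithmetic behaves and is not the library's own code.

```python
from enum import IntFlag


class AnnotationFlags(IntFlag):
    """Stand-in copy of the documented flag values (not imported from pdfnaut)."""
    NULL = 0
    INVISIBLE = 1
    HIDDEN = 2
    PRINT = 4
    NO_ZOOM = 8
    NO_ROTATE = 16
    NO_VIEW = 32
    READ_ONLY = 64
    LOCKED = 128
    TOGGLE_NO_VIEW = 256
    LOCKED_CONTENTS = 512


# Combine flags with bitwise OR; the result still behaves like an int.
flags = AnnotationFlags.PRINT | AnnotationFlags.NO_ZOOM
print(int(flags))  # 12

# Test membership with `in`, and clear a flag with AND NOT.
print(AnnotationFlags.PRINT in flags)  # True
flags &= ~AnnotationFlags.NO_ZOOM
print(AnnotationFlags.NO_ZOOM in flags)  # False
```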
- class pdfnaut.objects.annotations.AnnotationList[source]¶
Bases: MutableSequence[Annotation]
A mutable sequence representing the list of annotations (the Annots key) in a page object.
- append(value: Annotation) None[source]¶
Appends an annotation value to the list.
- extend(values: Iterable[Annotation]) None[source]¶
Extends the annotation list by appending values to its end.
- insert(index: int, value: Annotation) None[source]¶
Inserts an annotation value at index.
- pop(index: int = -1) Annotation[source]¶
Removes and returns the annotation at index.
- remove(value: Annotation) None[source]¶
Removes an annotation value from the list.
- class pdfnaut.objects.annotations.AnnotationReplyType[source]¶
Bases: Enum
The reply type, or relationship, between an annotation and the annotation referenced by its MarkupAnnotation.in_reply_to value.
- GROUP = 0¶
The annotation shall be grouped with the annotation replied to.
- REPLY = 0¶
The annotation is considered a reply to another annotation.
- class pdfnaut.objects.annotations.LinkAnnotation[source]¶
Bases: Annotation
A link annotation represents either a hypertext link to a location within the document or an action to perform.
See ISO 32000-2:2020 § 12.5.6.5 “Link annotations” for details.
- __init__(rect: Iterable[float], contents: str, name: str, action: Action | None = None, destination: PdfName | PdfHexString | bytes | Destination | None = None, *, indirect_ref: PdfReference | None = None) None[source]¶
- property action: Action | None¶
The action that shall be performed when the link annotation is triggered.
- property border_style: AnnotationBorderStyle | None¶
The border style specifying the line width and dash pattern that shall be used when drawing the annotation outline.
- color: PdfArray[float] | None¶
An array of 0 to 4 numbers in the range 0.0 to 1.0, representing a color used for the following purposes:
The background of the annotation’s icon when closed.
The title bar of the annotation’s popup window.
The border of a link annotation.
The number of array elements determines the color space in which the color shall be defined: 0 is no color or transparent; 1 is grayscale; 3 is RGB; and 4 is CMYK.
- contents: str¶
The text contents that shall be displayed when the annotation is open or, if this annotation kind does not display text, an alternate description of the annotation’s contents.
- property destination: PdfName | PdfHexString | bytes | Destination | None¶
The destination that shall be displayed when the link annotation is triggered.
- flags: AnnotationFlags¶
Flags specifying various characteristics of the annotation.
- highlight_mode: Literal['N', 'I', 'O', 'P']¶
The annotation’s highlight mode. May be one of the following:
N: No highlight.
I: Invert the contents of the annotation rectangle (default).
O: Invert the annotation’s border/outline.
P: Display the annotation as if it were being pushed below the surface of the page.
- kind: Literal['Text', 'Link', 'FreeText', 'Line', 'Square', 'Circle', 'Polygon', 'PolyLine', 'Highlight', 'Underline', 'Squiggly', 'StrikeOut', 'Caret', 'Stamp', 'Ink', 'Popup', 'FileAttachment', 'Sound', 'Movie', 'Screen', 'Widget', 'PrinterMark', 'TrapNet', 'Watermark', '3D', 'Redact', 'Projection', 'RichMedia']¶
The kind of annotation. See ISO 32000-2:2020 “Table 171 - Annotation types” for details.
- language: str | None¶
(PDF 2.0) A language identifier specifying the natural language for all text in the annotation except where overridden by other explicit language specifications.
See ISO 32000-2:2020 § 14.9.2 “Natural language specification” for details.
- last_modified: str | None¶
The date and time the annotation was most recently modified. This value should be a PDF date string but PDF processors are expected to accept and display a string in any format.
- class pdfnaut.objects.annotations.MarkupAnnotation[source]¶
Bases: Annotation
A markup annotation is a type of annotation used primarily to mark up PDF documents.
See ISO 32000-2:2020 § 12.5.6.2 “Markup annotations” for details.
- __init__(kind: Literal['Text', 'Link', 'FreeText', 'Line', 'Square', 'Circle', 'Polygon', 'PolyLine', 'Highlight', 'Underline', 'Squiggly', 'StrikeOut', 'Caret', 'Stamp', 'Ink', 'Popup', 'FileAttachment', 'Sound', 'Movie', 'Screen', 'Widget', 'PrinterMark', 'TrapNet', 'Watermark', '3D', 'Redact', 'Projection', 'RichMedia'], rect: Iterable[float], contents: str, name: str, *, indirect_ref: PdfReference | None = None) None[source]¶
- color: PdfArray[float] | None¶
An array of 0 to 4 numbers in the range 0.0 to 1.0, representing a color used for the following purposes:
The background of the annotation’s icon when closed.
The title bar of the annotation’s popup window.
The border of a link annotation.
The number of array elements determines the color space in which the color shall be defined: 0 is no color or transparent; 1 is grayscale; 3 is RGB; and 4 is CMYK.
- contents: str¶
The text contents that shall be displayed when the annotation is open or, if this annotation kind does not display text, an alternate description of the annotation’s contents.
- flags: AnnotationFlags¶
Flags specifying various characteristics of the annotation.
- property in_reply_to: Annotation | None¶
The annotation that this annotation is in reply to.
- kind: Literal['Text', 'Link', 'FreeText', 'Line', 'Square', 'Circle', 'Polygon', 'PolyLine', 'Highlight', 'Underline', 'Squiggly', 'StrikeOut', 'Caret', 'Stamp', 'Ink', 'Popup', 'FileAttachment', 'Sound', 'Movie', 'Screen', 'Widget', 'PrinterMark', 'TrapNet', 'Watermark', '3D', 'Redact', 'Projection', 'RichMedia']¶
The kind of annotation. See ISO 32000-2:2020 “Table 171 - Annotation types” for details.
- language: str | None¶
(PDF 2.0) A language identifier specifying the natural language for all text in the annotation except where overridden by other explicit language specifications.
See ISO 32000-2:2020 § 14.9.2 “Natural language specification” for details.
- last_modified: str | None¶
The date and time the annotation was most recently modified. This value should be a PDF date string but PDF processors are expected to accept and display a string in any format.
- property reply_type: AnnotationReplyType | str | None¶
The relationship or reply type between this annotation and the one in in_reply_to.
- class pdfnaut.objects.annotations.TextAnnotation[source]¶
Bases: MarkupAnnotation
A text annotation represents a sticky note attached to a point in the PDF document. When closed, it shall appear as an icon (defined by TextAnnotation.icon); when open, it shall display a popup window containing the text of the note.
See ISO 32000-2:2020 § 12.5.6.4 “Text annotations” for details.
- __init__(rect: Iterable[float], contents: str, name: str, is_open: bool = False, icon: str = 'Note', *, indirect_ref: PdfReference | None = None) None[source]¶
- color: PdfArray[float] | None¶
An array of 0 to 4 numbers in the range 0.0 to 1.0, representing a color used for the following purposes:
The background of the annotation’s icon when closed.
The title bar of the annotation’s popup window.
The border of a link annotation.
The number of array elements determines the color space in which the color shall be defined: 0 is no color or transparent; 1 is grayscale; 3 is RGB; and 4 is CMYK.
- contents: str¶
The text contents that shall be displayed when the annotation is open or, if this annotation kind does not display text, an alternate description of the annotation’s contents.
- flags: AnnotationFlags¶
Flags specifying various characteristics of the annotation.
- icon: Annotated[str, 'name']¶
The name of an icon that shall be used when displaying the annotation.
The icon name may be any of the following standard names or any other supported value.
Standard names: Comment, Key, Note, Help, NewParagraph, Paragraph, and Insert.
- kind: Literal['Text', 'Link', 'FreeText', 'Line', 'Square', 'Circle', 'Polygon', 'PolyLine', 'Highlight', 'Underline', 'Squiggly', 'StrikeOut', 'Caret', 'Stamp', 'Ink', 'Popup', 'FileAttachment', 'Sound', 'Movie', 'Screen', 'Widget', 'PrinterMark', 'TrapNet', 'Watermark', '3D', 'Redact', 'Projection', 'RichMedia']¶
The kind of annotation. See ISO 32000-2:2020 “Table 171 - Annotation types” for details.
- language: str | None¶
(PDF 2.0) A language identifier specifying the natural language for all text in the annotation except where overridden by other explicit language specifications.
See ISO 32000-2:2020 § 14.9.2 “Natural language specification” for details.
- pdfnaut.objects.annotations.annotation_into(annot: PdfDictionary, *, indirect_ref: PdfReference | None = None) Annotation[source]¶
Converts a mapping annot into an instance of Annotation or one of its subclasses according to the annotation subtype.
Catalog Objects¶
- class pdfnaut.objects.catalog.DeveloperExtension[source]¶
Bases: PdfDictionary
An entry in an extension dictionary.
See ISO 32000-2:2020 § 7.12.3 “Developer extensions dictionary” for details.
- __init__(base_version, level, url=None, revision=None)¶
- base_version: Annotated[str, 'name']¶
The PDF version to which this extension applies. This value shall be consistent with the syntax used for the Version entry of the document catalog dictionary.
- class pdfnaut.objects.catalog.ExtensionMap[source]¶
Bases: PdfDictionary
A map defining developer extensions in a document.
See ISO 32000-2:2020 § 7.12 “Extensions dictionary” for details.
- query(key: str) DeveloperExtension | list[DeveloperExtension][source]¶
Returns a developer-defined extension (or a sequence of them) for a base prefix key.
- class pdfnaut.objects.catalog.MarkInfo[source]¶
Bases: PdfDictionary
Information relevant to specialized uses of structured PDF documents.
See ISO 32000-2:2020 § 14.7 “Logical structure” for details.
- __init__(marked=False, suspects=False, user_properties=False)¶
- class pdfnaut.objects.catalog.UserAccessPermissions[source]¶
Bases: IntFlag
User access permissions as specified in the P entry of the document’s standard encryption dictionary.
See ISO 32000-2:2020 “Table 22 - Standard security handler user access permissions” for details.
- ACCESSIBILITY = 512¶
(deprecated in PDF 2.0) Extract content for the purposes of accessibility.
This bit should always be set for compatibility with processors supporting earlier specifications.
- ASSEMBLE_DOCUMENT = 1024¶
For security revision 3 or greater, assemble the document (i.e. insert, rotate, and delete pages, create outlines, etc.), even if MODIFY is clear.
- COPY_CONTENT = 16¶
Copy or extract text and graphics. Assistive technology should assume this bit as set for its purposes, as per ACCESSIBILITY.
- FAITHFUL_PRINT = 2048¶
For security revision 3 or greater, print the document in such a way that a faithful digital representation of the PDF can be generated.
If this bit is not set (and PRINT is set), printing shall be limited to a low-level representation, possibly of lower quality.
- FILL_FORM_FIELDS = 256¶
For security revision 3 or greater, fill existing interactive form fields, even if MANAGE_ANNOTATIONS is clear.
- MANAGE_ANNOTATIONS = 32¶
Add or modify text annotations, fill interactive form fields and, depending on whether MODIFY is set, create and modify form fields.
- MODIFY = 8¶
Modify the contents of the document. May be influenced by MANAGE_ANNOTATIONS, FILL_FORM_FIELDS, and ASSEMBLE_DOCUMENT.
- PRINT = 4¶
For security revision 2 or greater, print the document. If the document uses revision 3 or greater, print quality may be influenced by FAITHFUL_PRINT.
- __new__(value)¶
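To make the bit values concrete, the sketch below decodes a raw P entry into permission names. It uses a stand-in class that copies the values documented above instead of importing pdfnaut, and the helper `granted` is hypothetical. It also assumes that P is written as a signed integer that should be read as 32 unsigned bits, which is how the standard security handler describes it.

```python
from enum import IntFlag


class UserAccessPermissions(IntFlag):
    """Stand-in copy of the documented bit values (not imported from pdfnaut)."""
    PRINT = 4
    MODIFY = 8
    COPY_CONTENT = 16
    MANAGE_ANNOTATIONS = 32
    FILL_FORM_FIELDS = 256
    ACCESSIBILITY = 512
    ASSEMBLE_DOCUMENT = 1024
    FAITHFUL_PRINT = 2048


def granted(p_value: int) -> list[str]:
    """List the names of the permissions set in a raw P entry."""
    # P is usually written as a negative integer; view it as 32 unsigned bits.
    unsigned = p_value & 0xFFFFFFFF
    return [flag.name for flag in UserAccessPermissions if unsigned & int(flag)]


print(granted(-3904))  # []
print(granted(-3900))  # ['PRINT']
```

A value of -4 sets every bit from PRINT upward, so `granted(-4)` lists all eight documented permissions.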
- class pdfnaut.objects.catalog.ViewerPreferences[source]¶
Bases: PdfDictionary
The viewer preferences dictionary specifying the way a PDF viewer shall display a document on the screen.
See ISO 32000-2:2020 § 12.2 “Viewer preferences” for details.
- __init__(hide_toolbar=False, hide_menubar=False, hide_window_ui=False, fit_window=False, center_window=False, display_doc_title=False, non_full_screen_page_mode='UseNone', direction='L2R', view_area='CropBox', view_clip='CropBox', print_area='CropBox', print_clip='CropBox', print_scaling='AppDefault', duplex=None, pick_tray_by_pdf_size=None, print_page_range=None, num_copies=None)¶
- direction: Literal['L2R', 'R2L']¶
The predominant logical content order for text. Either ‘L2R’ (left to right, default) or ‘R2L’ (right to left). This is effectively a display hint and has no direct effect on the contents of the document.
- display_doc_title: bool¶
(PDF 1.4) Whether the document’s window title should display the title described in the document’s metadata. If False, the title bar should instead display the name of the PDF file containing the document.
- duplex: Literal['Simplex', 'DuplexFlipShortEdge', 'DuplexFlipLongEdge'] | None¶
The paper handling option to use when printing the document. Should be one of:
Simplex: Print single-sided
DuplexFlipShortEdge: Duplex, flip on the short edge of the sheet
DuplexFlipLongEdge: Duplex, flip on the long edge of the sheet
If this value is none, the document producer may choose their own default setting.
- property enforce: list[Literal['PrintScaling']] | None¶
(PDF 2.0) An array of names of viewer preferences that shall be enforced by PDF processors and that shall not be overridden by subsequent selections in the application user interface.
- hide_menubar: bool¶
Whether to hide the interactive PDF processor’s menubar when the document is active.
- hide_toolbar: bool¶
Whether to hide the interactive PDF processor’s toolbars when the document is active.
- hide_window_ui: bool¶
Whether to hide UI elements in the document’s window (such as scroll bars or navigation controls), leaving only the document’s contents displayed.
- non_full_screen_page_mode: Literal['UseNone', 'UseOutlines', 'UseThumbs', 'UseOC']¶
The document’s page mode displayed when exiting full-screen mode. This property is only relevant if the PageMode entry in the catalog is set to ‘FullScreen’ and should be ignored otherwise. Accepted values are ‘UseNone’, ‘UseOutlines’, ‘UseThumbs’, and ‘UseOC’.
- num_copies: int | None¶
The number of copies that shall be printed when the print dialog is opened for this file.
If this value is none, the document producer may choose their own default setting, though this setting is usually 1.
- pick_tray_by_pdf_size: bool | None¶
Whether the PDF page size shall be used to select the input paper tray. This setting influences only the preset values used to populate the print dialog. This setting has no effect on systems that do not provide the ability to pick the input tray by size.
If this value is none, the document producer may choose their own default setting.
- print_area: Literal['MediaBox', 'CropBox', 'BleedBox', 'TrimBox', 'ArtBox']¶
(deprecated in PDF 2.0) The name of the page boundary representing the area of a page that shall be rendered when printing the document. Similar to ViewArea, the value should be the key of the relevant page boundary in a page object.
- print_clip: Literal['MediaBox', 'CropBox', 'BleedBox', 'TrimBox', 'ArtBox']¶
(deprecated in PDF 2.0) The name of the page boundary to which the contents of a page shall be clipped when printing the document. Similar to ViewArea, the value should be the key of the relevant page boundary in a page object.
- print_page_range: PdfArray[int] | None¶
The page numbers used to initialize the print dialog box. The array should contain an even number of values interpreted as pairs, with each pair specifying the first and last pages in a sub-range of pages to be printed (the first page being denoted by the number 1).
If this value is none, the document producer may choose their own default setting.
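The pair-wise layout of print_page_range can be illustrated with a small helper that expands the array into individual page numbers. The function name `expand_page_ranges` is hypothetical and operates on a plain list, not on a pdfnaut object.

```python
def expand_page_ranges(values: list[int]) -> list[int]:
    """Expand [first, last, first, last, ...] into 1-based page numbers."""
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values")

    pages: list[int] = []
    # Walk the array two items at a time: (first page, last page).
    for first, last in zip(values[0::2], values[1::2]):
        if first < 1 or last < first:
            raise ValueError(f"invalid sub-range: {first}-{last}")
        pages.extend(range(first, last + 1))
    return pages


print(expand_page_ranges([1, 3, 7, 7]))  # [1, 2, 3, 7]
```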
- print_scaling: Literal['None', 'AppDefault']¶
The page scaling option to select when a print dialog is displayed for this document.
Accepted values are ‘None’ meaning no page scaling or ‘AppDefault’ (default) indicating that the interactive PDF processor should select its default print scaling value.
- view_area: Literal['MediaBox', 'CropBox', 'BleedBox', 'TrimBox', 'ArtBox']¶
(deprecated in PDF 2.0) The name of the page boundary representing the area of a page that shall be displayed when viewing the document on the screen. The value should be the key of the relevant page boundary in a page object. If no such boundary is defined, the default value (‘CropBox’) is used.
Accepted values are ‘CropBox’, ‘MediaBox’, ‘BleedBox’, ‘TrimBox’, and ‘ArtBox’.
- view_clip: Literal['MediaBox', 'CropBox', 'BleedBox', 'TrimBox', 'ArtBox']¶
(deprecated in PDF 2.0) The name of the page boundary to which the contents of a page shall be clipped when viewing the document. Similar to ViewArea, the value should be the key of the relevant page boundary in a page object.
Outline Objects¶
- class pdfnaut.objects.outlines.OutlineItem[source]¶
Bases: PdfDictionary
An outline item within the outline tree.
See ISO 32000-2:2020 “Table 151 - Entries in an outline item dictionary” for details.
- __init__(text: str, flags: OutlineItemFlags = OutlineItemFlags.NULL, destination: PdfName | PdfHexString | bytes | Destination | None = None, action: Action | None = None, color: PdfArray[int | float] | None = None, *, pdf: PdfParser | None = None, indirect_ref: PdfReference | None = None) None[source]¶
- property children: OutlineList¶
The immediate children of the outline item.
- close() None[source]¶
If the item has children, closes the outline item and hides the immediate children.
- property color: PdfArray[int | float]¶
The color that shall be used for the outline item text, as an array of RGB color components in the range 0 to 1.
- property destination: PdfName | PdfHexString | bytes | Destination | None¶
The destination that shall be displayed when the item is activated, either a named destination (a name or byte string) or an explicit destination (a Destination object).
- property first: OutlineItem | None¶
The first child item of the outline if any.
- flags: OutlineItemFlags¶
A set of bit flags describing characteristics of the outline item text.
- property last: OutlineItem | None¶
The last child item of the outline if any.
- property next: OutlineItem | None¶
The next item at the current outline level if any.
- open() None[source]¶
If the item has children, opens the outline item and displays the immediate children (and its descendants if they are also visible).
- property parent: OutlineItem | OutlineTree¶
The parent outline item or tree containing this outline.
- property previous: OutlineItem | None¶
The previous item at the current outline level if any.
- class pdfnaut.objects.outlines.OutlineItemFlags[source]¶
Bases: IntFlag
Flags specifying style characteristics for an outline item. See ISO 32000-2:2020 “Table 152 - Outline item flags” for details.
- BOLD = 2¶
Display the outline item text in bold.
- ITALIC = 1¶
Display the outline item text in italic.
- NULL = 0¶
A default value meaning that no flags are set.
- __new__(value)¶
- class pdfnaut.objects.outlines.OutlineList[source]¶
Bases: MutableSequence[OutlineItem]
The outline list representing the children of an outline tree or item.
Warning
This class is not designed to be constructed by a user. Using the outline list should be done via OutlineTree and OutlineItem.
- __init__(pdf: PdfParser, parent: OutlineItem | OutlineTree) None[source]¶
- append(value: OutlineItem) None[source]¶
Appends an outline item value to the immediate children of the list.
- count(value: Any) int[source]¶
Returns the number of times outline item value appears in the outline list.
- extend(values: Iterable[OutlineItem]) None[source]¶
Appends a list of outline items values to the end of the outline list.
- index(value: Any, start: int = 0, stop: int = sys.maxsize) int[source]¶
Returns the index at which outline item value was first found in the range from start (inclusive) to stop (exclusive).
- insert(index: int, value: OutlineItem) None[source]¶
Inserts an outline item value before index.
- pop(index: int = -1) OutlineItem[source]¶
Removes the outline item at index from the immediate children of this outline list.
- Raises:
IndexError – The outline list is empty or the item is not in the list.
- Returns:
The outline item that was popped.
- Return type:
OutlineItem
- remove(value: OutlineItem) None[source]¶
Removes the first occurrence of outline item value in the immediate children of this tree.
- Raises:
IndexError – The outline list is empty or the item is not in the list.
- class pdfnaut.objects.outlines.OutlineTree[source]¶
Bases: PdfDictionary
The document outline tree containing a hierarchy of outline items that allow navigating throughout the document.
See ISO 32000-2:2020 § 12.3.3 “Document outline” for details.
Warning
This class is not designed to be constructed by a user. To add an outline tree to a document, PdfDocument.new_outline() should be used.
- __init__(pdf: PdfParser, tree: PdfDictionary, tree_ref: PdfReference) None[source]¶
- property children: OutlineList¶
The immediate children of the outline tree.
- property first: OutlineItem | None¶
The first outline item in the tree.
- property last: OutlineItem | None¶
The last outline item in the tree.
- pdfnaut.objects.outlines.flatten_outlines(item: OutlineItem | OutlineTree) Generator[OutlineItem, None, None][source]¶
Yields the immediate children of the outline item.
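The outline classes above expose first and next links, and immediate children can be reached by starting at first and following next until it runs out. The sketch below shows that traversal on a hypothetical `Node` dataclass standing in for an outline item. It is an illustration of the linked-list layout under that assumption, not the implementation of flatten_outlines.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """Hypothetical stand-in for an outline item; only the links matter here."""
    text: str
    first: Optional["Node"] = None  # first child, if any
    next: Optional["Node"] = None   # next sibling at the same level, if any


def immediate_children(item: Node) -> Iterator[Node]:
    """Yield the direct children of `item` by following first/next links."""
    child = item.first
    while child is not None:
        yield child
        child = child.next


# Build: root -> (a -> (a1), b)
a1 = Node("a1")
b = Node("b")
a = Node("a", first=a1, next=b)
root = Node("root", first=a)

print([n.text for n in immediate_children(root)])  # ['a', 'b']
```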
- pdfnaut.objects.outlines.get_count(item: OutlineTree | OutlineItem) int[source]¶
Calculates the count of visible items within an outline item or tree.
- pdfnaut.objects.outlines.is_outline_tree(item: PdfDictionary) bool[source]¶
Reports whether a dictionary item is an outline tree.
- pdfnaut.objects.outlines.update_ancestor_count(item: OutlineTree | OutlineItem) None[source]¶
Recalculates the visible item count for the outline item, reflecting this count in the ancestors.
Page Objects¶
- class pdfnaut.objects.page.Page[source]¶
Bases: PdfDictionary
A page in a PDF document (see ISO 32000-2:2020 § 7.7.3.3 “Page objects”).
- Parameters:
size (tuple[float, float]) – The width and height of the physical medium in which the page should be printed or displayed. Values shall be provided in multiples of 1/72 of an inch (points).
pdf (PdfParser, optional) –
The PDF document that this page belongs to.
In typical usage, this value need not be specified. pdfnaut will take care of populating it.
indirect_ref (PdfReference, optional) –
The indirect reference that this page object is referred to by.
As with pdf, this value need not be specified in typical usage.
- __init__(size: tuple[float, float], *, pdf: PdfParser | None = None, indirect_ref: PdfReference | None = None) None[source]¶
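Since size is expressed in points (1/72 of an inch), physical paper dimensions need converting first. The helper below is a hypothetical convenience, not part of pdfnaut, that turns millimetres into points and computes an A4 size.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points (1/72 of an inch)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


# A4 paper is 210 mm x 297 mm.
a4_size = (round(mm_to_points(210), 2), round(mm_to_points(297), 2))
print(a4_size)  # (595.28, 841.89)
```

A tuple like `a4_size` has the (width, height) shape that the size parameter expects.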
- property annotations: AnnotationList | None¶
All annotations associated with this page. If a page does not specify a list of annotations, this field is none.
- artbox: PdfArray[float] | None¶
A rectangle defining the extent of the page’s meaningful content as intended by the page’s creator.
If none, the artbox is the same as the cropbox.
- bleedbox: PdfArray[float] | None¶
A rectangle defining the region to which the contents of the page shall be clipped when output in a production environment.
If none, the bleedbox is the same as the cropbox.
- property content_stream: ContentStreamTokenizer | None¶
An iterator over the instructions producing the contents of this page.
- cropbox: PdfArray[float] | None¶
A rectangle defining the visible region of the page.
If none, the cropbox is the same as the mediabox.
- mediabox: PdfArray[float]¶
A rectangle defining the boundaries of the physical medium in which the page should be printed or displayed.
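The fallbacks described for the page boxes (artbox and bleedbox default to the cropbox, which in turn defaults to the mediabox) can be sketched as a standalone function. `effective_boxes` is a hypothetical helper working on plain sequences; it does not reflect how pdfnaut resolves these attributes internally.

```python
from typing import Optional, Sequence

Box = Sequence[float]


def effective_boxes(
    mediabox: Box,
    cropbox: Optional[Box] = None,
    bleedbox: Optional[Box] = None,
    artbox: Optional[Box] = None,
) -> dict[str, Box]:
    """Resolve missing boxes using the documented fallbacks."""
    # The cropbox falls back to the mediabox.
    crop = cropbox if cropbox is not None else mediabox
    return {
        "mediabox": mediabox,
        "cropbox": crop,
        # The bleedbox and artbox both fall back to the (resolved) cropbox.
        "bleedbox": bleedbox if bleedbox is not None else crop,
        "artbox": artbox if artbox is not None else crop,
    }


boxes = effective_boxes([0, 0, 595, 842], bleedbox=[10, 10, 585, 832])
print(boxes["artbox"])  # [0, 0, 595, 842]
```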
- metadata: PdfStream | None¶
A metadata stream, generally written in XMP, containing information about this page.
- resources: PdfDictionary | None¶
Resources required by the page contents.
If the page requires no resources, this should return an empty resource dictionary. If the page inherits its resources from an ancestor, this should return None.
- rotation: int¶
The number of degrees by which the page shall be visually rotated clockwise. The value is a multiple of 90 (by default, 0).
- tab_order: Literal['R', 'C', 'S', 'A', 'W'] | None¶
(optional; PDF 1.5) The tab order to be used for annotations on the page. If present, it shall be one of the following values:
R: Row order
C: Column order
S: Logical structure order
A: Annotations array order (PDF 2.0)
W: Widget order (PDF 2.0)
Trailer Objects¶
- class pdfnaut.objects.trailer.Info[source]¶
Bases:
Bases: PdfDictionary
Document-level metadata representing the structure described in ISO 32000-2:2020 § 14.3.3 “Document information dictionary”.
Since PDF 2.0, most of the attributes here have been deprecated in favor of their equivalents in the document-level metadata stream (see PdfDocument.xmp_info), with the exception of Info.creation_date and Info.modify_date.
- __init__(title=None, author=None, subject=None, keywords=None, creator=None, producer=None, creation_date=None, modify_date=None, trapped=None)¶
- creator: str | None¶
If the document was converted to PDF from another format (e.g. DOCX), the name of the PDF processor that created the original document from which it was converted (e.g. Microsoft Word).
- modify_date: datetime | None¶
The date and time the document was most recently modified, in human-readable form.
- modify_date_raw: str | None¶
The date and time the document was most recently modified, as a text string.
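To illustrate how a raw date string relates to a datetime value, the following sketch parses the PDF date format (`D:YYYYMMDDHHmmSSOHH'mm'`, where every field after the year is optional). It is an assumption-laden illustration using only the standard library, not pdfnaut's own parser; the function name `parse_pdf_date` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Every field after the year is optional; missing month/day default to 1
# and missing time fields default to 0.
_PDF_DATE = re.compile(
    r"^(?:D:)?(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<oh>\d{2})'?(?P<om>\d{2})?'?)?)?$"
)


def parse_pdf_date(raw: str) -> datetime:
    """Parse a PDF date string into a datetime (hypothetical helper)."""
    match = _PDF_DATE.match(raw.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {raw!r}")
    parts = match.groupdict()

    # No offset marker means the relationship to UTC is unknown (naive datetime).
    tz = None
    sign = parts["sign"]
    if sign == "Z":
        tz = timezone.utc
    elif sign in ("+", "-"):
        offset = timedelta(hours=int(parts["oh"] or 0), minutes=int(parts["om"] or 0))
        tz = timezone(-offset if sign == "-" else offset)

    return datetime(
        int(parts["year"]),
        int(parts["month"] or 1),
        int(parts["day"] or 1),
        int(parts["hour"] or 0),
        int(parts["minute"] or 0),
        int(parts["second"] or 0),
        tzinfo=tz,
    )
```

For instance, `parse_pdf_date("D:2024")` yields midnight on 1 January 2024 with no timezone attached, because all omitted fields take their defaults.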
XMP Objects¶
- class pdfnaut.objects.xmp.XMPDateProperty[source]¶
Bases:
XMPProperty
An XMP Date property: an ISO 8601 date string, specifically the subset specified in https://www.w3.org/TR/NOTE-datetime.
See https://developer.adobe.com/xmp/docs/XMPNamespaces/XMPDataTypes/#date.
- class pdfnaut.objects.xmp.XMPLangAltProperty[source]¶
Bases:
XMPProperty
An XMP Language Alternative property: an alternative array of simple text items facilitating the selection of a text item based on a desired language.
In this case, this array is represented as a mapping of language names to text items corresponding to each language. The language name should be a value as defined in RFC 3066, composed of a primary language subtag and an optional series of subsequent subtags.
The default value, if known, should be the first item in the dictionary. A default value may also be explicitly marked by setting its language to ‘x-default’.
See https://developer.adobe.com/xmp/docs/XMPNamespaces/XMPDataTypes/#language-alternative.
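The selection rule described above (prefer the requested language, otherwise an explicit ‘x-default’ entry, otherwise the first item) can be sketched over a plain dictionary. This is only an illustration of the idea; `pick_lang_alt` is a hypothetical helper, not part of pdfnaut, and the primary-subtag fallback is an assumption about what a reasonable lookup might do.

```python
from typing import Mapping, Optional


def pick_lang_alt(items: Mapping[str, str], language: Optional[str] = None) -> Optional[str]:
    """Choose a text item from a language-alternative mapping (hypothetical helper)."""
    if not items:
        return None

    if language is not None:
        wanted = language.lower()
        # 1. Exact, case-insensitive match on the full language tag.
        for tag, text in items.items():
            if tag.lower() == wanted:
                return text
        # 2. Assumed fallback: match on the primary subtag ("es-MX" -> "es").
        primary = wanted.split("-", 1)[0]
        for tag, text in items.items():
            if tag.lower().split("-", 1)[0] == primary:
                return text

    # 3. An explicitly marked default wins over positional order.
    if "x-default" in items:
        return items["x-default"]
    # 4. Otherwise the first item is treated as the default.
    return next(iter(items.values()))
```

Because Python dictionaries preserve insertion order, "the first item in the dictionary" is well defined here.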
- class pdfnaut.objects.xmp.XMPListProperty[source]¶
Bases:
XMPProperty
An array-valued XMP property: in this context, either an RDF sequence, used for ordered arrays, or an RDF bag, used for unordered arrays.
See § 7.7 “Array valued XMP properties” in Part 1 of the XMP specification.
- class pdfnaut.objects.xmp.XMPProperty[source]¶
Bases:
object
An XMP property included in an XMP packet.
- extra¶
Any additional property-specific values.
- local_name¶
The local name of this property.
- namespace_uri¶
The namespace URI of this property.
- class pdfnaut.objects.xmp.XMPTextProperty[source]¶
Bases:
XMPProperty
An XMP Text property: a possibly empty Unicode string.
- class pdfnaut.objects.xmp.XmpMetadata[source]¶
Bases:
object
An object representing Extensible Metadata Platform (XMP) metadata, either pertaining to an entire document or to a particular resource.
For information about XMP, see https://developer.adobe.com/xmp/docs/.
- Parameters:
stream (PdfStream, optional) – The XMP packet to parse as a PDF stream. If stream is None, a new stream containing a packet will be created.
- Raises:
PdfParseError – If stream does not contain a valid XMP packet.
- dc_creator¶
The entities primarily responsible for creating this resource.
- dc_description¶
Textual descriptions of this resource as a mapping of language names to items.
- dc_format¶
The MIME type of this resource.
- dc_rights¶
Rights statements pertaining to this resource.
- dc_subject¶
The topics or descriptions specifying the content of this resource.
- dc_title¶
The titles or names given to this resource as a mapping of language names to titles.
- packet¶
The XMP packet as an XML document.
- pdf_keywords¶
Keywords associated with the document.
- pdf_pdfversion¶
The PDF file version. For example, ‘1.0’ or ‘1.3’.
- pdf_producer¶
The name of the tool that produced this PDF document.
- pdf_trapped¶
Whether the document has been modified to include trapping information (see § 14.11.6, “Trapping support”).
- rdf_root¶
The RDF root of the packet being parsed.
- xmp_create_date¶
The datetime this resource was created. This need not match the file system creation date.
- xmp_creator_tool¶
The name of the first known tool that created this resource.
- xmp_metadata_date¶
The datetime this metadata was last modified. It should be the same as or more recent than modify_date.
- xmp_modify_date¶
The datetime this resource was last modified.
- pdfnaut.objects.xmp.get_full_text(element: Element) str[source]¶
Returns the full text content within element.
- pdfnaut.objects.xmp.lookup_prefix_for_ns(node: Node, namespace: str) tuple[str, Node] | None[source]¶
Locates a namespace prefix matching the namespace URI in node. Returns either a tuple of two items containing, in order, the prefix of the namespace URI and the node where it was found, or None if no prefix is registered for the namespace URI.
This is an implementation of https://dom.spec.whatwg.org/#locate-a-namespace-prefix.
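For readers unfamiliar with the WHATWG "locate a namespace prefix" steps, here is a minimal sketch of the same idea over the standard library's xml.dom.minidom. It is not pdfnaut's code: the name `locate_prefix` is hypothetical, and the use of minidom nodes is an assumption made for illustration.

```python
from typing import Optional, Tuple
from xml.dom import Node, minidom


def locate_prefix(node: Node, namespace: str) -> Optional[Tuple[str, Node]]:
    """Walk up from node looking for a prefix bound to namespace (hypothetical helper)."""
    current = node
    while current is not None and current.nodeType == Node.ELEMENT_NODE:
        # Step 1: the element itself is in the namespace and carries a prefix.
        if current.namespaceURI == namespace and current.prefix:
            return current.prefix, current
        # Step 2: the element declares the namespace via an xmlns:* attribute.
        attributes = current.attributes
        for index in range(attributes.length):
            attr = attributes.item(index)
            if attr.prefix == "xmlns" and attr.value == namespace:
                return attr.localName, current
        # Step 3: otherwise repeat on the parent element, if there is one.
        current = current.parentNode
    # Step 4: no prefix is registered for this namespace URI.
    return None


packet = minidom.parseString(
    '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<rdf:Description/></rdf:RDF></x:xmpmeta>"
)
description = packet.getElementsByTagName("rdf:Description")[0]
found = locate_prefix(description, "http://purl.org/dc/elements/1.1/")
```

Here `found` holds the prefix together with the element on which the declaration was located, mirroring the two-item tuple described above.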