Common Objects Reference

The common objects module is a collection of utilities used within pdfnaut.

Dictionary Models

class pdfnaut.common.dictmodels.Field

Bases: object

__init__(key: str | None = None, default: Any = MISSING, default_factory: Callable[[], Any] | _MISSING_TYPE = MISSING, encoder: Callable[[Any], Any] | None = None, decoder: Callable[[Any], Any] | None = None, init: bool | None = None, repr_: bool | None = None, metadata: dict[str, Any] | None = None) → None

property key: str

pdfnaut.common.dictmodels.build_repr(cls: type[T], repr_accessors: list[Accessor])

pdfnaut.common.dictmodels.create_accessors(cls, *, parent_init: bool = True, parent_repr: bool = True) → list[Accessor]

pdfnaut.common.dictmodels.defaultize(cls: type[T]) → T

Returns an instance of a dictmodel cls initialized with default accessor values.

pdfnaut.common.dictmodels.dictmodel(_cls: type[_T] | None = None, *, init: bool = True, repr_: bool = True)

pdfnaut.common.dictmodels.field(key: str | None = None, default: Any = MISSING, default_factory: Callable[[], Any] | _MISSING_TYPE = MISSING, encoder: Callable[[Any], Any] | None = None, decoder: Callable[[Any], Any] | None = None, init: bool | None = None, repr_: bool | None = None, metadata: dict[str, Any] | None = None) → Any

Defines a field in a dictmodel.

Parameters:
  • key (str, optional) – The name of the key that will be accessed by this field. If not specified, the key will be the title-cased version of the field name.

  • default (Any, optional) – The default value of the field if it is not specified. If no default is specified, the field is assumed to be required.

  • default_factory (Callable[[], Any], optional) – A callable that takes no arguments and produces the default value of the field. This can be used to specify default mutable values.

  • encoder (Callable[[Any], Any], optional) – A callable that takes one argument and transforms the value that will be set for the field in the underlying dictionary.

  • decoder (Callable[[Any], Any], optional) – A callable that takes one argument and transforms the value returned when getting the field from the underlying dictionary.

  • init (bool | None, optional) – Whether this field will appear as part of the class constructor. If not specified, it defaults to the value of the init argument in the dictmodel.

  • repr_ (bool | None, optional) – Whether this field will appear as part of the class representation. If not specified, it defaults to the value of the repr_ argument in the dictmodel.

  • metadata (dict[str, Any], optional) – Additional metadata for this field which may be used by the accessor.

Note

default and default_factory are mutually exclusive. If both are specified, default_factory takes precedence.

The encoder and decoder arguments must both be specified if used. They are only honored if the field type is not already handled by an accessor; otherwise, they are ignored.
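To make the interplay of key, default, default_factory, encoder and decoder more concrete, the following is a minimal standalone sketch of a dictionary-backed field built on a plain Python descriptor. It is not pdfnaut's implementation: SketchField and Info are hypothetical names, the title-casing rule is an assumption, and the precedence of default_factory over default simply mirrors the note above.

```python
_MISSING = object()  # sentinel standing in for "no value supplied"


class SketchField:
    """Hypothetical descriptor mapping an attribute to a key in a backing dict."""

    def __init__(self, key=None, default=_MISSING, default_factory=_MISSING,
                 encoder=None, decoder=None):
        self.key = key
        self.default = default
        self.default_factory = default_factory
        self.encoder = encoder
        self.decoder = decoder

    def __set_name__(self, owner, name):
        # Assumed title-casing rule: "page_count" becomes "PageCount".
        if self.key is None:
            self.key = "".join(part.capitalize() for part in name.split("_"))

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.key not in instance.data:
            # default_factory is checked first, mirroring the precedence note.
            if self.default_factory is not _MISSING:
                return self.default_factory()
            if self.default is not _MISSING:
                return self.default
            raise AttributeError(self.key)
        raw = instance.data[self.key]
        return self.decoder(raw) if self.decoder else raw

    def __set__(self, instance, value):
        instance.data[self.key] = self.encoder(value) if self.encoder else value


class Info:
    title = SketchField(default=None)
    page_count = SketchField(encoder=str, decoder=int)
    keywords = SketchField(key="Tags", default_factory=list)

    def __init__(self, data=None):
        self.data = {} if data is None else data


info = Info()
info.page_count = 12  # stored encoded under the derived key "PageCount"
```

In this sketch the encoder runs on assignment and the decoder on access, so the backing dictionary holds the encoded form while callers only see the decoded one.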

pdfnaut.common.dictmodels.snake_to_title_case(value: str) → str

Accessors

class pdfnaut.common.accessors.Accessor

Bases: Protocol

__init__(*args, **kwargs)

field: Field

class pdfnaut.common.accessors.DateAccessor

Bases: object

An accessor defining a key whose value is a date (see ISO 32000-2:2020 § 7.9.4 “Dates”).

__init__(field: Field) → None

class pdfnaut.common.accessors.ModelAccessor

Bases: object

An accessor defining a key whose value is a dictionary represented by a dictmodel.

__init__(field: Field) → None

class pdfnaut.common.accessors.NameAccessor

Bases: object

An accessor defining a key whose value may be any of a set of names.

__init__(field: Field) → None

class pdfnaut.common.accessors.StandardAccessor

Bases: object

An accessor defining a key whose value is a type that does not require a complex mapping such as booleans, numbers, and certain name objects.

Text strings and dates have special handling and are better served by the TextStringAccessor and DateAccessor classes respectively.

__init__(field: Field) → None

class pdfnaut.common.accessors.TextStringAccessor

Bases: object

An accessor defining a key whose value is a text string.

See ISO 32000-2:2020 § 7.9.2.2 “Text string type” for details.

__init__(field: Field) → None

class pdfnaut.common.accessors.TransformAccessor

Bases: object

An accessor defining a key whose value is handled by user-provided encoder and decoder functions.

__init__(field: Field) → None

pdfnaut.common.accessors.lookup_accessor_by_field(field: Field) → tuple[type[Accessor], dict[str, Any]]

pdfnaut.common.accessors.lookup_accessor_by_type(value_type: type) → tuple[type[Accessor], dict[str, Any]]

Dates

Utilities for parsing and encoding date formats: ISO 8601 and ISO 8824.

pdfnaut.common.dates.encode_iso8601(date: datetime, *, full: bool = True) → str

Encodes a datetime.datetime object into a date string conforming to the ISO 8601 profile specified in https://www.w3.org/TR/NOTE-datetime.

If full is True, this function will encode all date and time values. Otherwise, the function will perform partial encoding, only including components that aren’t their default values.
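The difference between full and partial encoding can be illustrated with a small standalone sketch of the W3C NOTE-datetime profile. This is not pdfnaut's code: encode_w3c_datetime is a hypothetical name, and the sketch assumes that the default values are month 1, day 1 and midnight, and that naive datetimes are treated as UTC.

```python
from datetime import datetime, timedelta, timezone


def encode_w3c_datetime(date: datetime, *, full: bool = True) -> str:
    """Hypothetical encoder for the W3C NOTE-datetime profile of ISO 8601."""
    has_time = (date.hour, date.minute, date.second, date.microsecond) != (0, 0, 0, 0)

    if not full and not has_time:
        # Partial encoding: drop trailing date components left at their
        # assumed default of 1.
        if date.day != 1:
            return f"{date.year:04}-{date.month:02}-{date.day:02}"
        if date.month != 1:
            return f"{date.year:04}-{date.month:02}"
        return f"{date.year:04}"

    offset = date.utcoffset()
    if offset is None or offset == timedelta(0):
        tzd = "Z"  # assumption: naive datetimes are treated as UTC
    else:
        total_minutes = int(offset.total_seconds() // 60)
        sign = "+" if total_minutes >= 0 else "-"
        hours, minutes = divmod(abs(total_minutes), 60)
        tzd = f"{sign}{hours:02}:{minutes:02}"

    # Fractional seconds are left out of this sketch.
    return (f"{date.year:04}-{date.month:02}-{date.day:02}"
            f"T{date.hour:02}:{date.minute:02}:{date.second:02}{tzd}")
```

For example, under these assumptions datetime(2001, 7, 1) encodes partially as 2001-07, while a full encoding always carries the time and the time zone designator.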

pdfnaut.common.dates.encode_iso8824(date: datetime, *, full: bool = True) → str

Encodes a datetime.datetime object into an ISO 8824 date string suitable for storage in a PDF file.

If full is True, this function will encode all date and time values. Otherwise, the function will perform partial encoding, only including components that aren’t their default values.
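As a rough illustration of the target format, the sketch below produces a full PDF date string of the form D:YYYYMMDDHHmmSSOHH'mm using only the standard library. encode_pdf_date is a hypothetical helper rather than part of pdfnaut, it only covers the full encoding, and it assumes that naive datetimes are written as UTC.

```python
from datetime import datetime, timedelta, timezone


def encode_pdf_date(date: datetime) -> str:
    """Hypothetical full encoder for the PDF date format D:YYYYMMDDHHmmSSOHH'mm."""
    stamp = (f"D:{date.year:04}{date.month:02}{date.day:02}"
             f"{date.hour:02}{date.minute:02}{date.second:02}")

    offset = date.utcoffset()
    if offset is None or offset == timedelta(0):
        return stamp + "Z"  # assumption: naive datetimes are written as UTC

    total_minutes = int(offset.total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    # The offset hours and minutes are separated by an apostrophe.
    return f"{stamp}{sign}{hours:02}'{minutes:02}"
```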

pdfnaut.common.dates.has_date(date: datetime) → bool

Returns whether date has a date component, that is, whether any of the year, month, or day isn’t a default value.

pdfnaut.common.dates.has_time(date: datetime) → bool

Returns whether date has a time component, that is, whether any of the hour, minute, second, or microsecond isn’t a default value.

pdfnaut.common.dates.has_timezone(date: datetime) → bool

Returns whether date specifies a timezone other than UTC.
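The three predicates above can be approximated as follows. This is a sketch under stated assumptions rather than the library's code: the function names are hypothetical, the date defaults are taken to be year 1, month 1, day 1, the time defaults are taken to be zero, and a naive datetime is taken to count as UTC.

```python
from datetime import datetime, timedelta, timezone


def has_date_sketch(date: datetime) -> bool:
    # Assumption: the date defaults are year 1, month 1, day 1 (datetime.min).
    return (date.year, date.month, date.day) != (1, 1, 1)


def has_time_sketch(date: datetime) -> bool:
    # Assumption: the time defaults are all zero (midnight).
    return (date.hour, date.minute, date.second, date.microsecond) != (0, 0, 0, 0)


def has_timezone_sketch(date: datetime) -> bool:
    # Assumption: a naive datetime counts as UTC, so it reports False.
    offset = date.utcoffset()
    return offset is not None and offset != timedelta(0)
```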

pdfnaut.common.dates.parse_iso8601(date_string: str) → datetime

Parses a date string conforming to the ISO 8601 profile specified in https://www.w3.org/TR/NOTE-datetime into a datetime.datetime object.

pdfnaut.common.dates.parse_iso8824(date_string: str) → datetime

Parses an ISO/IEC 8824 date string into a datetime.datetime object (for example, D:20010727133720).

This is the type of date string described in ISO 32000-2:2020 § 7.9.4 “Dates”.
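To show how such a string breaks down, here is a minimal regex-based parser for the PDF date format. It is an illustrative sketch, not pdfnaut's parser: parse_pdf_date is a hypothetical name, omitted components fall back to 1 (month, day) or 0 (time), and a missing offset is read as UTC, which is an assumption of this sketch.

```python
import re
from datetime import datetime, timedelta, timezone

# Everything after the four-digit year is optional.
_PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<tz_hour>\d{2})'?(?P<tz_minute>\d{2})?'?)?)?$"
)


def parse_pdf_date(date_string: str) -> datetime:
    """Hypothetical parser for strings such as D:20010727133720."""
    match = _PDF_DATE.match(date_string)
    if match is None:
        raise ValueError(f"not a PDF date string: {date_string!r}")
    parts = match.groupdict()

    sign = parts["sign"]
    if sign is None or sign == "Z":
        tzinfo = timezone.utc  # assumption: a missing offset is read as UTC
    else:
        delta = timedelta(hours=int(parts["tz_hour"] or 0),
                          minutes=int(parts["tz_minute"] or 0))
        tzinfo = timezone(-delta if sign == "-" else delta)

    return datetime(
        int(parts["year"]), int(parts["month"] or 1), int(parts["day"] or 1),
        int(parts["hour"] or 0), int(parts["minute"] or 0), int(parts["second"] or 0),
        tzinfo=tzinfo,
    )
```

The optional trailing apostrophe in the pattern accepts the older D:YYYYMMDDHHmmSSOHH'mm' spelling as well.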

Utils

pdfnaut.common.utils.clone_into_document(dest: PdfParser, root: PdfObject | PdfStream, *, ignore_keys: list[str] | None = None) → PdfObject | PdfStream

Clones an object root and its contents into document dest. Returns the cloned object.

If the root object is a dictionary and the ignore_keys argument is provided, those keys will be ignored when cloning the root object.

Cloning of an object is performed by deep-copying each element contained in it. When a reference is found, it is determined whether it is suitable for cloning into the document.

A reference is considered suitable for cloning if it does not refer back to the root object. If the reference is to root itself, a placeholder is added in its place. If the reference may point back to the object (for example, a reference to a page tree), it is nulled.

If the reference is suitable, its contents are added into the document and the new reference replaces the old reference in the object.
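The cloning procedure described above can be sketched on a simplified model in which documents are plain dictionaries keyed by object number. Everything here is hypothetical and only approximates the description: Ref and clone_into are illustrative names, the page tree check by a "Type" value of "Pages" is an assumption, and the placeholder is modelled by reserving the new object number before its contents are copied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for an indirect reference (an object number)."""
    num: int


def clone_into(dest: dict, source: dict, root: Ref, ignore_keys=()) -> Ref:
    """Copy the object behind root from source into dest; return its new reference."""
    mapping = {}  # old Ref -> new Ref, filled in before recursing so cycles terminate

    def reserve(old: Ref) -> Ref:
        new = Ref(max(dest, default=0) + 1)
        dest[new.num] = None  # placeholder until the copy is complete
        mapping[old] = new
        return new

    def copy(value, skip=()):
        if isinstance(value, Ref):
            if value in mapping:  # already cloned, or a back-reference to root
                return mapping[value]
            target = source[value.num]
            if isinstance(target, dict) and target.get("Type") == "Pages":
                return None  # may lead back to the whole tree, so it is nulled
            new = reserve(value)
            dest[new.num] = copy(target)
            return new
        if isinstance(value, dict):
            return {k: copy(v) for k, v in value.items() if k not in skip}
        if isinstance(value, list):
            return [copy(v) for v in value]
        return value  # numbers, strings, booleans, None

    new_root = reserve(root)
    dest[new_root.num] = copy(source[root.num], skip=ignore_keys)
    return new_root


source = {
    1: {"Type": "Pages", "Kids": [Ref(2)]},
    2: {"Type": "Page", "Parent": Ref(1), "Contents": Ref(3), "Annots": [Ref(4)], "Junk": 1},
    3: "stream data",
    4: {"Type": "Annot", "P": Ref(2)},
}
dest = {1: "existing", 2: "existing"}
new_page = clone_into(dest, source, Ref(2), ignore_keys=["Junk"])
```

In this run the annotation's back-reference to the page resolves to the page's new number, the parent page tree reference is nulled, and the ignored key is dropped from the root only.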

pdfnaut.common.utils.copy_object(obj: PdfObject | PdfStream) → PdfObject | PdfStream

Performs a deep copy of a PDF object obj. Returns the copied object.

Deep copying works by creating a new object for the container then adding a copy of each element it contains into the new object.

Numbers, literal strings, booleans, and the null object are not copied and are returned as is. Unlike clone_into_document(), when a reference is found, it is simply copied into the object without modifying the referenced object.
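The copying rule can be shown with plain Python containers standing in for PDF arrays and dictionaries. This is an illustrative sketch only: Ref and copy_value are hypothetical names, and the real function operates on pdfnaut's own object types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for an indirect reference."""
    num: int


def copy_value(value):
    # Containers are rebuilt; each element is copied recursively.
    if isinstance(value, dict):
        return {key: copy_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [copy_value(item) for item in value]
    # Scalars are returned as is. A reference is carried over unchanged,
    # and the object it points to is never visited.
    return value


original = {"Kids": [Ref(3), {"Count": 1}], "Title": "Intro"}
duplicate = copy_value(original)
```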

pdfnaut.common.utils.ensure_bytes(contents: PdfHexString | bytes) → bytes

Returns the decoded value of contents if it is an instance of PdfHexString, otherwise returns contents as is.

pdfnaut.common.utils.ensure_object(obj: PdfReference[R] | R) → R

Resolves obj to a direct object if obj is an instance of PdfReference. Otherwise, returns obj as is.

pdfnaut.common.utils.generate_file_id(filename: str, content_size: int) → PdfHexString

Generates a file identifier using filename and content_size as described in ISO 32000-2:2020 § 14.4 “File identifiers”.

File identifiers are values that uniquely distinguish one revision of a document from another. The file identifier is generated using the same information specified in the standard, that is, the current time, the file path and the file size in bytes.
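One way to combine these three inputs is to feed them into a message digest. The sketch below does so with MD5 from the standard library. The choice of MD5, the byte encoding of each input, and the name file_id_sketch are assumptions made for illustration, and the result is a plain hexadecimal str rather than a PdfHexString.

```python
import hashlib
from datetime import datetime, timezone


def file_id_sketch(filename: str, content_size: int) -> str:
    """Hypothetical file identifier: a digest of time, path and size."""
    digest = hashlib.md5()
    digest.update(datetime.now(timezone.utc).isoformat().encode())  # current time
    digest.update(filename.encode())                                # file path
    digest.update(str(content_size).encode())                       # size in bytes
    return digest.hexdigest()


identifier = file_id_sketch("document.pdf", 4096)
```

Because the current time is part of the input, repeated calls with the same filename and size are expected to yield different identifiers.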

pdfnaut.common.utils.get_closest(values: Iterable[int], target: int) → int

Returns the integer in values closest to target.
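A closest-value lookup of this kind can be expressed with min() and a distance key. The helper below is a hypothetical equivalent, not the library's code. How ties are broken is not stated in the description; in this sketch the first of two equally close values wins.

```python
from typing import Iterable


def closest(values: Iterable[int], target: int) -> int:
    # min() keeps the first element with the smallest absolute distance.
    return min(values, key=lambda value: abs(value - target))
```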

pdfnaut.common.utils.is_page_or_page_tree(obj: PdfObject | PdfStream) → bool

Reports whether an object obj is a page object or a page tree node.