The DocumentInformation Class
- class pypdf.DocumentInformation
Bases: DictionaryObject
A class representing the basic document metadata provided in a PDF file. This class is accessible through PdfReader.metadata.
All text properties of the document metadata come in two forms, e.g. author and author_raw. The non-raw property returns a TextStringObject (or None if the field is not set), making it ideal when the metadata is being displayed. The raw property can sometimes return a ByteStringObject if pypdf was unable to decode the string’s text encoding; the caller must handle that case, so the raw properties are accessed less often.
- getText(key: str) -> Optional[str]
Use the attributes (e.g. title / author) instead.
Deprecated since version 1.28.0.
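As a rough illustration of attribute-based access, the sketch below collects whichever text properties are set. The helper name collect_text_metadata and the SimpleNamespace stand-in are hypothetical and not part of pypdf; the stand-in only mimics the attribute names listed on this page so the example runs without a PDF file.

```python
from types import SimpleNamespace
from typing import Dict, Optional

# Attribute names taken from the text properties documented on this page.
TEXT_FIELDS = ("title", "author", "subject", "creator", "producer")


def collect_text_metadata(info: Optional[object]) -> Dict[str, str]:
    """Return the text properties that are set, keyed by attribute name."""
    if info is None:
        # Assumption: the caller may not have a metadata object at all.
        return {}
    found: Dict[str, str] = {}
    for name in TEXT_FIELDS:
        value = getattr(info, name, None)
        if value is not None:
            found[name] = str(value)
    return found


# Hypothetical stand-in for the object exposed as PdfReader.metadata.
sample = SimpleNamespace(
    title="Annual Report",
    author="Jane Doe",
    subject=None,
    creator="OpenOffice",
    producer=None,
)
summary = collect_text_metadata(sample)
```

With pypdf installed, the same function could presumably be given PdfReader("example.pdf").metadata, where example.pdf is a placeholder file name.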
- property title: Optional[str]
Read-only property accessing the document’s title.
Returns a TextStringObject, or None if the title is not specified.
- property author: Optional[str]
Read-only property accessing the document’s author.
Returns a TextStringObject, or None if the author is not specified.
- property subject: Optional[str]
Read-only property accessing the document’s subject.
Returns a TextStringObject, or None if the subject is not specified.
- property creator: Optional[str]
Read-only property accessing the document’s creator.
If the document was converted to PDF from another format, this is the name of the application (e.g. OpenOffice) that created the original document from which it was converted.
Returns a TextStringObject, or None if the creator is not specified.
- property producer: Optional[str]
Read-only property accessing the document’s producer.
If the document was converted to PDF from another format, this is the name of the application (e.g. OSX Quartz) that converted it to PDF.
Returns a TextStringObject, or None if the producer is not specified.
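Callers that read the raw variants (for example author_raw) have to cope with both text and byte values. The following is a minimal sketch of one way to do that, assuming a ByteStringObject can be treated like Python bytes and a TextStringObject like str. The helper raw_to_text is hypothetical, and the Latin-1 fallback is an illustrative choice rather than pypdf's own behaviour.

```python
from typing import Optional, Union


def raw_to_text(raw: Union[str, bytes, None]) -> Optional[str]:
    """Turn a raw metadata value into displayable text, or None if unset."""
    if raw is None:
        return None
    if isinstance(raw, bytes):
        # Assumption: undecoded values arrive as bytes. Latin-1 maps every
        # byte to a character, so this fallback cannot raise, although the
        # result may not match the encoding the producer intended.
        return raw.decode("latin-1")
    return str(raw)


decoded = raw_to_text(b"J\xf6rg")
unchanged = raw_to_text("Jane Doe")
```

For display purposes the non-raw properties remain the simpler choice; a fallback like this is only relevant when the raw value is needed.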
- property creation_date: Optional[datetime]
Read-only property accessing the document’s creation date.
- property creation_date_raw: Optional[str]
The “raw” version of creation date; can return a ByteStringObject.
Typically in the format D:YYYYMMDDhhmmss[+Z-]hh'mm where the suffix is the offset from UTC.
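To make the raw format concrete, here is a hedged sketch that parses a string of the shape D:YYYYMMDDhhmmss[+Z-]hh'mm with the standard library. It is not pypdf's implementation (the creation_date property already returns a datetime). The helper parse_pdf_date is hypothetical, and it assumes the full date and time fields are present and that an optional trailing apostrophe may follow the minutes of the offset.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

_PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})"
    r"(?P<hour>\d{2})(?P<minute>\d{2})(?P<second>\d{2})"
    r"(?:(?P<sign>[+\-Z])(?:(?P<off_h>\d{2})'?(?P<off_m>\d{2})?'?)?)?$"
)


def parse_pdf_date(raw: str) -> Optional[datetime]:
    """Parse a raw PDF date string; return None if it does not match."""
    match = _PDF_DATE.match(raw)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        moment = datetime(
            int(parts["year"]), int(parts["month"]), int(parts["day"]),
            int(parts["hour"]), int(parts["minute"]), int(parts["second"]),
        )
        sign = parts["sign"]
        if sign is None:
            return moment  # no offset given: leave the value naive
        if sign == "Z":
            return moment.replace(tzinfo=timezone.utc)
        offset = timedelta(
            hours=int(parts["off_h"] or 0),
            minutes=int(parts["off_m"] or 0),
        )
        if sign == "-":
            offset = -offset
        return moment.replace(tzinfo=timezone(offset))
    except ValueError:
        # Out-of-range fields (e.g. month 13) or an oversized offset.
        return None


created = parse_pdf_date("D:20230115103000+01'00'")
```

Strings that omit the offset yield a naive datetime, and anything that does not match the pattern yields None, so callers can fall back to the raw value.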