The PdfWriter Class

class pypdf.PdfWriter(fileobj: Union[str, IO] = '')[source]

Bases: object

Write a PDF file out, given pages produced by another class.

Typically data is added from a PdfReader.

property pdf_header: bytes

Header of the PDF document that is written.

This should be something like b'%PDF-1.5'. It is recommended to set the lowest version that supports all features which are used within the PDF file.

get_object(indirect_reference: Union[None, int, IndirectObject] = None, ido: Optional[IndirectObject] = None) PdfObject[source]
getObject(ido: Union[int, IndirectObject]) PdfObject[source]

Use get_object() instead.

Deprecated since version 1.28.0.

set_need_appearances_writer() None[source]
add_page(page: PageObject, excluded_keys: Iterable[str] = ()) PageObject[source]

Add a page to this PDF file.

Recommended for advanced usage, including passing suitable excluded_keys.

The page is usually acquired from a PdfReader instance.

Parameters
  • page – The page to add to the document. Should be an instance of PageObject

  • excluded_keys

Returns

The added PageObject.

addPage(page: PageObject, excluded_keys: Iterable[str] = ()) PageObject[source]

Use add_page() instead.

Deprecated since version 1.28.0.

insert_page(page: PageObject, index: int = 0, excluded_keys: Iterable[str] = ()) PageObject[source]

Insert a page in this PDF file. The page is usually acquired from a PdfReader instance.

Parameters
  • page – The page to add to the document.

  • index – Position at which the page will be inserted.

  • excluded_keys

Returns

The added PageObject.

insertPage(page: PageObject, index: int = 0, excluded_keys: Iterable[str] = ()) PageObject[source]

Use insert_page() instead.

Deprecated since version 1.28.0.

get_page(page_number: Optional[int] = None, pageNumber: Optional[int] = None) PageObject[source]

Retrieve a page by number from this PDF file.

Parameters

page_number – The page number to retrieve (pages begin at zero)

Returns

The page at the index given by page_number

getPage(pageNumber: int) PageObject[source]

Use writer.pages[page_number] instead.

Deprecated since version 1.28.0.

getNumPages() int[source]

Use len(writer.pages) instead.

Deprecated since version 1.28.0.

property pages: List[PageObject]

Property that emulates a list of PageObject.

add_blank_page(width: Optional[float] = None, height: Optional[float] = None) PageObject[source]

Append a blank page to this PDF file and return it.

If no page size is specified, use the size of the last page.

Parameters
  • width – The width of the new page expressed in default user space units.

  • height – The height of the new page expressed in default user space units.

Returns

The newly appended page

Raises

PageSizeNotDefinedError – if width and height are not defined and previous page does not exist.

addBlankPage(width: Optional[float] = None, height: Optional[float] = None) PageObject[source]

Use add_blank_page() instead.

Deprecated since version 1.28.0.

insert_blank_page(width: Optional[Union[float, Decimal]] = None, height: Optional[Union[float, Decimal]] = None, index: int = 0) PageObject[source]

Insert a blank page into this PDF file and return it.

If no page size is specified, use the size of the last page.

Parameters
  • width – The width of the new page expressed in default user space units.

  • height – The height of the new page expressed in default user space units.

  • index – Position to add the page.

Returns

The newly inserted page

Raises

PageSizeNotDefinedError – if width and height are not defined and previous page does not exist.

insertBlankPage(width: Optional[Union[float, Decimal]] = None, height: Optional[Union[float, Decimal]] = None, index: int = 0) PageObject[source]

Use insert_blank_page() instead.

Deprecated since version 1.28.0.

property open_destination: Union[None, Destination, pypdf.generic._base.TextStringObject, pypdf.generic._base.ByteStringObject]

Property to access the opening destination (/OpenAction entry in the PDF catalog). It returns None if the entry does not exist or is not set.

Raises

Exception – If a destination is invalid

add_js(javascript: str) None[source]

Add JavaScript which will launch upon opening this PDF.

Parameters

javascript – Your JavaScript.

>>> # Example: this launches the print dialog when the PDF is opened.
>>> output.add_js("this.print({bUI:true,bSilent:false,bShrinkToFit:true});")
addJS(javascript: str) None[source]

Use add_js() instead.

Deprecated since version 1.28.0.

add_attachment(filename: str, data: Union[str, bytes]) None[source]

Embed a file inside the PDF.

Reference: https://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf Section 7.11.3

Parameters
  • filename – The filename to display.

  • data – The data in the file.

addAttachment(fname: str, fdata: Union[str, bytes]) None[source]

Use add_attachment() instead.

Deprecated since version 1.28.0.

append_pages_from_reader(reader: PdfReader, after_page_append: Optional[Callable[[PageObject], None]] = None) None[source]

Copy pages from reader to writer. Includes an optional callback parameter which is invoked after pages are appended to the writer.

append should be preferred.

Parameters
  • reader – A PdfReader object from which to copy page annotations to this writer object. The writer’s annotations will then be updated.

  • after_page_append – Callback function that is invoked after each page is appended to the writer. Its single parameter is a reference to the page just appended to the document.

appendPagesFromReader(reader: PdfReader, after_page_append: Optional[Callable[[PageObject], None]] = None) None[source]

Use append_pages_from_reader() instead.

Deprecated since version 1.28.0.

update_page_form_field_values(page: ~pypdf._page.PageObject, fields: ~typing.Dict[str, ~typing.Any], flags: ~pypdf.constants.FieldFlag = FieldFlag.None) None[source]

Update the form field values for a given page from a fields dictionary.

Copy field texts and values from fields to page. If the field links to a parent object, add the information to the parent.

Parameters
  • page – Page reference from PDF writer where the annotations and field data will be updated.

  • fields – a Python dictionary of field names (/T) and text values (/V)

  • flags – An integer (0 to 7). The first bit sets ReadOnly, the second bit sets Required, the third bit sets NoExport. See PDF Reference Table 8.70 for details.
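The flags value is a plain bit field, so the three bits described above can be combined with bitwise OR. The constant names below are hypothetical and only serve the illustration; they are not taken from pypdf.

```python
# Hypothetical names for the three flag bits described above.
READ_ONLY = 1 << 0  # first bit
REQUIRED = 1 << 1   # second bit
NO_EXPORT = 1 << 2  # third bit

# A field that is read-only and excluded from export, but not required.
flags = READ_ONLY | NO_EXPORT

is_required = bool(flags & REQUIRED)
all_flags = READ_ONLY | REQUIRED | NO_EXPORT
```

Here flags evaluates to 5 and all_flags to 7, the upper end of the documented 0 to 7 range.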

updatePageFormFieldValues(page: ~pypdf._page.PageObject, fields: ~typing.Dict[str, ~typing.Any], flags: ~pypdf.constants.FieldFlag = FieldFlag.None) None[source]

Use update_page_form_field_values() instead.

Deprecated since version 1.28.0.

clone_reader_document_root(reader: PdfReader) None[source]

Copy the reader document root to the writer and all sub elements, including pages, threads, outlines,… For partial insertion, append should be considered.

Parameters

reader – PdfReader from which the document root should be copied.

cloneReaderDocumentRoot(reader: PdfReader) None[source]

Use clone_reader_document_root() instead.

Deprecated since version 1.28.0.

clone_document_from_reader(reader: PdfReader, after_page_append: Optional[Callable[[PageObject], None]] = None) None[source]

Create a copy (clone) of a document from a PDF file reader, cloning the ‘/Root’, ‘/Info’ and ‘/ID’ sections of the PDF.

Parameters
  • reader – PDF file reader instance from which the clone should be created.

  • after_page_append – Callback function that is invoked after each page is appended to the writer (delegates to append_pages_from_reader). The single parameter of the callback is a reference to the page just appended to the document.

cloneDocumentFromReader(reader: PdfReader, after_page_append: Optional[Callable[[PageObject], None]] = None) None[source]

Use clone_document_from_reader() instead.

Deprecated since version 1.28.0.

encrypt(user_password: ~typing.Optional[str] = None, owner_password: ~typing.Optional[str] = None, use_128bit: bool = True, permissions_flag: ~pypdf.constants.UserAccessPermissions = UserAccessPermissions.None, user_pwd: ~typing.Optional[str] = None, owner_pwd: ~typing.Optional[str] = None) None[source]

Encrypt this PDF file with the PDF Standard encryption handler.

Parameters
  • user_password – The password which allows for opening and reading the PDF file with the restrictions provided.

  • owner_password – The password which allows for opening the PDF file without any restrictions. By default, the owner password is the same as the user password.

  • use_128bit – Whether to use 128-bit encryption. When false, 40-bit encryption is used. By default, this flag is on.

  • permissions_flag – Permissions as described in Table 3.20 of the PDF 1.7 specification. A bit value of 1 means the permission is granted. Hence an integer value of -1 will set all flags. Bit position 3 is for printing, 4 is for modifying content, 5 and 6 control annotations, 9 for form fields, 10 for extraction of text and graphics.

write_stream(stream: IO) None[source]
write(stream: Union[Path, str, IO]) Tuple[bool, IO][source]

Write the collection of pages added to this object out as a PDF file.

Parameters

stream – An object to write the file to. It can be a file-like object supporting the write and tell methods, or a file path, as with fileobj. The parameter is named stream to keep existing workflows working.

Returns

A tuple (bool, IO)

add_metadata(infos: Dict[str, Any]) None[source]

Add custom metadata to the output.

Parameters

infos – a Python dictionary where each key is a field and each value is your new metadata.

addMetadata(infos: Dict[str, Any]) None[source]

Use add_metadata() instead.

Deprecated since version 1.28.0.

get_reference(obj: PdfObject) IndirectObject[source]
getReference(obj: PdfObject) IndirectObject[source]

Use get_reference() instead.

Deprecated since version 1.28.0.

get_outline_root() TreeObject[source]
get_threads_root() ArrayObject[source]

The list of threads.

See §8.3.2 from PDF 1.7 spec.

Returns

An array (possibly empty) of Dictionaries with /F and /I properties.

property threads: pypdf.generic._data_structures.ArrayObject

Read-only property for the list of threads.

See §8.3.2 from PDF 1.7 spec.

Each element is a dictionary with /F and /I keys.

getOutlineRoot() TreeObject[source]

Use get_outline_root() instead.

Deprecated since version 1.28.0.

get_named_dest_root() ArrayObject[source]
getNamedDestRoot() ArrayObject[source]

Use get_named_dest_root() instead.

Deprecated since version 1.28.0.

add_outline_item_destination(page_destination: Union[None, PageObject, TreeObject] = None, parent: Union[None, TreeObject, IndirectObject] = None, before: Union[None, TreeObject, IndirectObject] = None, dest: Union[None, PageObject, TreeObject] = None) IndirectObject[source]
add_bookmark_destination(dest: Union[PageObject, TreeObject], parent: Union[None, TreeObject, IndirectObject] = None) IndirectObject[source]

Use add_outline_item_destination() instead.

Deprecated since version 2.9.0.

addBookmarkDestination(dest: PageObject, parent: Optional[TreeObject] = None) IndirectObject[source]

Use add_outline_item_destination() instead.

Deprecated since version 1.28.0.

add_outline_item_dict(outline_item: Union[OutlineItem, Destination], parent: Union[None, TreeObject, IndirectObject] = None, before: Union[None, TreeObject, IndirectObject] = None) IndirectObject[source]
add_bookmark_dict(outline_item: Union[OutlineItem, Destination], parent: Optional[TreeObject] = None) IndirectObject[source]

Use add_outline_item_dict() instead.

Deprecated since version 2.9.0.

addBookmarkDict(outline_item: Union[OutlineItem, Destination], parent: Optional[TreeObject] = None) IndirectObject[source]

Use add_outline_item_dict() instead.

Deprecated since version 1.28.0.

add_outline_item(title: str, page_number: ~typing.Union[None, ~pypdf._page.PageObject, ~pypdf.generic._base.IndirectObject, int], parent: ~typing.Union[None, ~pypdf.generic._data_structures.TreeObject, ~pypdf.generic._base.IndirectObject] = None, before: ~typing.Union[None, ~pypdf.generic._data_structures.TreeObject, ~pypdf.generic._base.IndirectObject] = None, color: ~typing.Optional[~typing.Union[~typing.Tuple[float, float, float], str]] = None, bold: bool = False, italic: bool = False, fit: ~pypdf.generic._fit.Fit = <pypdf.generic._fit.Fit object>, pagenum: ~typing.Optional[int] = None) IndirectObject[source]

Add an outline item (commonly referred to as a “Bookmark”) to the PDF file.

Parameters
  • title – Title to use for this outline item.

  • page_number – Page number this outline item will point to.

  • parent – A reference to a parent outline item to create nested outline items.

  • before

  • color – Color of the outline item’s font as a red, green, blue tuple from 0.0 to 1.0 or as a Hex String (#RRGGBB)

  • bold – Outline item font is bold

  • italic – Outline item font is italic

  • fit – The fit of the destination page.

Returns

The added outline item as an indirect object.

add_bookmark(title: str, pagenum: int, parent: ~typing.Union[None, ~pypdf.generic._data_structures.TreeObject, ~pypdf.generic._base.IndirectObject] = None, color: ~typing.Optional[~typing.Tuple[float, float, float]] = None, bold: bool = False, italic: bool = False, fit: typing_extensions.Literal[/Fit, /XYZ, /FitH, /FitV, /FitR, /FitB, /FitBH, /FitBV] = '/Fit', *args: ~typing.Union[~pypdf.generic._base.NumberObject, ~pypdf.generic._base.NullObject, float]) IndirectObject[source]

Use add_outline_item() instead.

Deprecated since version 2.9.0.

addBookmark(title: str, pagenum: int, parent: ~typing.Union[None, ~pypdf.generic._data_structures.TreeObject, ~pypdf.generic._base.IndirectObject] = None, color: ~typing.Optional[~typing.Tuple[float, float, float]] = None, bold: bool = False, italic: bool = False, fit: typing_extensions.Literal[/Fit, /XYZ, /FitH, /FitV, /FitR, /FitB, /FitBH, /FitBV] = '/Fit', *args: ~typing.Union[~pypdf.generic._base.NumberObject, ~pypdf.generic._base.NullObject, float]) IndirectObject[source]

Use add_outline_item() instead.

Deprecated since version 1.28.0.

add_outline() None[source]
add_named_destination_array(title: TextStringObject, destination: Union[IndirectObject, ArrayObject]) None[source]
add_named_destination_object(page_destination: Optional[PdfObject] = None, dest: Optional[PdfObject] = None) IndirectObject[source]
addNamedDestinationObject(dest: Destination) IndirectObject[source]

Use add_named_destination_object() instead.

Deprecated since version 1.28.0.

add_named_destination(title: str, page_number: Optional[int] = None, pagenum: Optional[int] = None) IndirectObject[source]
addNamedDestination(title: str, pagenum: int) IndirectObject[source]

Use add_named_destination() instead.

Deprecated since version 1.28.0.

Remove links and annotations from this output.

Use remove_links() instead.

Deprecated since version 1.28.0.

remove_images(ignore_byte_string_object: bool = False) None[source]

Remove images from this output.

Parameters

ignore_byte_string_object – optional parameter to ignore ByteString Objects.

removeImages(ignoreByteStringObject: bool = False) None[source]

Use remove_images() instead.

Deprecated since version 1.28.0.

remove_text(ignore_byte_string_object: bool = False) None[source]

Remove text from this output.

Parameters

ignore_byte_string_object – optional parameter

removeText(ignoreByteStringObject: bool = False) None[source]

Use remove_text() instead.

Deprecated since version 1.28.0.

add_uri(page_number: int, uri: str, rect: RectangleObject, border: Optional[ArrayObject] = None, pagenum: Optional[int] = None) None[source]

Add a URI from a rectangular area to the specified page.

This uses the basic structure of add_link()

Parameters
  • page_number – index of the page on which to place the URI action.

  • uri – URI of resource to link to.

  • rect – RectangleObject or array of four integers specifying the clickable rectangular area [xLL, yLL, xUR, yUR], or string in the form "[ xLL yLL xUR yUR ]".

  • border – if provided, an array describing border-drawing properties. See the PDF spec for details. No border will be drawn if this argument is omitted.

addURI(pagenum: int, uri: str, rect: RectangleObject, border: Optional[ArrayObject] = None) None[source]

Use add_uri() instead.

Deprecated since version 1.28.0.

Use add_link() instead.

Deprecated since version 1.28.0.

getPageLayout() Literal[/NoLayout, /SinglePage, /OneColumn, /TwoColumnLeft, /TwoColumnRight, /TwoPageLeft, /TwoPageRight]][source]

Use page_layout instead.

Deprecated since version 1.28.0.

set_page_layout(layout: typing_extensions.Literal[/NoLayout, /SinglePage, /OneColumn, /TwoColumnLeft, /TwoColumnRight, /TwoPageLeft, /TwoPageRight]) None[source]

Set the page layout.

Parameters

layout – The page layout to be used

Valid layout arguments

/NoLayout

Layout explicitly not specified

/SinglePage

Show one page at a time

/OneColumn

Show one column at a time

/TwoColumnLeft

Show pages in two columns, odd-numbered pages on the left

/TwoColumnRight

Show pages in two columns, odd-numbered pages on the right

/TwoPageLeft

Show two pages at a time, odd-numbered pages on the left

/TwoPageRight

Show two pages at a time, odd-numbered pages on the right

setPageLayout(layout: typing_extensions.Literal[/NoLayout, /SinglePage, /OneColumn, /TwoColumnLeft, /TwoColumnRight, /TwoPageLeft, /TwoPageRight]) None[source]

Use page_layout instead.

Deprecated since version 1.28.0.

property page_layout: Optional[typing_extensions.Literal[/NoLayout, /SinglePage, /OneColumn, /TwoColumnLeft, /TwoColumnRight, /TwoPageLeft, /TwoPageRight]]

Page layout property.

Valid layout values

/NoLayout

Layout explicitly not specified

/SinglePage

Show one page at a time

/OneColumn

Show one column at a time

/TwoColumnLeft

Show pages in two columns, odd-numbered pages on the left

/TwoColumnRight

Show pages in two columns, odd-numbered pages on the right

/TwoPageLeft

Show two pages at a time, odd-numbered pages on the left

/TwoPageRight

Show two pages at a time, odd-numbered pages on the right

property pageLayout: Optional[typing_extensions.Literal[/NoLayout, /SinglePage, /OneColumn, /TwoColumnLeft, /TwoColumnRight, /TwoPageLeft, /TwoPageRight]]

Use page_layout instead.

Deprecated since version 1.28.0.

getPageMode() Literal[/UseNone, /UseOutlines, /UseThumbs, /FullScreen, /UseOC, /UseAttachments]][source]

Use page_mode instead.

Deprecated since version 1.28.0.

set_page_mode(mode: typing_extensions.Literal[/UseNone, /UseOutlines, /UseThumbs, /FullScreen, /UseOC, /UseAttachments]) None[source]

Use page_mode instead.

Deprecated since version 1.28.0.

setPageMode(mode: typing_extensions.Literal[/UseNone, /UseOutlines, /UseThumbs, /FullScreen, /UseOC, /UseAttachments]) None[source]

Use page_mode instead.

Deprecated since version 1.28.0.

property page_mode: Optional[typing_extensions.Literal[/UseNone, /UseOutlines, /UseThumbs, /FullScreen, /UseOC, /UseAttachments]]

Page mode property.

Valid mode values

/UseNone

Do not show outline or thumbnails panels

/UseOutlines

Show outline (aka bookmarks) panel

/UseThumbs

Show page thumbnails panel

/FullScreen

Fullscreen view

/UseOC

Show Optional Content Group (OCG) panel

/UseAttachments

Show attachments panel

property pageMode: Optional[typing_extensions.Literal[/UseNone, /UseOutlines, /UseThumbs, /FullScreen, /UseOC, /UseAttachments]]

Use page_mode instead.

Deprecated since version 1.28.0.

add_annotation(page_number: int, annotation: Dict[str, Any]) None[source]
clean_page(page: Union[PageObject, IndirectObject]) PageObject[source]

Perform some clean-up on the page. Currently: convert NameObject named destinations to TextStringObject (required for the names/dests list).

Parameters

page

Returns

The cleaned PageObject

append(fileobj: Union[str, IO, PdfReader, Path], outline_item: Union[str, None, PageRange, Tuple[int, int], Tuple[int, int, int], List[int]] = None, pages: Union[None, PageRange, Tuple[int, int], Tuple[int, int, int], List[int]] = None, import_outline: bool = True, excluded_fields: Optional[Union[List[str], Tuple[str, ...]]] = None) None[source]

Identical to the merge() method, but assumes you want to concatenate all pages onto the end of the file instead of specifying a position.

Parameters
  • fileobj – A File Object or an object that supports the standard read and seek methods similar to a File Object. Could also be a string representing a path to a PDF file.

  • outline_item – Optionally, you may specify a string to build an outline (aka ‘bookmark’) to identify the beginning of the included file.

  • pages – Can be a PageRange or a (start, stop[, step]) tuple or a list of pages to be processed to merge only the specified range of pages from the source document into the output document.

  • import_outline – You may prevent the source document’s outline (collection of outline items, previously referred to as ‘bookmarks’) from being imported by specifying this as False.

  • excluded_fields – Provide the list of fields/keys to be ignored. If /Annots is part of the list, the annotations will be ignored. If /B is part of the list, the articles will be ignored.

merge(position: Optional[int], fileobj: Union[Path, str, IO, PdfReader], outline_item: Optional[str] = None, pages: Optional[Union[str, PageRange, Tuple[int, int], Tuple[int, int, int], List[int]]] = None, import_outline: bool = True, excluded_fields: Optional[Union[List[str], Tuple[str, ...]]] = ()) None[source]

Merge the pages from the given file into the output file at the specified page number.

Parameters
  • position – The page number at which to insert this file. The file will be inserted after the given number.

  • fileobj – A File Object or an object that supports the standard read and seek methods similar to a File Object. Could also be a string representing a path to a PDF file.

  • outline_item – Optionally, you may specify a string to build an outline (aka ‘bookmark’) to identify the beginning of the included file.

  • pages – can be a PageRange or a (start, stop[, step]) tuple or a list of pages to be processed to merge only the specified range of pages from the source document into the output document.

  • import_outline – You may prevent the source document’s outline (collection of outline items, previously referred to as ‘bookmarks’) from being imported by specifying this as False.

  • excluded_fields – Provide the list of fields/keys to be ignored. If /Annots is part of the list, the annotations will be ignored. If /B is part of the list, the articles will be ignored.

Raises

TypeError – The pages attribute is not configured properly

add_filtered_articles(fltr: Union[Pattern, str], pages: Dict[int, PageObject], reader: PdfReader) None[source]

Add articles matching the defined criteria.

Parameters
  • fltr

  • pages

  • reader

close() None[source]

To match the functions from Merger.

find_outline_item(outline_item: Dict[str, Any], root: Optional[List[Union[Destination, List[Union[Destination, List[Destination]]]]]] = None) Optional[List[int]][source]
find_bookmark(outline_item: Dict[str, Any], root: Optional[List[Union[Destination, List[Union[Destination, List[Destination]]]]]] = None) Optional[List[int]][source]

Deprecated since version 2.9.0: Use find_outline_item() instead.

reset_translation(reader: Union[None, PdfReader, IndirectObject] = None) None[source]

Reset the translation table between reader and the writer object.

Late cloning will create new independent objects.

Parameters

reader – PdfReader or IndirectObject referring to a PdfReader object. If set to None or omitted, all tables will be reset.

set_page_label(page_index_from: int, page_index_to: int, style: Optional[PageLabelStyle] = None, prefix: Optional[str] = None, start: Optional[int] = 0) None[source]

Set a page label to a range of pages.

Page indexes must be given starting from 0. Labels must have a style, a prefix or both. If no page label is assigned to a range, a decimal label starting from 1 is applied.

Parameters
  • page_index_from – page index of the beginning of the range starting from 0

  • page_index_to – page index of the end of the range starting from 0

  • style – The numbering style to be used for the numeric portion of each page label:

      • ‘/D’ – Decimal arabic numerals
      • ‘/R’ – Uppercase roman numerals
      • ‘/r’ – Lowercase roman numerals
      • ‘/A’ – Uppercase letters (A to Z for the first 26 pages, AA to ZZ for the next 26, and so on)
      • ‘/a’ – Lowercase letters (a to z for the first 26 pages, aa to zz for the next 26, and so on)

  • prefix – The label prefix for page labels in this range.

  • start – The value of the numeric portion for the first page label in the range. Subsequent pages are numbered sequentially from this value, which must be greater than or equal to 1. Default value: 1.