Access a PDF page

Access PDF pages to add, copy, delete or rearrange in Python

To access a PDF page:

doc = PDFDoc(filename)

# Access a PDF page
page = doc.GetPage(page_num)

Merge, copy, delete, and rearrange PDF pages
Full code sample that illustrates how to copy pages from one document to another, how to delete and rearrange pages, and how to use the ImportPages() method for efficient copy and merge operations.

About working with pages

A high-level PDF document contains a sequence of Page objects, as illustrated in the following figure:

(Figure: PDFDoc page sequence.)

To find the number of pages in a PDF document, call PDFDoc.GetPageCount().

To retrieve a specific page of a document, use PDFDoc.GetPage(page_num). Page numbers in the document's page sequence are indexed from 1. If the given page number doesn't index a page in the current document, GetPage(page_num) returns None. For example:

page = doc.GetPage(page_num)
if page is None:
  # GetPage() found no page at this 1-based index
  print("Document does not contain page#: %d" % (page_num))
else:
  print("Document does contain page#: %d" % (page_num))
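Because page numbers start at 1, a page number can also be validated against the page count before the page is fetched. The following is a minimal sketch of that idea, not part of the SDK: has_page is a hypothetical helper, and CountOnlyDoc is a stand-in class that only mimics PDFDoc.GetPageCount() so the example runs without the library installed.

```python
def has_page(doc, page_num):
    """Return True if page_num is a valid 1-based page number for doc."""
    # Page numbers are indexed from 1, so valid values run from
    # 1 up to and including GetPageCount().
    return 1 <= page_num <= doc.GetPageCount()


class CountOnlyDoc:
    """Hypothetical stand-in for PDFDoc; it only knows its page count."""

    def __init__(self, page_count):
        self._page_count = page_count

    def GetPageCount(self):
        return self._page_count


doc = CountOnlyDoc(3)
for page_num in (0, 1, 3, 4):
    if has_page(doc, page_num):
        print("Document does contain page#: %d" % (page_num))
    else:
        print("Document does not contain page#: %d" % (page_num))
```

With a real document, the same helper should work on any object that provides GetPageCount(), assuming that method behaves as described above.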

While GetPage(i) is convenient for retrieving an individual page, it's an inefficient way to enumerate every page of a document. It's better to traverse the pages with a PageIterator.

To do so, call PDFDoc.GetPageIterator(). This returns a PageIterator object, which provides HasNext(), Next() and Current() methods. The following code snippet shows how to print the page size for every page in the document's page sequence:

itr = doc.GetPageIterator()
while itr.HasNext():
  mediabox = itr.Current().GetMediaBox()
  print("Media box: %f, %f, %f, %f" % (mediabox.x1, mediabox.y1, mediabox.x2, mediabox.y2))
  itr.Next()

(This code finds the page size using the page's media box, which we'll talk more about in the following sections.)
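The HasNext(), Current(), Next() loop can be wrapped in a Python generator so that pages can be consumed with an ordinary for loop. The sketch below is one possible way to do this, not an SDK feature: iter_pages and media_box_size are hypothetical helpers, and the Stub classes only imitate the methods and attributes used in the snippet above so the example runs without the library. It also assumes the media box has x2 >= x1 and y2 >= y1.

```python
def iter_pages(doc):
    """Yield each page of doc in order, using the PageIterator protocol."""
    itr = doc.GetPageIterator()
    while itr.HasNext():
        yield itr.Current()
        itr.Next()


def media_box_size(page):
    """Return (width, height) computed from the page's media box corners."""
    box = page.GetMediaBox()
    # Assumes a normalized rectangle: x2 >= x1 and y2 >= y1.
    return (box.x2 - box.x1, box.y2 - box.y1)


# --- Hypothetical stand-ins so the sketch runs without the SDK ---
class StubRect:
    def __init__(self, x1, y1, x2, y2):
        self.x1, self.y1, self.x2, self.y2 = x1, y1, x2, y2


class StubPage:
    def __init__(self, rect):
        self._rect = rect

    def GetMediaBox(self):
        return self._rect


class StubIterator:
    def __init__(self, pages):
        self._pages = pages
        self._pos = 0

    def HasNext(self):
        return self._pos < len(self._pages)

    def Current(self):
        return self._pages[self._pos]

    def Next(self):
        self._pos += 1


class StubDoc:
    def __init__(self, pages):
        self._pages = list(pages)

    def GetPageIterator(self):
        return StubIterator(self._pages)


doc = StubDoc([StubPage(StubRect(0, 0, 612, 792)),
               StubPage(StubRect(0, 0, 595, 842))])
sizes = [media_box_size(page) for page in iter_pages(doc)]
print(sizes)
```

The generator keeps the single pass over the page sequence that the iterator provides, while letting calling code use comprehensions and for loops.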

To jump to a specific page with a PageIterator, call PDFDoc.GetPageIterator(page_num). If no such page exists, GetIndex() on the iterator's current page returns 0. For example:

itr = doc.GetPageIterator(page_num)
if itr.Current().GetIndex() > 0:
  print("Document does contain page#: %d" % (page_num))
else:
  print("Document does not contain page#: %d" % (page_num))
