Nov 8 2022
by Valerie Yates
Like any format, PDF comes with its own jargon, which can all seem very complicated. One pair of terms we see frequently is PDF rendering and PDF viewing. As a commercial PDF SDK vendor, we see these two terms confused very often. So, in this post, we take a closer look.
PDF rendering means turning PDF file contents into images that can then be displayed in a viewer application. In contrast, PDF viewing is the display of those rendered images in a viewer application's viewport. Once a PDF file is rendered, you can interact with the content. All PDF viewers are built on top of rendering functionality.
PDF is much more than simply a display format. In addition to content, a PDF file contains instructions about how to organize and lay out that content.
To display a PDF file, its contents must first be rendered into an image (i.e., display) format. PDF rendering is the process of turning your PDF into an image you can view on screen.
A PDF library first needs to decompress the binary PDF file and parse its contents. Next, the PDF rendering engine converts parsed PDF file contents into drawing operations.
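To make these two steps concrete, here is a minimal Python sketch. It assumes a page content stream compressed with Flate (zlib) and handles only a handful of path operators. The sample stream and the function name `parse_content_stream` are ours for illustration and are not part of any SDK; real content streams also contain names, strings, arrays, and many more operators.

```python
import zlib

# A small subset of PDF path operators and the drawing operation each maps to.
OPERATORS = {
    "m": "move_to",     # x y m
    "l": "line_to",     # x y l
    "c": "curve_to",    # x1 y1 x2 y2 x3 y3 c
    "re": "rectangle",  # x y width height re
    "S": "stroke",      # S
    "f": "fill",        # f
}


def parse_content_stream(compressed: bytes) -> list[tuple[str, list[float]]]:
    """Inflate a Flate-compressed stream and pair each operator with its operands."""
    text = zlib.decompress(compressed).decode("ascii")
    operations = []
    operands = []
    for token in text.split():
        if token in OPERATORS:
            # Operands precede their operator, so flush them when one appears.
            operations.append((OPERATORS[token], operands))
            operands = []
        else:
            operands.append(float(token))
    return operations


# Hypothetical page content: a stroked diagonal line and a filled rectangle.
raw = b"10 10 m 110 60 l S 20 20 50 30 re f"
ops = parse_content_stream(zlib.compress(raw))
for name, args in ops:
    print(name, args)
```

The resulting list of named operations is the kind of input a rendering engine can then paint, whether to a bitmap or to another graphics API.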
Most often, the graphics in a PDF document will be encoded as one of two types of data: raster or vector.
A raster specifies what goes into each pixel of an image. Rasterizing a PDF file converts the document into an image, or "bitmap," made up of pixels.
Vectors are mathematical commands to draw geometry. You can have vector lines, vector shapes, even vector text. For example, a PDF vector content stream can include instructions to draw a straight line of Y length, turn X degrees, and repeat the line. Or with a little extra math, it may specify a curve in the line:
A Bézier curve with control and anchor points. (Source: Wikimedia Commons)
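The "little extra math" behind such a curve is small enough to show. The sketch below evaluates a cubic Bézier curve (the kind a PDF curve instruction describes) from its two anchor points and two control points, then flattens it into short straight segments, which is one common way a renderer prepares a curve for drawing. The function names and the sample points are ours, chosen for illustration.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Return the point at parameter t (0..1) on a cubic Bézier curve.

    p0 and p3 are the anchor points; p1 and p2 are the control points.
    """
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def flatten(p0, p1, p2, p3, segments=8):
    """Approximate the curve with a polyline of `segments` straight pieces."""
    return [cubic_bezier(p0, p1, p2, p3, i / segments) for i in range(segments + 1)]


# Hypothetical arch-shaped curve from (0, 0) to (100, 0).
points = flatten((0, 0), (0, 100), (100, 100), (100, 0))
midpoint = cubic_bezier((0, 0), (0, 100), (100, 100), (100, 0), 0.5)
```

Because the curve is stored as four points instead of pixels, it can be re-evaluated with more segments at any zoom level, which is why vector content stays sharp.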
Most PDFs are vector files, but PDFs can also be saved as raster files. Most PDFs created from CAD (Computer-Aided Design) are vector-based for two reasons: they contain more data, which makes it easier to work with drawing and model content, and vector geometry stays sharp when you zoom in. Measurements and takeoffs (as well as their calibration) also retain precision in a vector PDF because you can use Snap to Content to snap to the vector lines in the PDF.
PDF viewing is when the user interacts with displayed (i.e., rendered) content. Since PDF content is interactive within the viewport, the PDF renderer also needs to be responsive.
A PDF rendering and viewer library typically mounts controls in the viewer UI that allow the user to navigate the content (pan and scroll across, and zoom into content). A PDF library may also provide APIs to control the rendering and viewer behavior programmatically.
During navigation, the viewer communicates to the engine what parts of the document it needs to rasterize and at what scale. As the viewer requests information, the lines, text, and shape instructions that make up a PDF page are painted in memory and the result is used to update the display viewport.
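As a rough illustration of that conversation between viewer and engine, the sketch below turns a scroll position, viewport size, and zoom level into a request for one region of a page at a given pixel size. The `RenderRequest` type and `request_for_viewport` function are hypothetical names of ours, not an SDK API, and the sketch assumes a single page with zoom expressed as device pixels per page unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderRequest:
    page_x: float       # left edge of the region, in page units
    page_y: float       # top edge of the region, in page units
    page_width: float   # region width, in page units
    page_height: float  # region height, in page units
    pixel_width: int    # size of the bitmap to produce
    pixel_height: int


def request_for_viewport(scroll_x, scroll_y, view_w, view_h, zoom, page_w, page_h):
    """Map a viewport (in device pixels) to the page region it shows.

    `zoom` is device pixels per page unit; the region is clipped to the page.
    """
    left = max(0.0, scroll_x / zoom)
    top = max(0.0, scroll_y / zoom)
    right = min(page_w, (scroll_x + view_w) / zoom)
    bottom = min(page_h, (scroll_y + view_h) / zoom)
    width = max(0.0, right - left)
    height = max(0.0, bottom - top)
    return RenderRequest(left, top, width, height,
                         round(width * zoom), round(height * zoom))


# A 612 x 792 page viewed at 2x zoom through an 800 x 600 pixel viewport.
req = request_for_viewport(400, 300, 800, 600, 2.0, 612.0, 792.0)
# Scrolled near the right edge, the region is clipped to the page width.
edge = request_for_viewport(1000, 0, 800, 600, 2.0, 612.0, 792.0)
```

Only the requested region is rasterized, so panning or zooming changes the request without re-rendering the whole page.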
But how does the PDF rendering engine actually perform its job under the hood? How does PDF rendering work?
There are many methods to process PDF for display. These range from server-backed solutions, leveraging open source to serve images down to the application client, to native desktop renderers that use powerful native PDF technology.
Typically, a PDF rendering solution works by unpacking ("parsing") PDF objects for the page. From there, you can take many roads to get your PDF pages to display.
The most challenging case is rendering complex PDFs in the browser, especially in a mobile browser client. Browsers impose limits on memory management, and mismatches between the browser's graphics model and that of PDF can cause inaccuracies in display. Since rendering PDFs of all types in browsers is increasingly important, we'll take a closer look:
One method to support in-browser rendering in a web app or website is to rely on native web technologies, such as <canvas>, supported by the browser vendors. Another method, offering faster display, secure client-side integration, and improved accuracy, is to embed native PDF SDK code, ported to the browser via modern client-side technologies such as WebAssembly (WASM).
For our JS PDF viewer and editor, WebViewer, the display viewport still consists of the
In-browser document rendering via an embedded JS and WASM component
An additional technique to accelerate PDF rendering is
Building a custom viewing application directly on top of PDFTron's
For example, with DocumentViewer, you can programmatically control viewer behavior such as
To ensure a fast yet precise UX, the Core PDFTron engine applies additional techniques as the document displays, including viewport rendering, page caching, and other strategies. For example, depending on the PDF file being viewed, the PDFTron SDK may use one or both of Progressive Rendering and Pre-Rendering:
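To give a feel for how page caching and pre-rendering can work together, here is a generic sketch. It is not the PDFTron implementation: the `PageCache` class, its `capacity` default, and the `fake_render` callable are all assumptions made for illustration. Rendered pages are kept in a least-recently-used cache, and showing a page also renders its immediate neighbors ahead of time.

```python
from collections import OrderedDict


class PageCache:
    """Least-recently-used cache of rendered pages with neighbor pre-rendering."""

    def __init__(self, render_page, capacity=4):
        self._render_page = render_page  # hypothetical callable: page number -> image
        self._capacity = capacity
        self._pages = OrderedDict()

    def get(self, page_number):
        if page_number in self._pages:
            # Cache hit: mark the page as most recently used.
            self._pages.move_to_end(page_number)
            return self._pages[page_number]
        image = self._render_page(page_number)
        self._pages[page_number] = image
        if len(self._pages) > self._capacity:
            # Evict the least recently used page.
            self._pages.popitem(last=False)
        return image

    def show(self, page_number, page_count):
        """Return the requested page, then pre-render the pages next to it."""
        image = self.get(page_number)
        for neighbor in (page_number - 1, page_number + 1):
            if 1 <= neighbor <= page_count:
                self.get(neighbor)
        return image


rendered = []  # records which pages were actually rendered


def fake_render(page_number):
    rendered.append(page_number)
    return f"image-{page_number}"


cache = PageCache(fake_render, capacity=4)
first = cache.show(2, page_count=10)
second = cache.show(3, page_count=10)
```

In this run, moving from page 2 to page 3 triggers only one new render (page 4), because page 3 was already rendered while page 2 was on screen.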
The PDFTron SDK supports each step of the PDF rendering pipeline—from initial PDF file parsing, to final rasterization in the viewport as an image. Along the way, it provides several classes, giving developers the option to build a custom viewing application atop the PDFTron Core or to control the user experience within the pre-built viewer. The customizable, out-of-box UI is also available on
Interactive rendering is handled by the
Many aspects distinguish a good rendering engine from a poor one. And there are several ways to get an engine, from using open source to leveraging a proprietary PDF SDK rendering library.
In summary: PDF viewing is when rendered images are actually displayed in your viewer UI, typically with controls so you can pan, scroll, and zoom into content.
Rendering is extracting, parsing, and converting binary PDF file data into images you can see in the viewer viewport.
The rendering engine and the viewer layer work together: the viewer tells the engine which images it needs to display in the viewport, and the rendering engine creates those images dynamically.
Want to learn even more about PDF rendering and viewing technology? Check out:
Much like the PDF specification itself, our powerful in-house rendering engine came into being several decades ago and has since seen constant enhancements and refinements. We’d love to hear about any PDF rendering and viewing challenges you’ve experienced or ideas that may deliver yet more benefits for PDF-centric workflows. Feel free to