Open Source or Proprietary — What PDF Viewer Engine is Right for My Application?

By Adam Pez | 2022 Oct 31

7 min

Defining PDF Rendering Library Options


A PDF rendering engine or “core” is the foundational piece in your viewer application architecture; it makes PDF files accessible for viewing and manipulation in an interactive workflow. Choosing the right one for your needs is therefore key, because the core serves as the cornerstone of everything an application does.

This guide discusses a few different PDF engines: some offered by commercial vendors, others available as free downloads. We’ll use Wikipedia’s definitions to establish some basic terminology.

Open source is source code made available to everybody to modify, enhance, and distribute.

Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized software development model that encourages open collaboration.
—
Wikipedia

Proprietary software, on the other hand, is closed-source code that only the owning organization, person, or team can inspect, edit, or change.

Proprietary software, also known as non-free software or closed-source software, is computer software for which the software's publisher or another person reserves some licensing rights to use, modify, share modifications, or share the software, restricting user freedom with the software they lease. It is the opposite of open-source or free software. Non-free software sometimes includes patent rights.
—
Wikipedia

With definitions out of the way, let’s create some library categories to consider:

Unmodified Open Source
Modified Open Source
Full Proprietary Engine

1. Unmodified Open Source — For Everyday PDF Reading


When it comes to your PDF technology needs, open source can serve as a practical and proven starting point.

A popular option is PDFium, Google’s open-source C++ engine derived from Foxit’s PDF technology. PDFium is used today in Chrome and Microsoft Edge; SaaS vendors such as Dropbox use it as a server-side renderer for PDF previews.

Apryse's Director of Product, Andrey, compares JavaScript PDF viewer libraries, with samples on GitHub

PDF.js is another popular open-source option. Created by Mozilla and now maintained by its community, PDF.js loads, renders and displays PDFs using JavaScript. It is used by many companies, especially startups, to add interactive PDF viewing to a web application or website.

Read about LinkedIn Learning's publishing flow, which uses PDF.js for viewing

Benefits of Unmodified Open-Source Libraries

The benefits of these two engines are not hard to see: both PDFium and PDF.js are distributed under permissive licenses (a BSD-style license for PDFium, Apache 2.0 for PDF.js), so they make PDF rendering freely accessible to developers.

Using these libraries directly from the repo, organizations also gain the benefits of open source:

Community feedback for enhancing and improving the code base for PDF rendering and viewing
Community resources for developing and testing PDF rendering and viewing

When is Unmodified Open Source Not Enough?

Unmodified open source is a great entry-level option. But when your users start requesting more features or improved rendering performance, you’ll need to consider other options that offer more than just the ability to read PDFs.

Consider other options when you need:

Additional file formats such as MS Office Excel, PowerPoint, and Word
Additional platforms (e.g., Android, iOS, and Windows devices)
Professional workflow capabilities, like annotations, comparison, template generation, editing, signing, redaction, and so on
Ability to control the UX, including the ability of users to download their PDFs

A Note on Bad PDFs and User Experience

One drawback of relying on unmodified open source is that you are left to handle bad PDFs yourself, and these introduce problems that interrupt user productivity.

[Parsing & extracting] is relatively straightforward until you get to bad PDFs. There are a lot of bad PDFs out there that don't follow the specification. A lot of our code is going back to handle these strange cases.
—
Former Mozilla and PDF.js Developer Brendan Dahl

PDFs are an incredibly complex file format; this is especially so given that a PDF can be generated a hundred different ways, all of which a renderer needs to handle gracefully.
—
LinkedIn Learning developer working with PDF.js

PDFs may be malformed, corrupted, or memory intensive — especially on mobile devices and in a web browser. A few problems include:

Incorrect fonts and rotations
Imprecise vector lines
Color errors in branded materials
Text selection and highlight accuracy errors
Performance problems when scrolling, panning, or zooming on a page
Crashes or freezes due to memory-intensive files
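
To make "bad PDFs" more concrete, here is a minimal sketch of a structural triage check. It assumes only the basic file layout (a `%PDF-x.y` header at the start, and a `startxref` offset followed by `%%EOF` at the end). The function name `triage_pdf` and the 1024-byte search windows are choices made for this example; a production engine checks far more than this and also has to recover from what it finds.

```python
import re

HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def triage_pdf(data: bytes) -> list[str]:
    """Return structural warnings; an empty list means no obvious damage."""
    warnings = []

    # A conforming file starts with "%PDF-x.y"; real-world files
    # sometimes carry stray bytes in front of the header.
    match = HEADER_RE.search(data[:1024])
    if match is None:
        warnings.append("no %PDF- header in the first 1024 bytes")
    elif match.start() != 0:
        warnings.append(f"{match.start()} junk bytes before the header")

    # The file should end with a "startxref" offset followed by "%%EOF".
    tail = data[-1024:]
    if b"%%EOF" not in tail:
        warnings.append("missing %%EOF marker (file may be truncated)")
    if b"startxref" not in tail:
        warnings.append("missing startxref (cross-reference data would need rebuilding)")

    return warnings


if __name__ == "__main__":
    truncated = b"%PDF-1.4\n1 0 obj\n<< >>\n"
    for warning in triage_pdf(truncated):
        print("warning:", warning)
```

A check like this only flags a file as suspicious. Rendering it anyway, as users expect, is where most of the engineering effort described in the quotes above goes.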

Customers in high-pressure industries need a professional, commercial rendering engine that will sidestep these issues, and they need engineers to make fixes quickly when issues arise. In contrast, open-source projects come with no guaranteed response times, so you can wait months for fixes and years for requested features.

2. Modified Open Source: Should You Commit — or Fork?


To mature their solutions, organizations can choose to continue to work with open source. You can modify open source, adding features, fixing bugs, and tuning performance.

There are two broad pathways to modify the engine source code.

Commit to the original open-source project and wait for your changes to be approved.
Fork the project and take ownership.

Each path has its perks — and tradeoffs. For commercial software developers, both present a Catch-22.

On the one hand, contributing to the community ensures you continue to benefit from community feedback and testing on what you add to the engine. But as a business, you’d be giving away competitive advantage if you invest in PDF specialization.

The other option is to fork the library and add your additions privately, or as part of a new community. Both PDF.js and PDFium, distributed under permissive licenses, allow this.

The Challenge with Forking

After forking, you are expected to rename your library, and the fork itself fragments the community.

There is a strong social pressure against forking projects. It does not happen except under plea of dire necessity, with much public self-justification, and requires re-naming.
—
Eric Raymond, “Homesteading the Noosphere,” in “The Cathedral & the Bazaar”

Forking weakens the value proposition of open source:

Improvements you make no longer benefit from the security and testing provided by the browser vendors and community — you get fewer eyes on your code for feedback and testing.
It adds technical debt if you continue to draw from the original repo, since merging from upstream is more complicated.
Your changes might break when the community updates.
Changes might introduce rendering regressions or other bugs that will weaken the solution’s stability.

3. A Proprietary PDF SDK Engine


Which brings us to our third and final category — a proprietary PDF SDK engine. This is a huge investment when built from the ground up, as it takes years of development and continuous spending on testing and audits to ensure security and UX. Be prepared for the sticker price of such an engine to reflect the significant development investment.

A Proprietary PDF SDK Engine Means No Shortcuts

A developer team can build up document format expertise, which they can then pass on to clients in the form of:

Rendering performance on the most demanding documents, including highly technical, vector-based documents, on all platforms, including mobile and web.
Accuracy when dealing with complicated color models, including CMYK in RGB-based browsers.
Configurability and control – anything from specifying color transforms, specific font substitution behavior, caching, and more.
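
The color accuracy point can be illustrated with the simplest possible CMYK-to-RGB conversion. The sketch below uses the well-known naive device formula, which ignores ICC color profiles entirely. It is shown here as an example of the shortcut, not as how any particular engine (including ours) converts color, and the function name is made up for this example.

```python
def cmyk_to_rgb_naive(c: float, m: float, y: float, k: float) -> tuple[int, int, int]:
    """Naive device conversion; inputs are ink fractions in the range 0..1."""
    r = round(255 * (1 - c) * (1 - k))
    g = round(255 * (1 - m) * (1 - k))
    b = round(255 * (1 - y) * (1 - k))
    return (r, g, b)


if __name__ == "__main__":
    # 100% cyan ink maps to fully saturated screen cyan.
    print(cmyk_to_rgb_naive(1.0, 0.0, 0.0, 0.0))
```

With this formula, 100% cyan ink becomes the fully saturated screen color (0, 255, 255), which is noticeably brighter than cyan ink on paper. Closing that gap requires profile-aware color management, which is one reason branded materials can look wrong in a viewer that takes the shortcut.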

Meeting Your Most Advanced Document Processing Needs

In addition, a proprietary PDF SDK can support your most advanced document processing needs, scaling as your needs grow. For example, with the Apryse SDK, we’re able to provide customers:

The ability to dynamically load a variety of documents, including PDF, MS Office files, and images in a web or mobile app viewer, no conversion servers required
Powerful client-side rendering and viewing for your most demanding PDFs, including huge vector drawings sourced from desktop CAD programs
Support for all mobile platforms (Android, iOS, Windows), JS frameworks, and cross-platform frameworks (React Native, Flutter, and Xamarin) to streamline development
Advanced document processing right in a web or mobile app client, for extra security (true PDF redaction, true PDF editing, and much, much more)

The Bottom Line - Which is Best for Your Project?


Throughout this guide, you've probably been asking: which is better for me, a proprietary or open-source engine? It’s a tough question, as no one size fits all. Whether you go with a proprietary PDF SDK engine or open-source library depends on the magnitude and longevity of your project, as well as your specific requirements.

At Apryse, we deal with the full spectrum of document management professionals. Our products include iText, an open-source PDF processing library that we distribute under a dual-license arrangement (AGPLv3 or a commercial license), and a proprietary PDF SDK. We see the benefits of open source in our own iText product. We also understand when customers need to reap all the benefits of a proprietary platform: code base stability and responsive support, developer experience, and cutting-edge features.

We (and thousands of customers) are huge fans and advocates of our own cross-platform PDF SDK, built from the ground up and refined over the last 20+ years. I could go on and rattle off a brag sheet of its feature specs and logos. However, results speak for themselves, and our customers also speak for us.

Visit our WebViewer showcase to see what a world-class PDF SDK can do for your project.

And when you’re done, please drop us a line and we'd be happy to discuss your project and document technology needs.