Guide to Evaluating PDF.js

By Adam Pez | 2019 Aug 09

Update (2023-01-31): Some elements of this guide are no longer up to date. For example, the pdf.js community and Mozilla have since added Form Filling functionality and Digital Signature display. We've made a few additions to this guide to make recent changes easy for you to see.

If you view a PDF embedded in a web page, you will likely use the browser’s built-in PDF reader. On Firefox, that’s PDF.js, a popular open-source JavaScript library published by Mozilla in 2011.

PDF.js uses pure client-side JavaScript to render PDF file content into an HTML5 <canvas> element. Developers, therefore, appreciate PDF.js for the simplicity of its dependencies as well as how basic UI elements can be easily restyled via the project CSS and HTML files.

Devs will often embed PDF.js to enable web viewing capabilities in their apps.

However, as a vendor of a commercial PDF SDK, we hear from customers who come to us seeking an alternative after implementing PDF.js and later discovering that it cannot meet their needs. To help you avoid the same mistakes, here are some tips and perspectives sourced from real-world PDF.js deployments to inform your own evaluation.

We wanted to keep our analysis as objective as possible, so we did our research. What the facts show is that PDF.js works well in some situations and may not be ideal in others.

Criteria

When evaluating PDF.js, here are a few criteria you may wish to consider:

  • Out-of-the-box features
  • Complexity of adding and supporting features
  • Supported browsers
  • Text select and text search UX
  • Security
  • Support
  • Project status and trajectory
  • Accuracy, reliability, and speed

Read the rest of this article for detailed answers — or skip to the end to see our findings on when and where PDF.js works best, based on the experiences of our customers.

Functionality and Features

Out-of-box Functionality

What you get out of the box with PDF.js are the following three layers to allow for basic PDF rendering and viewing:

  • A core layer to interpret binary PDF content via an HTML5 Web Worker. (Engaging with this layer is considered advanced usage as it assumes an advanced understanding of the PDF spec.)
  • A display interface to render a PDF page into an HTML <canvas> element and extract page information.
  • A ready-to-use PDF viewer that supports basic features like search, rotate, print, page thumbnails, and so on.
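
To make the display layer more concrete, here is a minimal TypeScript sketch of rendering a single page into a canvas. The `PdfJsLike`, `DocumentLike`, `PageLike`, and `CanvasLike` interfaces are our own hypothetical, stripped-down stand-ins for the parts of the PDF.js display API the sketch touches (`getDocument`, `getPage`, `getViewport`, `render`). In a real project you would use the library's own types instead, and details may differ between PDF.js versions.

```typescript
// Hypothetical minimal shapes mirroring the subset of the PDF.js display API used below.
interface ViewportLike { width: number; height: number; }
interface PageLike {
  getViewport(params: { scale: number }): ViewportLike;
  render(params: { canvasContext: unknown; viewport: ViewportLike }): { promise: Promise<void> };
}
interface DocumentLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}
interface PdfJsLike {
  getDocument(src: string): { promise: Promise<DocumentLike> };
}
interface CanvasLike {
  width: number;
  height: number;
  getContext(kind: "2d"): unknown;
}

// Load a document, fetch one page (1-based), and paint it onto the canvas.
async function renderPage(
  pdfjs: PdfJsLike,
  url: string,
  pageNumber: number,
  canvas: CanvasLike,
  scale = 1.5,
): Promise<ViewportLike> {
  const doc = await pdfjs.getDocument(url).promise;
  if (pageNumber < 1 || pageNumber > doc.numPages) {
    throw new RangeError(`Page ${pageNumber} is outside 1..${doc.numPages}`);
  }
  const page = await doc.getPage(pageNumber);
  const viewport = page.getViewport({ scale });
  // Size the canvas to the viewport before rendering.
  canvas.width = Math.floor(viewport.width);
  canvas.height = Math.floor(viewport.height);
  await page.render({ canvasContext: canvas.getContext("2d"), viewport }).promise;
  return viewport;
}
```

With the real library, `pdfjs` would be the imported PDF.js module (with its worker script configured) and `canvas` an actual `<canvas>` element. The ready-to-use viewer wraps this same flow with toolbar, thumbnail, and search UI.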

A PDF Reader Only

One crucial thing to bear in mind is that PDF.js was designed as a PDF reader only; therefore, it does not support features that require editing of PDFs such as direct annotation, page manipulation, and redaction — as emphasized in the FAQ:

Quote from pdf.js community contributor

And on GitHub:

Quote from pdf.js community contributor

Out of Scope and Unsupported Features

As a result, some features are categorized as outside the scope of a PDF reader. These are not given as much attention by the main project contributors. Other features are unsupported because the open-source community is still working on them.

Unsupported Features and Reason

Form Filling — Open issue for 3 years | Update: Now supported

Direct annotation (Add, edit, and remove) — Out of Scope

Page manipulation (Add, merge, and remove) — Out of Scope

Signatures — Open for 7 years | Update: Now display

Toggleable visual layers (OCGs) — Open for 8 years | Update: API method added

Night mode — Open for 7 years | Update: API method added

Advanced Rendering Features

Overprint Support — Open 8 years | Update: Marked as “won’t fix”

Canvas tiling — Open 6 years

Color profile management — Open 6 years

Browser Behavior and Support

PDF.js has the advantage (and disadvantage) of relying on the browser for rendering.

This means PDF.js can initialize very quickly compared to many other PDF web viewers which must download and initialize their entire PDF rendering package.

However, because PDF.js relies on the browser, results may vary according to the browser, as each may handle specific fonts, images, and graphics differently. PDF.js also relies on HTML “extensions” to support some missing graphics features — with uneven adoption across different browsers.

We’ve found that PDF.js performs best on Mozilla Firefox (naturally), less well on Safari and Chrome, and weakest on Microsoft Edge and Internet Explorer. Many older browsers are not supported.

PDF.js Supported Browsers

Browser — Supported — Automated Testing

Firefox — Yes — Windows/Linux

Chrome — Yes — Windows/Linux

Opera — Yes — None

IE 11/Edge — Mostly — None

Safari 9+ — Mostly — None

IE 10 and below — No — None

Safari 8 and below — No — None

Android 4 and below — No — None

Data from the PDF.js FAQ.

Therefore, PDF.js will prove satisfactory if your users work on the latest versions of Chrome and Firefox, and less satisfactory should your users work in IE 11 or older browsers, or on older mobile devices.
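
If you want to act on the support table in code, for example to show a download link instead of the embedded viewer, the table can be encoded as a small lookup. The sketch below is illustrative only: `BrowserInfo`, `SupportLevel`, and `pdfJsSupport` are hypothetical names of ours, and how you detect the browser and version is deliberately left out. Anything the table does not cover is reported as `"unknown"` rather than guessed.

```typescript
type SupportLevel = "yes" | "mostly" | "no" | "unknown";

interface BrowserInfo {
  name: "firefox" | "chrome" | "opera" | "ie" | "edge" | "safari" | "android";
  majorVersion: number;
}

// Encodes the support table above. Obtaining `BrowserInfo`
// (user-agent parsing, feature detection) is outside this sketch.
function pdfJsSupport(browser: BrowserInfo): SupportLevel {
  switch (browser.name) {
    case "firefox":
    case "chrome":
    case "opera":
      return "yes";
    case "edge":
      return "mostly";
    case "ie":
      return browser.majorVersion >= 11 ? "mostly" : "no";
    case "safari":
      return browser.majorVersion >= 9 ? "mostly" : "no";
    case "android":
      // The table only says Android 4 and below is unsupported.
      return browser.majorVersion <= 4 ? "no" : "unknown";
  }
}

// Offer a fallback (such as a plain download link) unless support is full.
function needsFallback(level: SupportLevel): boolean {
  return level !== "yes";
}
```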

Supported File Types

PDF.js only opens PDF files. Any other format, such as MS Office documents, .txt, and images, will all have to be converted to PDF using another tool.
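
Because of this, it can help to check that an uploaded file really is a PDF before handing it to the viewer. A minimal sketch, assuming you already have the file's first bytes in a `Uint8Array`: PDF files start with the marker `%PDF-`, so a quick header check catches most mislabeled files. The function name `looksLikePdf` is ours, and checking only at offset 0 is a simplification.

```typescript
// Returns true when the buffer starts with the PDF header marker "%PDF-".
function looksLikePdf(bytes: Uint8Array): boolean {
  const marker = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  if (bytes.length < marker.length) return false;
  return marker.every((value, index) => bytes[index] === value);
}
```

Files that fail the check would need to go through whatever conversion tool you choose before viewing.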

Security

PDF.js has a reputation as a secure, sandboxed environment (within the iframe) and as a suitable replacement for old-fashioned web PDF readers that relied upon security-challenged plugin technology.

PDF.js has had a few high-profile exploits over the years. Mozilla patched each one very quickly, as they do for any threat to Firefox. For example:

  • October 2013: Mozilla issued a “high” impact security advisory for a bypass of PDF.js checks using the iframe. This exploit could have been used to gather info about local files via no more than normal browsing actions.
  • August 2015: Mozilla issued a “critical” advisory, the highest such advisory level, after an ad on a Russian news site was discovered to exploit a same-origin policy violation to try and steal local files.
  • May 2018: Mozilla fixed two Firefox security vulnerabilities involving PDF.js. Each of the two issues warranted a “high” impact advisory.

As with all open-source projects, especially popular ones, there’s always a chance of vulnerabilities that expose you or your users to attacks. At this point, however, PDF.js does not seem to have any more or fewer vulnerabilities than other open-source projects.

Text Select, Text Search, and Text Extract

PDF.js text select, text search, and copy/paste features rely on the underlying text parsing and extraction engine, which defines the text overlay and relies on the browser’s built-in text features. PDF.js text select has 90+ open issues on GitHub today — more than any other issue category.

Normal PDF.js text selection can therefore prove unreliable. Some PDFs, for example, do not include correct text bounding boxes, and PDF.js is unable to correct for this. You may thus encounter documents where PDF.js selection jumps and misses sections or where spaces go missing when text is copy-pasted. Other times, double spaces are inserted.

Text selection issues in pdf.js viewer

These text issues can be fixed only by modifying the underlying text parsing and rendering engine, and they contribute to unreliable PDF.js text search as well. Indeed, PDF.js search may miss words and phrases, especially when these span multiple lines or where text includes extra white spaces between words.
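
As a rough illustration of the whitespace problem, the sketch below shows one way a custom search could tolerate doubled spaces and line breaks: collapse every run of whitespace in both the extracted text and the query before matching. It takes plain strings as input (for example, the text fragments you have extracted from a page) and is not how the built-in PDF.js search works. `normalizeWhitespace` and `findPhrase` are hypothetical helper names, and joining fragments with a single space is an assumption.

```typescript
// Collapse any run of whitespace (spaces, tabs, newlines) into one space.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Case-insensitive phrase search across text fragments from one page.
// Returns the start indices of matches in the normalized page text.
function findPhrase(fragments: string[], phrase: string): number[] {
  const haystack = normalizeWhitespace(fragments.join(" ")).toLowerCase();
  const needle = normalizeWhitespace(phrase).toLowerCase();
  const hits: number[] = [];
  if (needle.length === 0) return hits;
  let from = haystack.indexOf(needle);
  while (from !== -1) {
    hits.push(from);
    from = haystack.indexOf(needle, from + 1);
  }
  return hits;
}
```

Note that this only helps with extra or line-breaking whitespace. It cannot recover spaces that are missing from the extracted text, which still requires changes in the text extraction itself.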

Due to the unreliable text extraction engine, enterprises may find it challenging to be compliant with any accessibility standard, like 508/ADA. Furthermore:

Quote

PDF.js is not accessible out of the box because it does not carry over the document’s tagged markup to the DOM; PDF.js only uses <spans> and <divs> to render text.
– Developer LinkedIn

You may have to rebuild the text overlay, as LinkedIn did for LinkedIn Learning. This requires understanding the PDF.js code base, which is itself complex and assumes familiarity with the PDF specification.

Quote

PDFs are an incredibly complex file format; this is especially so given that a PDF can be generated a hundred different ways, all of which a renderer needs to handle gracefully. We dug deep into the Adobe manual for PDF specifications and engineered our way to surfacing tagged documents as semantic HTML in the DOM.
– Developer LinkedIn

Lastly, PDF.js text search supports basic search features, such as highlighting searched words and matching for case — sufficient for most users. However, PDF.js lacks several advanced search features. For example, PDF.js does not yet support searching for multiple terms/phrases at the same time, and adding these features yourself may prove challenging.
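
To give a sense of what adding multi-term search involves at its simplest, here is a sketch that works on per-page text you have already extracted. It only reports which pages contain each term. Highlighting matches in the viewer, which is the harder part, is not covered. The function name and input shape are our own assumptions.

```typescript
// Maps each search term to the 1-based page numbers whose text contains it.
function searchTerms(pageTexts: string[], terms: string[]): Record<string, number[]> {
  const results: Record<string, number[]> = {};
  const lowered = pageTexts.map((text) => text.toLowerCase());
  for (const term of terms) {
    const needle = term.trim().toLowerCase();
    if (needle.length === 0) continue; // skip blank terms
    const pages: number[] = [];
    lowered.forEach((text, index) => {
      if (text.indexOf(needle) !== -1) pages.push(index + 1);
    });
    results[term] = pages;
  }
  return results;
}
```

For example, `searchTerms(["Invoice total due", "Payment terms and total"], ["total", "invoice"])` maps `total` to pages 1 and 2 and `invoice` to page 1.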

Therefore, if accurate text search, text select, and/or text extraction are important to your use case, or if you wish to implement advanced search, then PDF.js may not be the best fit.

Printing with PDF.js

PDF.js may be unsuited for the purpose of accurate, high quality printing. There are currently 35 open issues on GitHub related to printing. Problems commonly arise from the core PDF rendering engine — as noted by an important project contributor:

Core community contributor comment on PDF.js printing

Since the engine has to render canvases at a smaller size for viewing and printing, the result may be blurry output that can make it difficult to read hard-copy text. Moreover, our customers have reported that the experimental SVG backend has several rendering issues that may cause incorrect printing:

Quote

The problem with the [PDF-to-SVG] was that it simply had too many errors and the students started complaining about error-prone graphs, formula errors, and other weird behaviors.
– Developer, e-learning platform
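
To see why canvas size matters for print sharpness, it helps to work out how many pixels a page needs at a given print resolution. PDF page dimensions are expressed in points (1/72 inch by default), so the required scale is the target DPI divided by 72. The sketch below is a back-of-the-envelope calculation, not PDF.js code, and `canvasSizeForDpi` is a hypothetical helper name.

```typescript
const POINTS_PER_INCH = 72; // default PDF user-space unit is 1/72 inch

interface PixelSize { scale: number; width: number; height: number; }

// Canvas size needed to rasterize a page of the given size (in points) at `dpi`.
function canvasSizeForDpi(widthPt: number, heightPt: number, dpi: number): PixelSize {
  if (dpi <= 0) throw new RangeError("dpi must be positive");
  return {
    scale: dpi / POINTS_PER_INCH,
    // Multiply before dividing to limit floating-point drift.
    width: Math.round((widthPt * dpi) / POINTS_PER_INCH),
    height: Math.round((heightPt * dpi) / POINTS_PER_INCH),
  };
}
```

A US Letter page (612 x 792 points) at 300 DPI works out to 2550 x 3300 pixels, roughly 8.4 million pixels per page, which is why renderers that cap canvas size for memory reasons can produce soft hard-copy output.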

Additionally, color fidelity cannot be assured in printed material with either the SVG or canvas backends. PDF.js does not support color management functionality such as ink/color separations, CMYK colors, and ICC color profiles (an open issue for six years).

In summary, if your users share PDFs as part of a pre-press review process, or if they otherwise require clear and accurate printing, then you may wish to consider a commercial PDF SDK.

Adding More Features

For some customers, basic PDF rendering and viewing may be all they need, and PDF.js will prove excellent for this use case.

Others, however, may want to add further capabilities such as annotations, form filling, watermarking, merging documents, signatures, and redaction. Adding any of these features may prove difficult, as some of our customers testify:

Quote

Currently I'm evaluating possible solutions to replace PDF.js in a DMS application. We would like to move away from PDF.js because it’s limited in its functionality and we need some advanced stuff like annotations which can’t be easily done with it.
– Senior Developer, DMS software

Quote

PDF.js is great for getting a proof of concept out there. It does 95% of the things we want, but that 5% is crucial to us.
– Developer, legal software

We surveyed 57 unique organizations who came to us recently seeking a commercial PDF SDK after trying PDF.js and discovering it would not meet their needs.

Notably, 42 (73.7%) of respondents cited a need for more functionality as a reason for seeking a commercial solution. Of those 42 organizations, almost three-quarters (71.4%) tried to implement that functionality themselves first and found it too difficult or time-intensive. The other 28.6% are “unknown.” (They may have tried, or they may not have.)

Ultimately, PDF.js may not prove time-efficient when you want to add features such as annotations. In some cases, such as where you want to add editing, you may wish to consider an alternative.

Maintaining Custom Features

Another important consideration is that once custom PDF.js features are built, they will have to be supported and maintained.

With over 6,000 forks and almost 27,000 GitHub “stars,” PDF.js is still popular with the open-source community. Commits happen on average several times a week, and these changes are not necessarily performed with your project in mind:

PDF.js support issue

Our customers have also told us of PDF.js patches that led to undesired rendering behavior or that removed certain features, breaking their customizations. Customers had to dedicate additional staff to monitoring and testing changes. This made it harder to implement changes later on and reduced their capacity to build additional features.

(For a deeper dive, see our article on attempted PDF.js projects, including our customer experiences.)

Support

Overall, PDF.js does a really good job with support; some contributors are very active, and response times can be lightning-fast, with one- or two-day responses in many cases, particularly for simple issues.

That being said, issues related to features, reliability, performance, and accuracy see longer response times.

For example, the PDF.js forum currently has around 70 open feature requests, including often-requested features such as interactive and fillable forms, access to OCG layers, pinch zoom for mobile, and digital signatures.

These open feature issues have an average age of five years, with 84% created before the end of 2016. And most lack a clear resolution timeline — a source of some dev frustration.

We’ve also measured a consistent increase in the total number of unresolved PDF.js support issues:

PDF.js open issues over time

Additionally, we found a gradual slow-down in the issue resolution rate, where we looked at the month each issue was created and measured how many of those issues from each month have since been closed.

PDF.js issue resolution rate over time
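
For readers who want to reproduce this kind of measurement on their own issue export, the calculation described above can be sketched as follows. This is not the script we used. The `Issue` shape is a hypothetical minimal record, and dates are assumed to be ISO 8601 strings so that the first seven characters give the creation month.

```typescript
interface Issue {
  createdAt: string; // ISO 8601 date, e.g. "2016-03-14"
  closed: boolean;   // whether the issue has since been closed
}

// For each creation month ("YYYY-MM"), the share of issues that are now closed.
function resolutionRateByMonth(issues: Issue[]): Record<string, number> {
  const totals: Record<string, { created: number; closed: number }> = {};
  for (const issue of issues) {
    const month = issue.createdAt.slice(0, 7);
    if (!totals[month]) totals[month] = { created: 0, closed: 0 };
    totals[month].created += 1;
    if (issue.closed) totals[month].closed += 1;
  }
  const rates: Record<string, number> = {};
  for (const month of Object.keys(totals)) {
    rates[month] = totals[month].closed / totals[month].created;
  }
  return rates;
}
```

A falling rate for more recent months is expected to some degree, since newer issues have had less time to be resolved, so the trend is best read alongside the open-issue count.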

Because PDF.js is licensed under Apache 2.0, there is also zero liability or warranty for any defects. Rendering errors, for example, are ultimately your responsibility. And since PDF.js is an open source project, the level of support you will receive cannot be compared fairly to that of a commercial solution.

Project Status and Trajectory

Given the pace of PDF.js feature updates, what the future holds for the project is an open question.

Several signs point to Mozilla having lost interest in PDF.js and moving on. Their contributions seem to have diminished over time, as both of PDF.js’s primary proponents at Mozilla (Andreas Gal and Chris Jones) have since left. Meanwhile, PDF.js’s primary host, Mozilla Labs, closed down in 2014.

In 2016, Mozilla even considered replacing PDF.js as its built-in Firefox viewer in part to reduce the support burden. It later dropped the project, citing implementation and maintenance costs.

Mozilla devs have also been reasonably candid about the fact that they may see PDF.js as a distraction from developing important web technologies like CSS, HTML, and JavaScript. As one developer put it in a 2016 Mozilla dev planning thread:

Quote

...PDFs are not a fundamental part of the web. Therefore, Mozilla would like to not have to spend a lot of effort supporting them. ...despite a lot of effort [PDF.js] still has two significant shortcomings: it is a little on the slow side, and it doesn't support all PDF functionality, including form-filling. Fixing those shortcomings would be a lot of work. (PDF is a *huge* format.)
– Developer, Mozilla

If you are looking for a stable, long-term solution, PDF.js poses some uncertainty.

Update: Mozilla added Form Filling functionality in Oct 2021.

Performance, Reliability, and Accuracy

Performance

PDF.js has proven excellent for viewing small and simple PDFs such as many sales reports, invoices, and contracts. However, its performance is often less than optimal with larger and more complicated PDFs, such as many construction and engineering drawings, maps, large textbooks, and other designs.

Quote

We are developing a document viewer app that provides a secure container and syncs the documents for offline reading. We evaluated PDF.js, but the UX was not the best.
– Senior UX Consultant, Fortune 50 software company

We’ve documented many of these performance issues as well, especially on mobile. For instance, PDF.js may have difficulty with documents above 100 megabytes:

Quote

We also tried PDF.js to render pdf using a blob object. It is working on iPad and iPhones with a few limitations like it is not able to open PDFs bigger than 100MB, and it doesn’t support pinch zoom.
– Developer, Fortune 50 company

The SVG backend may also lead to slow performance in some cases:

Quote

...we have a custom PDF viewer, which decrypts the PDFs on the client side and renders them as SVGs using Mozilla’s PDF.js library. But the library is slow, inefficient, and requires the client to handle the rendering.
– Developer, eLearning software

Customers have also complained about slow performance in general:

Quote

Customers are complaining about performance (mainly time to first page render). We want to have the same experience across all platforms for our two main use cases...
– Solution Architect, life sciences software

Quote

While the document viewer works well and provides zoom, pan, annotation, outline and thumbnail navigation, it is slow since it requires the entire document to be downloaded before it can be viewed. We are looking for something better.
– Technical Director, document management software

Reliability

We’ve also found that about 1-3% of geospatial, life sciences, and CAD-based PDFs will crash or freeze the browser with PDF.js. Some subsets of CAD-based PDFs, such as those we fetched from the open-source repository GrabCAD.com, crashed the browser or failed to open in PDF.js at a rate of 10% of documents.

Some of our customers have also complained about PDF.js reliability issues:

Quote

We are using PDF.js now as an embedded viewer for PDF documents in a single page application, and we are having some issues with crashing browsers and suspect issues with the viewer.
– CTO, training & compliance software

Quote

At present, we’re working with open-source PDF.js which is great for the 95% of PDFs, but the other 5% is critical. Larger PDFs are tricky.
– Co-founder, eDiscovery software

Of course, you could optimize and shrink documents for PDF.js to manage both performance and reliability issues. This assumes, however, that you can control documents before display. If you are unable to control documents, then you may be unable to control the user experience.

Accuracy

For 99% of PDFs, particularly simple files such as PDF invoices and sales reports, PDF.js will render content accurately. However, within the subset of more demanding documents in enterprise and organizational workflows, you may encounter difficulties.

For example, PDF.js has a few rendering inconsistencies with these documents, reported by our customers:

Quote

We are currently using [PDF.js] to view construction plans related to a project being bid on. We have a small percentage of plans that don’t render correctly. In these cases we have a work around for the user to download the plan to Acrobat.
– VP, software consulting firm

Quote

We have about 1000 paid users now. PDF.js has some problems: 1) Some weird formatting, such as with really old PDFs in a school database. 2) When the PDF is huge or full of images, for example, textbooks, it loads really slow. Also, it consumes a lot of RAM.
– Developer, eLearning software

We captured many instances of these rendering issues four years ago. The PDF.js contributor community has since fixed several rendering issues, including a few we pointed out.

However, PDF.js still faces rendering problems, in part, because PDF.js still does not implement the full PDF spec, including support for specific PDF patterns, transparencies, and other advanced rendering features. Issues also occur as PDF.js relies heavily on the local device and the browser for rendering, as well as on custom HTML extensions to patch missing features — with uneven adoption across browsers other than Firefox. This leads to inconsistent behavior across platforms.

Some of our customers also report image quality problems, including blurriness at zoom factors of 100% or more, especially on intricate designs and maps.

Quote

Our drawback with PDF.js is the loss of quality on some large plans on 100% zoom level and beyond. This loss of quality can sometimes block the user’s ability to make correct measurements in the file.
– Developer, 3D mapping software

Drawing courtesy of ELEMENTAL (some rights reserved). Download it here if you want to try for yourself in the PDF.js demo viewer or another viewer.

These image quality issues can make it very difficult to read small text and perform accurate measurements.
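
One common source of soft output on high-density screens is a canvas whose backing store has fewer pixels than the screen area it covers. A widely used mitigation is to size the canvas backing store by the device pixel ratio while keeping its CSS size unchanged, then pass a matching transform when rendering. The sketch below computes those values. `hiDpiLayout` is a hypothetical helper of ours, and this approach will not resolve every quality issue described above, particularly on very large drawings where canvas size limits come into play.

```typescript
interface Size { width: number; height: number; }

interface CanvasLayout {
  backing: Size;              // canvas.width / canvas.height, in device pixels
  css: Size;                  // canvas.style width / height, in CSS pixels
  transform: number[] | null; // scale matrix to pass to the render call, if needed
}

// Compute canvas dimensions for a viewport on a screen with the given pixel ratio.
function hiDpiLayout(viewport: Size, devicePixelRatio: number): CanvasLayout {
  const outputScale = devicePixelRatio > 0 ? devicePixelRatio : 1;
  return {
    backing: {
      width: Math.floor(viewport.width * outputScale),
      height: Math.floor(viewport.height * outputScale),
    },
    css: {
      width: Math.floor(viewport.width),
      height: Math.floor(viewport.height),
    },
    transform: outputScale !== 1 ? [outputScale, 0, 0, outputScale, 0, 0] : null,
  };
}
```

In a browser, the second argument would typically be `window.devicePixelRatio`.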

In summary, for viewing small and simple PDFs on modern browsers, PDF.js is excellent.

However, if you require consistently high performance and reliability with complex documents and near-flawless rendering overall, or if your users work within older browsers, then you may want to consider a commercial PDF SDK.

(Learn more about PDF.js rendering accuracy and possible implications for your project in our detailed guide.)

Why Organizations Switch from PDF.js

We surveyed 57 unique organizations who came to us after finding PDF.js could not meet their needs. Many of these organizations consisted of OEMs and enterprises, working within design agency settings and in industries such as construction and engineering, publishing, finance, education, legal, and life sciences.

Reasons why pdf.js was no longer good enough

Conclusion: When to Use PDF.js

A PDF.js-based project will tend to go smoothly under the following circumstances:

  • Your feature requirements are basic
  • Your PDFs are small and simple (e.g., invoices, sales reports, etc.)
  • The project is a short-term solution
  • Users are willing to tolerate some rendering, reliability, and performance issues
  • You have control over documents before viewing. (Users are not opening arbitrary files.)

In contrast, the following may make your project much more complicated or require you to consider a commercial solution:

  • The web viewer will be heavily relied upon in an organizational setting or commercial product
  • Feature requirements are more advanced and may include editing
  • UX needs to be a competitive differentiator
  • Users demand rendering accuracy and precision
  • Documents are often large and complex
  • You cannot optimize documents for PDF.js before viewing

We also wanted to give you an overview of supported PDF.js capabilities to help you decide whether PDF.js is the right match for your unique document and user requirements.

In what follows, checkmarked features and capabilities are available with PDF.js out of the box, whereas other features are not yet part of the main PDF.js project. Some of our customers have tried building several of these features themselves or by using other projects — with varying levels of success.

PDF.js Supported and Unsupported Features

General

Edit PDFs ✖
Open Word documents, images, and other file types ✖
Support (including SLAs and guaranteed response times) ✖
Warranty and liability (e.g., for rendering errors) ✖
Accurate, high-quality printing ✖

Browser Support

Firefox and Chrome ✓
Microsoft Edge and IE 11 partial*
Older browsers such as IE 10 ✖
Older mobile devices (e.g., iOS 9) ✖

Text Features

Basic text search, text select, and copy/paste ✓
Text select partial*
Text search partial*
Text extraction partial*
Advanced search ✖
Accessibility compliance (e.g., 508/ADA) ✖

Additional Features

Annotations overlay ✖
Direct annotation (add, edit, remove) ✖
Real-time collab via user comments and replies ✖
Signatures Display
Watermarks ✖
Redaction ✖
Measurement ✖
Generate PDFs ✖
Page manipulation (add, merge, remove) ✖
Document manipulation (merge, split, etc.) ✖

Forms

Render visual appearance ✓
Form field extraction ✖
Form fill ✓
Interactive elements (e.g., buttons) ✓
Form actions ✖
Form generation ✖

UI and Mobile Features

Customizable UI ✓
Dark theme ✓
Multiple page views and layouts ✓
Page rotation ✓
Crop page ✖
Zoom ✓ (max 1000%)
Responsive UI ✖
Pinch zoom ✖
Night mode API method

Advanced Graphics and Features

Color management ✖
Soft masks partial*
Gradients and patterns partial*
Knockout groups partial*
Overprint simulation ✖
Accurate measurement on complex documents ✖
Toggleable visual layers (via OCGs) API method
File comparison ✖
Geospatial PDF ✖
3D PDF ✖
Video ✖

*Supported to some degree, but may prove unreliable or inaccurate

The Bottom Line

Building with PDF.js can certainly be cost-effective within the right project scope: primarily, when one wishes to enable viewing of small and simple PDFs.

If you decide PDF.js is the right tool for you, we also provide a few guides to help start your team on PDF.js-based projects that work well.

We’d love to hear any feedback you may have about this article or our PDF SDK. Don’t hesitate to contact us directly.
