If your users share documents in a high-stakes, fast-paced environment, they may need their PDFs to open quickly and flawlessly. PDF.js, Mozilla’s open-source JavaScript library, offers one way for them to open and view their PDF files on a website.

However, we recently surveyed 57 unique organizations that tried PDF.js and later decided to look for an alternative. Of these, 15.8% (9 of the 57) cited failure to open files or browser crashes as a reason for switching from PDF.js.

To help you avoid making the same mistakes, we wanted to find out: what types of PDFs does PDF.js crash on?

Our research involved opening 1,663 PDF files in PDF.js. These documents included random PDFs from Google, as well as business documents, financial statements, construction drawings, college textbooks, and more.

What we found is that PDF.js will open 98.6% of PDFs found in the wild. However, some types of PDFs crashed or froze the browser more than others. While simple PDFs, like invoices, performed well enough, graphics-heavy documents tended to have higher failure rates.

[Figure: PDF.js Document Open Rate]

(For more information on other topics from performance to supported functionality, read our comprehensive guide to PDF.js.)

Background

“PDFs are an incredibly complex file format; this is especially so given that a PDF can be generated a hundred different ways, all of which a renderer needs to handle gracefully.”

~Developer, LinkedIn

PDFs found in the wild come in all different sizes and compositions, from small, simple invoices to massive reports and intricate designs shared in workflows across government and enterprise settings.

While simple PDFs may use few PDF features, more complex documents may make full use of the PDF specification to embed images with different compression types, optional content groups, transparencies, gradients, patterns, and more. Not all PDFs are equal, and therefore, they won’t all behave the same when opened in a web viewer such as PDF.js.

“PDF is an incredibly complex file format—the specification is more than a thousand pages long, not including the extensions and supplements.”

~Senior Developer, Dropbox

Another consideration is that your risk tolerance for documents crashing or freezing the browser will vary with the requirements of your users. For example, our AEC customers tell us that a PDF viewer needs to be 100% reliable as their users are quick to reject anything less.

“If at the 11th hour your takeoff system is not reading that last file, even if you rendered the last 10 out of 11 files perfectly you can’t get your estimate. That person can’t do their job. Effectively, you could lose business by not being able to open one of these PDFs… It’s like getting it 99% right and 1% wrong -- and you actually fail.”

~Tony Cornwall, Construction Computer Software

“As soon as your tool is seen as not 100% reliable, even if it’s still 99% reliable, the customer is going to switch off and default to the next-lowest common denominator -- the Adobe Acrobats or pen and paper.”

~CEO, AEC Project Collaboration Software

What Our Customers Said

Our customers reported reliability issues with PDF.js that caused them to seek an alternative.

“We also tried PDF.js to render pdf using a blob object. It is working on iPad and iPhones with a few limitations like it is not able to open PDFs bigger than 100MB, and it doesn’t support pinch zoom.”

~Developer, Fortune 50 Company

“We are using PDF.js now as an embedded viewer for PDF documents in a single page application, and we are having some issues with crashing browsers and suspect issues with the viewer.”

~CTO, Training & Compliance Software

“At present, we’re working with open-source PDF.js which is great for the 95% of PDFs, but the other 5% is critical. Larger PDFs are tricky.”

~Co-founder, eDiscovery Software

Evaluating PDF.js Reliability

To understand these issues better, we opened 1,663 PDFs using Chrome 76 on a new laptop and the latest version of the PDF.js demo viewer (v2.3.146).

These PDFs included:

  • AEC drawings submitted by developers as part of the project permitting process for the City of Vancouver.
  • Geospatial PDFs downloaded from the US Geological Survey.
  • Fortune 100 text-based filings for the Securities and Exchange Commission available on SEDAR.
  • Government Forms including court forms, police forms, and tax documents from the UK, Canada, and the US.
  • Magazines downloaded from freemagazinepdf.com.
  • Scientific documents from the open-science repository Zenodo.org.
  • Model Renderings including CAD-based PDFs from Grabcad.com.
  • College Textbooks from a variety of websites.
  • 200+ random PDFs taken from Google via a “filetype: PDF” search.

Note: Even though PDF.js may open documents, it may not render content quickly or accurately. For this benchmark, we only looked at whether documents would crash or hang the browser.

To learn more, check out our detailed guide to PDF.js rendering accuracy.

The Results

[Figure: PDF.js Open Rate by Document Type]

Documents such as text-based financial filings, government forms, e-magazines, textbooks, and scientific reports opened in PDF.js without any apparent difficulty.

Other documents did not perform as well, particularly graphics-heavy documents.

For example, architecture, engineering, and construction (AEC) drawings showed a 1% failure rate, while PDFs from Grabcad.com performed the worst, with as many as 1 in 10 (10%) failing to open or crashing the browser. These were PDFs generated from models using a variety of different CAD applications.

Random PDFs found on Google also had a failure rate of 1%. These findings are consistent with an older yet similar PDF.js benchmark, published on Mozilla Hacks.

This study looked at about 7,000 PDFs taken from Google and found 0.8% (roughly 1/100) would crash the browser with PDF.js. It also noted 2.8% of documents produced a “less-than-optimal” UX and that PDF.js had difficulty with graphics-heavy documents.

What Failure Looks Like

Documents that crashed PDF.js would do so in a few common ways:

First were corrupted documents. PDF.js would throw an exception and close them right away:

[Figure: PDF.js Invalid or Corrupted File]

Many other documents, however, crashed due to memory issues. PDF.js simply could not allocate memory efficiently enough, especially for graphics-heavy PDFs.

As a result, the browser would throw an exception after trying to load the file:

[Figure: PDF.js Chrome Exception -- Aw, Snap!]

Other times, PDF.js would open the file -- only to hang indefinitely when rendering the page:

[Figure: PDF.js Hanging Indefinitely]

Explaining Memory Issues

As illustrated on the PDF.js GitHub and elsewhere, PDF.js may not allocate memory efficiently, especially on certain browsers. This tends to show up when it needs to render a large embedded JPEG or an especially large and complicated page.

There are a couple of reasons why this may happen:

Lack of Support for Canvas Tiling

First, large canvases are essentially huge bitmaps and thus consume lots of memory. This is especially true when one interacts with (zooms into, pans, and scrolls across) a document, and PDF.js is forced to re-render complicated canvases at a larger size and higher resolution.

“The problem is that large canvases take much memory space. It's visible if a PDF page is large (e.g. map) or zoomed in (e.g. at 800%+ zoom). Currently we are limiting canvas size for mobile device. However a proper solution will be to divide page into smaller canvases and render only visible parts.”

Because it lacks support for canvas tiling, which would break up rendering into smaller, more manageable pieces, PDF.js renders page content all at once onto a single large canvas. In some cases, that canvas may be larger than the browser permits or may consume too much memory.

PDF.js therefore struggles to handle larger design documents, maps, and blueprints, especially on mobile browsers where memory constraints are tightest.
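To make the memory argument concrete, here is a rough back-of-the-envelope sketch. It is not PDF.js code: the function names, the 512-pixel tile size, and the 36 x 48 inch example sheet are all assumptions chosen for illustration, and it assumes an uncompressed RGBA bitmap at 4 bytes per pixel. Actual browser overhead and the viewer's own zoom-to-scale conversion will differ.

```typescript
// Assumption: an uncompressed RGBA canvas costs about 4 bytes per pixel.
const BYTES_PER_PIXEL = 4;

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Approximate backing-store size of ONE canvas covering the whole page.
// widthPt/heightPt are the page size in PDF points (1/72 inch);
// scale is device pixels per point (hypothetical parameter).
function canvasBytes(widthPt: number, heightPt: number, scale: number): number {
  const widthPx = Math.ceil(widthPt * scale);
  const heightPx = Math.ceil(heightPt * scale);
  return widthPx * heightPx * BYTES_PER_PIXEL;
}

// With tiling, only the tiles that intersect the visible viewport need a bitmap.
// Returns those tiles, clipped to the page edges.
function visibleTiles(pageW: number, pageH: number, tile: number, view: Rect): Rect[] {
  const tiles: Rect[] = [];
  const firstCol = Math.max(0, Math.floor(view.x / tile));
  const firstRow = Math.max(0, Math.floor(view.y / tile));
  const lastCol = Math.min(Math.ceil(pageW / tile), Math.ceil((view.x + view.width) / tile));
  const lastRow = Math.min(Math.ceil(pageH / tile), Math.ceil((view.y + view.height) / tile));
  for (let row = firstRow; row < lastRow; row++) {
    for (let col = firstCol; col < lastCol; col++) {
      const x = col * tile;
      const y = row * tile;
      tiles.push({
        x,
        y,
        width: Math.min(tile, pageW - x),
        height: Math.min(tile, pageH - y),
      });
    }
  }
  return tiles;
}

// Hypothetical 36 x 48 inch drawing = 2592 x 3456 points, rendered at 8 px per point.
const fullPage = canvasBytes(2592, 3456, 8); // 20736 x 27648 px -> 2,293,235,712 bytes

// Same page, but only a 1024 x 768 viewport is on screen, using 512 px tiles.
const tiles = visibleTiles(20736, 27648, 512, { x: 0, y: 0, width: 1024, height: 768 });
const tiledBytes = tiles.reduce((sum, t) => sum + t.width * t.height * BYTES_PER_PIXEL, 0);

console.log(`single canvas: ${fullPage} bytes, visible tiles: ${tiledBytes} bytes`);
```

Under these assumptions, the single canvas needs roughly 2.1 GiB while the four visible tiles need about 4 MiB, which is why rendering only the visible parts matters so much for large drawings and deep zoom.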

Lack of Support for OCG Layers

Another issue pertains to large PDFs with many layers, such as the geospatial PDFs, which had a 3% failure rate in our tests.

Geospatial PDFs, for example, may include a street or topographic vector layer on top of a satellite imagery raster background. The latter is switched off by default to ensure readability and performance.

But since PDF.js does not support optional content group (OCG) layers, it will render every layer -- even layers switched off by default.

[Figure: PDF.js Incorrect Rendering: OCG Layers, compared with Correct Rendering: OCG Layers]

With an especially big and complex map, rendering every layer can quickly exhaust the memory available to the browser.
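The difference between honoring and ignoring default layer visibility can be sketched as follows. This is a simplified, hypothetical model rather than PDF.js or PDFTron code: in the PDF specification, a document's default optional content configuration carries a base state plus lists of groups to turn on or off, and the `DefaultConfig` type, the `visibleLayers` function, and the layer names below are illustrative stand-ins for that idea.

```typescript
// Simplified model of a PDF's default optional content configuration.
// Assumption: only the "ON" and "OFF" base states are modeled here.
interface DefaultConfig {
  baseState: "ON" | "OFF";
  on: string[];  // layers explicitly switched on
  off: string[]; // layers explicitly switched off
}

// Layers a viewer that honors the default configuration would draw on open.
function visibleLayers(all: string[], cfg: DefaultConfig): string[] {
  const on = new Set(cfg.on);
  const off = new Set(cfg.off);
  return all.filter((name) => (cfg.baseState === "ON" ? !off.has(name) : on.has(name)));
}

// Hypothetical geospatial PDF: vector layers on, satellite raster off by default.
const layers = ["Streets", "Contours", "Satellite imagery"];
const config: DefaultConfig = { baseState: "ON", on: [], off: ["Satellite imagery"] };

const honored = visibleLayers(layers, config); // what the author intended to show
const ignored = layers;                        // a viewer without OCG support draws everything

console.log(`layers drawn with OCG support: ${honored.length}, without: ${ignored.length}`);
```

In this toy case the viewer without OCG support draws the heavy raster layer that the author deliberately hid, which is the layer most likely to strain memory.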

How You Can Test Your Documents

We encourage customers to test their own PDFs in a web viewer before making a decision, as your experience will vary considerably with your documents and across different browsers, including mobile.

First, grab a selection of representative files. (Our customer Construction Computer Software gathered over 150 demanding AEC drawings from their users which they later used to evaluate different viewers, including PDFTron SDK.)

You’ll want to see whether your files open in the PDF.js demo viewer on the browsers and devices you expect your users will prefer. You’ll also want to interact with these documents to test whether PDF.js will deliver the desired UX.

Try scrolling and panning across a document, and zooming into and out of areas where you expect users will want to read small text or perform measurements.

If, after 20-30 seconds of heavy interaction, performance is still relatively smooth and the browser hasn’t crashed, then PDF.js may work for your PDFs.

However, if your browser hangs or crashes, or if the UX degrades considerably, you may wish to consider alternatives.
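If you have many files to check, you may want to record outcomes in a small script instead of by hand. The sketch below is one possible shape for that bookkeeping, not a PDF.js API: `tryOpen`, `failureRate`, and the caller-supplied `open` function are hypothetical names, and you would need to wire `open` to whatever viewer you are evaluating. It also cannot detect a full tab crash (the "Aw, Snap!" case), since the script dies along with the tab, so those still need to be noted manually.

```typescript
type Outcome = "opened" | "error" | "timeout";

// Attempts to open one file with a caller-supplied opener (hypothetical hook),
// treating a rejection as "error" and no response within timeoutMs as "timeout".
async function tryOpen(
  name: string,
  open: (name: string) => Promise<void>,
  timeoutMs: number
): Promise<Outcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<Outcome>((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });
  const attempt = open(name).then(
    (): Outcome => "opened",
    (): Outcome => "error"
  );
  try {
    return await Promise.race([attempt, timeout]);
  } finally {
    // Always clear the timer so it does not keep running after the attempt settles.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Share of attempts that did not open cleanly (0 for an empty list).
function failureRate(outcomes: Outcome[]): number {
  if (outcomes.length === 0) return 0;
  const failed = outcomes.filter((o) => o !== "opened").length;
  return failed / outcomes.length;
}

// Example bookkeeping with made-up outcomes for four files.
const sample: Outcome[] = ["opened", "error", "opened", "timeout"];
console.log(`failure rate: ${(failureRate(sample) * 100).toFixed(1)}%`);
```

Running the files one at a time, rather than in parallel, keeps one heavy document from skewing the result for the next.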

Next Steps

If you’ve run your tests and haven’t experienced any problems, you could try PDF.js Express -- a solution that streamlines the implementation of basic annotations, form-filling, and signatures within a highly customizable and commercially backed open-source UI.

If you encountered issues that may be of concern, however, you could consider a more robust commercial solution, like PDFTron WebViewer.

We always appreciate feedback on our blog. If you have any questions, don’t hesitate to contact us directly.