However, we recently surveyed 57 unique organizations that tried PDF.js and later decided to look for an alternative. Of these, 15.8% cited failure to open files or browser crashes as a reason for switching away from PDF.js.
To help you avoid making the same mistakes, we wanted to find out: what types of PDFs does PDF.js crash on?
Our research involved opening 1,663 PDF files in PDF.js. These documents included random PDFs from Google, as well as business documents, financial statements, construction drawings, college textbooks, and more.
What we found is that PDF.js will open 98.6% of PDFs found in the wild. However, some types of PDFs crashed or froze the browser more than others. While simple PDFs, like invoices, performed well enough, graphics-heavy documents tended to have higher failure rates.
(For more information on other topics from performance to supported functionality, read our comprehensive guide to PDF.js.)
PDFs found in the wild come in all different sizes and compositions, from small, simple invoices to massive reports and intricate designs shared in workflows across government and enterprise settings.
While simple PDFs may use few PDF features, more complex documents may make full use of the PDF specification to embed images with different compression types, optional content groups, transparencies, gradients, patterns, and more. Not all PDFs are equal, and therefore, they won’t all behave the same when opened in a web viewer such as PDF.js.
~Senior Developer, Dropbox
Another consideration is that your risk tolerance for documents crashing or freezing the browser will vary with the requirements of your users. For example, our AEC customers tell us that a PDF viewer needs to be 100% reliable as their users are quick to reject anything less.
Tony Cornwall, Construction Computer Software
~CEO, AEC Project Collaboration Software
What our Customers Said
Our customers reported reliability issues with PDF.js that caused them to seek an alternative.
~Developer, Fortune 50 Company
~CTO, Training & Compliance Software
~Co-founder, eDiscovery Software
Evaluating PDF.js Reliability
To understand these issues better, we opened 1,663 PDFs using Chrome 76 on a new laptop and the latest version of the PDF.js demo viewer (v2.3.146).
These PDFs included:
- AEC drawings submitted by developers as part of the project permitting process for the City of Vancouver.
- Geospatial PDFs downloaded from the US Geological Survey.
- Fortune 100 text-based filings for the Securities and Exchange Commission available on SEDAR.
- Government forms, including court forms, police forms, and tax documents from the UK, Canada, and the US.
- Magazines downloaded from freemagazinepdf.com.
- Scientific documents from the open-science repository Zenodo.org.
- Model renderings, including CAD-based PDFs from Grabcad.com.
- College Textbooks from a variety of websites.
- 200+ random PDFs taken from Google via a “filetype: PDF” search.
Note: Even though PDF.js may open documents, it may not render content quickly or accurately. For this benchmark, we only looked at whether documents would crash or hang the browser.
To learn more, check out our detailed guide to PDF.js rendering accuracy.
Documents such as text-based financial filings, government forms, e-magazines, textbooks, and scientific reports opened in PDF.js without any apparent difficulty.
Other documents did not perform as well, particularly graphics-heavy documents.
For example, Architecture, Engineering, and Construction (AEC) drawings showed a 1% failure rate, while PDFs from Grabcad.com performed the worst, with as many as 1 in 10 (10%) failing to open or crashing the browser. These were PDFs generated from models using a variety of different CAD applications.
Random PDFs found on Google also had a failure rate of 1%. These findings are consistent with an older yet similar PDF.js benchmark, published on Mozilla Hacks.
This study looked at about 7,000 PDFs taken from Google and found 0.8% (roughly 1/100) would crash the browser with PDF.js. It also noted 2.8% of documents produced a “less-than-optimal” UX and that PDF.js had difficulty with graphics-heavy documents.
What Failure Looks like
Documents that crashed PDF.js did so in a couple of common ways.
First were corrupted documents. PDF.js would throw an exception and close them right away.
Many other documents, however, crashed due to memory issues. PDF.js simply could not allocate memory efficiently enough, especially for graphics-heavy PDFs.
As a result, the browser would throw an exception after trying to load the file.
Other times, PDF.js would open the file, only to hang indefinitely when rendering the page.
Explaining Memory Issues
As illustrated on the PDF.js GitHub and elsewhere, PDF.js may not allocate memory efficiently, especially on certain browsers. This can happen when it needs to render a large embedded JPEG or an especially large and complicated page.
There are a couple of reasons why this may happen:
Lack of Support for Canvas Tiling
First, large canvases are essentially huge bitmaps and thus consume lots of memory. This is especially true when one interacts with (zooms into, pans, and scrolls across) a document, and PDF.js is forced to re-render complicated canvases at a larger size and higher resolution.
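To put rough numbers on this, here is a minimal sketch of how canvas memory grows with zoom. It assumes an uncompressed RGBA bitmap at 4 bytes per pixel and the common 96 CSS pixels per inch against 72 PDF points per inch. The function name `canvasBytes` and the 36 x 24 inch sheet are our own illustrative choices, and real browsers may add overhead on top of this estimate.

```typescript
// Assumption: one canvas pixel costs 4 bytes (RGBA, 8 bits per channel).
const BYTES_PER_PIXEL = 4;
const PDF_POINTS_PER_INCH = 72;
const CSS_PIXELS_PER_INCH = 96;

// Estimated size in bytes of a single canvas holding a whole page.
function canvasBytes(
  widthPt: number,
  heightPt: number,
  zoom: number,
  devicePixelRatio = 1,
): number {
  // Multiply before dividing so whole-number inputs stay exact.
  const toPixels = (pt: number): number =>
    Math.ceil((pt * CSS_PIXELS_PER_INCH * zoom * devicePixelRatio) / PDF_POINTS_PER_INCH);
  return toPixels(widthPt) * toPixels(heightPt) * BYTES_PER_PIXEL;
}

const toMiB = (bytes: number): number => bytes / (1024 * 1024);

// Hypothetical 36 x 24 inch drawing sheet (2592 x 1728 pt).
for (const zoom of [1, 2, 4]) {
  const mib = toMiB(canvasBytes(2592, 1728, zoom));
  console.log(`zoom ${zoom}x: ${mib.toFixed(1)} MiB`);
}
```

Because both dimensions scale with zoom, memory grows with the square of the zoom factor: doubling the zoom quadruples the bitmap.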
Canvas tiling would break rendering up into smaller, more manageable pieces. Because PDF.js lacks support for it, it renders page content all at once onto a single large canvas, which in some cases may be larger than the browser permits or may consume too much memory.
PDF.js therefore struggles to handle larger design documents, maps, and blueprints, especially on mobile browsers where memory constraints are tightest.
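For readers unfamiliar with the idea, the sketch below shows the general tiling technique: split the page into fixed-size rectangles and rasterize only those that intersect the visible area. This is not PDF.js code or any vendor's implementation. The names `Rect`, `tileRects`, and `visibleTiles`, the 1,024-pixel tile size, and the page and viewport dimensions are all hypothetical.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Split a page of the given pixel size into tiles of at most tileSize x tileSize.
function tileRects(pageWidth: number, pageHeight: number, tileSize: number): Rect[] {
  if (tileSize <= 0) throw new RangeError("tileSize must be positive");
  const tiles: Rect[] = [];
  for (let y = 0; y < pageHeight; y += tileSize) {
    for (let x = 0; x < pageWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        // Tiles on the right and bottom edges may be smaller than tileSize.
        width: Math.min(tileSize, pageWidth - x),
        height: Math.min(tileSize, pageHeight - y),
      });
    }
  }
  return tiles;
}

// Keep only the tiles that overlap the viewport; only these need rendering.
function visibleTiles(tiles: Rect[], viewport: Rect): Rect[] {
  return tiles.filter(
    (t) =>
      t.x < viewport.x + viewport.width &&
      t.x + t.width > viewport.x &&
      t.y < viewport.y + viewport.height &&
      t.y + t.height > viewport.y,
  );
}

// Hypothetical 13,824 x 9,216 pixel page render viewed through a 1920 x 1080 window.
const tiles = tileRects(13824, 9216, 1024);
const onScreen = visibleTiles(tiles, { x: 0, y: 0, width: 1920, height: 1080 });
console.log(`${tiles.length} tiles in total, ${onScreen.length} intersect the viewport`);
```

At 4 bytes per pixel, a full 1,024 x 1,024 tile is 4 MiB, so a tiled viewer only needs to hold the handful of tiles on screen rather than one bitmap for the entire page.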
Lack of Support for OCG Layers
Another issue pertains to large PDFs with many layers, such as the geospatial PDFs, which showed a 3% failure rate.
Geospatial PDFs, for example, may include a street or topographic vector layer on top of a satellite imagery raster background. The latter is switched off by default to ensure readability and performance.
But since PDF.js does not support OCG layers, it will render every layer, even layers switched off by default.
[Figure: PDF.js rendering compared with the correct rendering]
And with an especially big and complex map, it can quickly hit a wall.
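The difference can be modeled with a short sketch. The `Layer` type and its `visibleByDefault` flag are a simplified, hypothetical stand-in for a PDF's optional content groups, not a real PDF.js data structure, and the map layers listed are invented for illustration.

```typescript
// Hypothetical, simplified model of one layer in a layered PDF page.
interface Layer {
  name: string;
  kind: "vector" | "raster";
  visibleByDefault: boolean;
}

// A viewer that honors the defaults skips hidden layers;
// one that ignores them ends up drawing everything.
function layersToRender(layers: Layer[], honorDefaults: boolean): Layer[] {
  return honorDefaults ? layers.filter((l) => l.visibleByDefault) : layers.slice();
}

// Invented example: a map with satellite imagery switched off by default.
const mapLayers: Layer[] = [
  { name: "Satellite imagery", kind: "raster", visibleByDefault: false },
  { name: "Topography", kind: "vector", visibleByDefault: true },
  { name: "Streets", kind: "vector", visibleByDefault: true },
];

const honored = layersToRender(mapLayers, true).map((l) => l.name);
const ignored = layersToRender(mapLayers, false).map((l) => l.name);
console.log("Honoring defaults:", honored.join(", "));
console.log("Ignoring defaults:", ignored.join(", "));
```

In this model, the viewer that ignores defaults also has to decode and draw the raster background, which is typically the most expensive layer on the page.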
How you can test your documents
We encourage customers to test their own PDFs in a web viewer before making a decision, as your experience will vary considerably with your documents and across different browsers, including mobile.
First, grab a selection of representative files. (Our customer Construction Computer Software gathered over 150 demanding AEC drawings from their users, which they later used to evaluate different viewers, including PDFTron SDK.)
You’ll want to see whether your files open in the PDF.js demo viewer on the browsers and devices you expect your users will prefer. You’ll also want to interact with these documents to test whether PDF.js will deliver the desired UX.
Try scrolling and panning across a document, and zooming into and out of areas where you expect users will want to read small text or perform measurements.
If after 20-30 seconds of heavy interaction, performance is still relatively smooth and the browser hasn’t crashed, then PDF.js may work for your PDFs.
However, if your browser hangs or crashes, or if the UX degrades considerably, you may wish to consider alternatives.
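If you have many files, part of this check can be scripted. The sketch below is one possible shape for such a harness, not the tooling we used for the benchmark: it races each open attempt against a timer and reports the share of files that failed. `classifyOpen`, `failureRate`, and the timeout values are our own hypothetical names and numbers. The `open` callback is whatever loads a document in your setup; in a page where the PDF.js library is loaded as `pdfjsLib`, it might be `() => pdfjsLib.getDocument(url).promise`.

```typescript
type Outcome = "ok" | "error" | "hang";

// Race one open attempt against a timer.
// Resolves "ok" if open() fulfills, "error" if it rejects or throws,
// and "hang" if it has not settled after timeoutMs.
function classifyOpen(open: () => Promise<unknown>, timeoutMs: number): Promise<Outcome> {
  return new Promise<Outcome>((resolve) => {
    const timer = setTimeout(() => resolve("hang"), timeoutMs);
    Promise.resolve()
      .then(() => open()) // also turns a synchronous throw into a rejection
      .then(
        () => {
          clearTimeout(timer);
          resolve("ok");
        },
        () => {
          clearTimeout(timer);
          resolve("error");
        },
      );
  });
}

// Share of attempts that did not end in "ok" (0 for an empty list).
function failureRate(outcomes: Outcome[]): number {
  if (outcomes.length === 0) return 0;
  const failures = outcomes.filter((o) => o !== "ok").length;
  return failures / outcomes.length;
}

// Stand-in openers for illustration; replace them with real document loads.
const attempts: Array<() => Promise<unknown>> = [
  () => Promise.resolve("opened"),
  () => Promise.reject(new Error("corrupted file")),
  () => new Promise(() => {}), // never settles
];

Promise.all(attempts.map((open) => classifyOpen(open, 100))).then((outcomes) => {
  console.log(outcomes, `failure rate: ${(failureRate(outcomes) * 100).toFixed(1)}%`);
});
```

One caveat: a timer like this only catches loads that never finish. If rendering blocks the main thread or the tab crashes outright, the timer cannot fire either, so hands-on testing on your target browsers and devices is still needed.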
If you’ve run your tests and haven’t experienced any problems, you could try PDF.js Express, a solution that streamlines the implementation of basic annotations, form-filling, and signatures within a highly customizable and commercially backed open-source UI.
If you encountered issues that may be of concern, however, you could consider a more robust commercial solution, like PDFTron WebViewer.
We always appreciate feedback on our blog. If you have any questions, don’t hesitate to contact us directly.