Jul 13 2022

New OCR Engine and Document Template Generation APIs, ARM64 Support, and More in PDFTron SDK 9.3

by Andrey Safonov

We are excited to share another major update to our Core SDK. Now available, PDFTron SDK v9.3 comes with several upgrades, including a new OCR backend, improvements to document template generation, support for next-generation ARM64 applications, and more.

Read on to learn about our most significant 9.3 improvements and how they can help you build leading document experiences and intelligent processing solutions. For the complete rundown of smaller enhancements, check out our v9.3 changelogs for Windows, Linux, and macOS.

Introducing Our New OCR Backend

Optical Character Recognition (OCR), which turns a scanned image into a searchable and interactive format like PDF, is a must-have technology for digital transformation. OCR unlocks data locked in an image, making it available for extraction and repurposing without manual processes.

Our default OCR module is based on Tesseract. With the new, additional OCR module, however, you’ll notice a significant improvement in text extraction accuracy, as well as many new features to enhance your OCR solution.

You can try out the new OCR engine by downloading the new module and running the OCR code sample. For existing PDFTron customers, no code changes are required: all you have to do is copy the new module into your existing OCR module folder structure.

Improved Document Template Generation

Since our 9.0 core release, PDFTron has supported dynamic document generation from Office templates. This feature boosts productivity by simplifying the creation of pixel-perfect, paginated documents from any DOCX, XLSX, or PPTX file in a few clicks. In a word processor, a user can create personalized business proposals, contracts, reports, and so on, using curly brackets {} to mark placeholder fields that are filled automatically with their custom data.

PDFTron’s Office template generation is available:

  • client-side or in a browser if you are generating one-off reports or account statements on the fly 
  • server-side for batch processing thousands of documents, such as letters or tax slips 

In our 9.3 release, we wanted to take template generation up another notch in both usability and features. So we added advanced markdown and HTML styling, streamlined the APIs, and fine-tuned our document generation engine to boost its performance and reduce your computational costs.

Advanced Styling

Previously, replaced text inherited the styling of the placeholder, including font size and style. With release 9.3, your users can prepare the data for placeholder fields in their favorite rich text editor and take advantage of advanced styling capabilities, because placeholders now accept markdown and HTML.

Supported Elements for Advanced Styling

Elements supported in this release include:

  • <p> paragraph </p>
  • <br> break
  • <a> link </a>
  • <b> bold </b>
  • <u> underline </u>
  • <i> italic </i>
  • <h1> headers </h1>
  • <ol> ordered lists </ol>
  • <ul> unordered lists </ul>

You can also use CSS properties to style some of the above elements. We focused on the most popular elements first, and more will be available soon. 
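Since only some elements are supported, it can help to check input before sending it to the template engine. The following is a minimal Python sketch, not part of the PDFTron SDK: the `unsupported_tags` helper and the `SUPPORTED_TAGS` set are hypothetical names. The set contains the elements listed above plus `<li>` and `<span>`, which are assumed to be supported because they appear in this post's own sample data. Only `<h1>` is included for headers, since that is the only header element named here.

```python
from html.parser import HTMLParser

# Elements listed in this post, plus <li> and <span> (an assumption based on
# the post's sample data). Adjust this set to match the SDK documentation.
SUPPORTED_TAGS = {"p", "br", "a", "b", "u", "i", "h1", "ol", "ul", "li", "span"}


class _TagCollector(HTMLParser):
    """Collects the name of every start tag seen in a fragment."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lowercase.
        self.tags.add(tag)


def unsupported_tags(fragment: str) -> set:
    """Return the set of tag names in `fragment` that are not in SUPPORTED_TAGS."""
    collector = _TagCollector()
    collector.feed(fragment)
    collector.close()
    return collector.tags - SUPPORTED_TAGS


print(sorted(unsupported_tags("<p>Hello <b>World</b><br></p>")))          # []
print(sorted(unsupported_tags("<table><tr><td>cell</td></tr></table>")))  # ['table', 'td', 'tr']
```

A check like this only looks at tag names. It does not validate CSS properties or nesting.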

Preparing the JSON Object

When preparing the JSON object, specify that the data is HTML by wrapping the value in an object with an "html" key. A JSON string cannot contain raw line breaks, so the HTML is written on a single line here:

{
  "key": {
    "html": "<h1>Hello <i>World</i> from <span style='color: #00a5e4'>PDFTron</span>!</h1><br><p>New template features in core release:</p><ol style='list-style: lower-alpha'><li><b>Structured input:</b> HTML and markdown.</li><li><b>Template API update</b>, including new <span style='font-family: monospace'><a href='https://www.pdftron.com/api/web/Core.html#.TemplateSchema'>GetTemplateKeys()</a></span> function.</li></ol>"
  }
}
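For longer fragments, building the payload in code avoids hand-escaping quotes and line breaks. The following is a minimal Python sketch, assuming the "key" / "html" shape shown above. The `html_value` helper is a hypothetical name and not part of the PDFTron API.

```python
import json


def html_value(fragment: str) -> dict:
    """Wrap an HTML fragment in the {"html": ...} shape used for styled placeholders."""
    return {"html": fragment}


# The fragment may span several lines in source code; json.dumps escapes the
# line breaks so the resulting JSON string is valid.
fragment = """<h1>Hello <i>World</i></h1>
<p>Generated from Python.</p>"""

template_data = {"key": html_value(fragment)}
payload = json.dumps(template_data)
print(payload)
```

The resulting `payload` string is what you would pass as the template data. How it is handed to the SDK depends on the API you use.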

This is what template data looks like in the Template Generation Demo:

Screenshot of document template generation feature in action.

Simplified Doc Gen APIs and Enhanced Performance

The v9.3 release introduces a simplified API that streamlines the entire template generation process. The new `CreateOfficeTemplate` API lets you load and read an Office template once, then generate multiple PDFs from different data by reusing the resulting `TemplateDocument` object. The original API call had to re-read the Office template for each new set of data. Depending on the complexity of the document, the new method may improve performance by 5-20% and reduce your computational cost.
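The load-once, fill-many pattern can be illustrated with a toy model. The sketch below does not call the PDFTron SDK: `ToyTemplate` is a hypothetical stand-in for a parsed template object, and it uses Python's `string.Template` with `$name` placeholders rather than the curly-bracket syntax of real Office templates. It only shows why reusing one parsed object avoids repeated parsing work.

```python
import json
from string import Template


class ToyTemplate:
    """Hypothetical stand-in for a parsed template; not the PDFTron API."""

    parse_count = 0  # how many times a template source has been parsed

    def __init__(self, source: str):
        # The expensive step: runs once per ToyTemplate object.
        ToyTemplate.parse_count += 1
        self._template = Template(source)

    def fill(self, data_json: str) -> str:
        # The cheap step: substitute values into the already parsed template.
        return self._template.substitute(json.loads(data_json))


template = ToyTemplate("Dear $name, your balance is $balance.")
records = [
    {"name": "Ada", "balance": "10.00"},
    {"name": "Lin", "balance": "25.50"},
]
letters = [template.fill(json.dumps(record)) for record in records]

print(letters[0])                # Dear Ada, your balance is 10.00.
print(ToyTemplate.parse_count)   # 1
```

With the older style of API, the equivalent of `ToyTemplate(...)` would run once per record instead of once overall.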

Also new is the `GetTemplateKeysJson` API, which reveals any keys used in a DOCX template, and whether they are used in normal text tags, loop tags, or conditional tags.

Support for ARM64

PDFTron Core SDK now supports ARM64, which lets you run 64-bit applications on ARM64-based processors.

For example, you might be using Amazon Elastic Container Service (Amazon ECS) powered by AWS Graviton2 and Graviton3 processors. If you run your document processing application on this AWS architecture, Amazon promises up to 40% better price performance.

Next Steps

We hope you’re as excited as we are about the new and updated features and APIs in v9.3. For the complete list of changes, check out the 9.3 changelogs for Windows, Linux, and macOS.

If you have any questions or feedback about this release, or ideas for the next release, please feel free to contact me.

ANDREY SAFONOV

Head of Product

First a developer then a solution engineer. Now a product team leader and dev experience advocate.
