OCR module

Starting with PDFNet SDK 7.0, PDFTron offers the OCR Module as an optional add-on utility, currently available on Windows and Linux. It can be used in conjunction with the SDK to create searchable and selectable text from images. The OCR engine is the open source LSTM neural network from Tesseract 4, and it supports the 100+ languages offered by the Tesseract distribution.

The module takes advantage of pdftron.PDF.Convert.ToPdf and accepts multiple image formats, as well as PDFs wrapping raster images.

Output quality depends on the image supplied. The ideal image is greyscale with a resolution close to 300 DPI.
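
The image guidance above can be illustrated with a small preprocessing sketch. It is not part of the PDFTron API: the helper names (resample_factor, scaled_size, to_grey) are hypothetical, and the BT.601 luma weights are just one common way to turn a colour pixel into greyscale. The sketch only shows the arithmetic for bringing an image towards 300 DPI greyscale before it is handed to the OCR module.

```python
# Hypothetical helpers, not PDFTron functions: a sketch of the arithmetic
# behind "greyscale, roughly 300 DPI" image preparation.

TARGET_DPI = 300  # resolution the text above describes as ideal


def resample_factor(current_dpi: float, target_dpi: float = TARGET_DPI) -> float:
    """Return the factor by which pixel dimensions must be scaled to reach target_dpi."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return target_dpi / current_dpi


def scaled_size(width: int, height: int, current_dpi: float) -> tuple:
    """Return the pixel dimensions the image would have at TARGET_DPI."""
    factor = resample_factor(current_dpi)
    return (round(width * factor), round(height * factor))


def to_grey(r: int, g: int, b: int) -> int:
    """Convert one 8-bit RGB pixel to an 8-bit grey value (ITU-R BT.601 weights, an assumption)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)
```

For example, a letter-size page scanned at 100 DPI (850 by 1100 pixels) would need to be upscaled by a factor of 3 to reach the suggested resolution. Upscaling cannot restore detail that the scan never captured, so rescanning at a higher resolution is usually preferable where possible.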

Get started

OCR workflow
This section walks through a typical OCR workflow.
