Java OCR Library

Starting with PDFNet SDK 7.0, PDFTron offers the OCR Module as an optional add-on utility, currently available on Windows and Linux. It can be used in conjunction with the SDK to create searchable and selectable text from images. The OCR engine is the open-source LSTM neural network from Tesseract 4, and it supports the 100+ languages offered by the Tesseract distribution.

The module takes advantage of pdftron.PDF.Convert.ToPdf and accepts multiple image formats, as well as PDFs wrapping raster images.

Output quality depends on the image supplied. The ideal image is greyscale with a resolution of around 300 DPI.
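
As a rough illustration of the image guidance above, the following sketch converts an image to greyscale and rescales it toward 300 DPI before it is handed to OCR. It uses only the standard Java imaging classes, not the PDFTron API. The class name `OcrImagePrep`, the method name, and the assumption that the caller already knows the source resolution are all hypothetical choices made for this example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of the PDFTron SDK.
final class OcrImagePrep {
    // Target resolution suggested by the guidance above.
    static final double TARGET_DPI = 300.0;

    // Returns a greyscale copy of the source, scaled so that its pixel
    // dimensions correspond to roughly 300 DPI at the same physical size.
    // Assumption: sourceDpi is supplied by the caller (for example, read
    // from scanner settings); it is not detected from the image.
    static BufferedImage toGreyscaleAtTargetDpi(BufferedImage source, double sourceDpi) {
        if (sourceDpi <= 0) {
            throw new IllegalArgumentException("sourceDpi must be positive");
        }
        double scale = TARGET_DPI / sourceDpi;
        int width = Math.max(1, (int) Math.round(source.getWidth() * scale));
        int height = Math.max(1, (int) Math.round(source.getHeight() * scale));

        // TYPE_BYTE_GRAY stores one 8-bit grey channel per pixel, so drawing
        // a colour image into it performs the colour-to-grey conversion.
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = result.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

For example, a 100 x 50 pixel image scanned at 150 DPI would come back as a 200 x 100 pixel greyscale image. Upscaling cannot restore detail that the original scan did not capture, so scanning at an adequate resolution in the first place is likely to give better results than resampling afterwards.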

Get started

OCR workflow
This section walks through a typical OCR workflow.
