public class

OCROptions

extends OptionsBase

java.lang.Object
↳	com.pdftron.pdf.OptionsBase
	↳	com.pdftron.pdf.OCROptions

Summary

Public Constructors
	OCROptions() Constructor.
	OCROptions(String json_string) Constructor.

Public Methods
OCROptions	addDPI(int dpi) Knowing proper image resolution is important, as it enables the OCR engine to translate pixel heights of characters to their respective font sizes.
OCROptions	addIgnoreZonesForPage(RectCollection regions, int page_index) Adds a collection of ignorable regions for the given page: an optional list of page areas that will not be processed.
OCROptions	addLang(String lang_code) Adds a language to the list of languages to be considered when processing this document.
OCROptions	addTextZonesForPage(RectCollection regions, int page_index) Adds a collection of known text regions for the given page.
boolean	getAutoRotate() Gets the value of AutoRotate from the options object. Default value is false.
OCROptions	setAutoRotate(boolean value) Sets the value for AutoRotate in the options object. Default value is false.
OCROptions	setIgnoreExistingText(boolean value) Sets the value for IgnoreExistingText in the options object. Default value is false, so that areas with existing text will be automatically skipped during OCR.
OCROptions	setOCREngine(String value) Sets the backend processing engine to use for OCR operations. Options include 'default', 'any', or 'iris'.
OCROptions	setUsePDFPageCoords(boolean value) Sets the value for UsePDFPageCoords in the options object, which sets the origin of the coordinate system for input/output.

Inherited Methods

From class java.lang.Object

Public Constructors

public OCROptions ()

Constructor.

Throws

PDFNetException

public OCROptions (String json_string)

Constructor.

Throws

PDFNetException

Public Methods

public OCROptions addDPI (int dpi)

Knowing the proper image resolution is important, as it enables the OCR engine to translate pixel heights of characters to their respective font sizes. We do our best to retrieve resolution information from the input's metadata; however, it can occasionally be corrupt or missing. Hence we allow a manual override of the source's resolution, which supersedes any metadata found (both explicit, as in image metadata, and implicit, as in PDF).

Returns

this object, for call chaining

Throws

PDFNetException
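The relationship between resolution and font size described for addDPI can be illustrated with simple arithmetic. The sketch below is not part of the PDFTron API: the class DpiFontSize and the method pixelsToPoints are hypothetical names, and it assumes the usual convention of 72 points per inch and that a character's pixel height is a rough stand-in for its font size. It only shows why a wrong DPI value leads to wrong font sizes; the OCR engine's actual computation is not documented here.

```java
// Sketch only: DpiFontSize and pixelsToPoints are hypothetical helpers,
// not part of the PDFTron API.
final class DpiFontSize {
    /** Typographic points per inch (assumed convention). */
    static final double POINTS_PER_INCH = 72.0;

    /** Converts a height in pixels to a size in points at the given resolution. */
    static double pixelsToPoints(double pixelHeight, int dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive: " + dpi);
        }
        return pixelHeight * POINTS_PER_INCH / dpi;
    }

    public static void main(String[] args) {
        // The same 50-pixel-tall character at two candidate resolutions.
        System.out.println(pixelsToPoints(50, 300)); // 12.0
        System.out.println(pixelsToPoints(50, 150)); // 24.0
    }
}
```

Under these assumptions, halving the assumed resolution doubles the estimated size, which is why an explicit override is useful when the metadata is unreliable.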

public OCROptions addIgnoreZonesForPage (RectCollection regions, int page_index)

Adds a collection of ignorable regions for the given page: an optional list of page areas that will not be processed.

Returns

this object, for call chaining

Throws

PDFNetException

public OCROptions addLang (String lang_code)

Adds a language to the list of languages to be considered when processing this document.

Returns

this object, for call chaining

Throws

PDFNetException

public OCROptions addTextZonesForPage (RectCollection regions, int page_index)

Adds a collection of known text regions for the given page. This information will be used as a hint to improve OCR quality.

Returns

this object, for call chaining

Throws

PDFNetException

public boolean getAutoRotate ()

Gets the value of AutoRotate from the options object. Default value is false. Setting to true will deskew the image before conducting OCR.

Returns

a boolean. Default value is false. Setting to true will deskew the image before conducting OCR.

Throws

PDFNetException

public OCROptions setAutoRotate (boolean value)

Sets the value for AutoRotate in the options object. Default value is false. Setting to true will deskew the image before conducting OCR.

Returns

this object, for call chaining

Throws

PDFNetException

public OCROptions setIgnoreExistingText (boolean value)

Sets the value for IgnoreExistingText in the options object. Default value is false, so that areas with existing text will be automatically skipped during OCR. Setting to true probably only makes sense when used with GetOCRJson/XML, as pre-existing text might end up being duplicated in the document when used with ImageToPDF and ProcessPDF.

Returns

this object, for call chaining

Throws

PDFNetException

public OCROptions setOCREngine (String value)

Sets the backend processing engine to use for OCR operations. Options include 'default', 'any', or 'iris'. The chosen module must be present and correctly licensed.

Returns

this object, for call chaining

Throws

PDFNetException

public OCROptions setUsePDFPageCoords (boolean value)

Sets the value for UsePDFPageCoords in the options object, which sets the origin of the coordinate system for input/output.

Returns

this object, for call chaining

Throws

PDFNetException
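The description of setUsePDFPageCoords does not say which origin each setting selects. As an illustration only, the sketch below assumes the two candidates are an image-style origin (top-left, y increasing downward) and a PDF-page-style origin (bottom-left, y increasing upward), and that both use the same units. The class PageCoords and the method flipVertical are hypothetical and not part of the PDFTron API; the sketch shows how a zone rectangle would be converted between the two conventions.

```java
// Sketch only: PageCoords and flipVertical are hypothetical helpers,
// not part of the PDFTron API.
final class PageCoords {
    /**
     * Flips a rectangle {x1, y1, x2, y2} between a top-left origin and a
     * bottom-left origin for a page of the given height (same units).
     */
    static double[] flipVertical(double[] rect, double pageHeight) {
        // x coordinates are unaffected; each y is measured from the opposite edge.
        // The old larger y becomes the new smaller y, so y1 <= y2 is preserved.
        double y1 = pageHeight - rect[3];
        double y2 = pageHeight - rect[1];
        return new double[] {rect[0], y1, rect[2], y2};
    }

    public static void main(String[] args) {
        double[] topLeftRect = {10, 20, 110, 70};
        double[] bottomLeftRect = flipVertical(topLeftRect, 792);
        // prints [10.0, 722.0, 110.0, 772.0]
        System.out.println(java.util.Arrays.toString(bottomLeftRect));
    }
}
```

Applying flipVertical twice with the same page height returns the original rectangle, so the same helper converts in either direction.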
