Class: WordOutputOptions

PDFNet.Convert.WordOutputOptions


new WordOutputOptions()

A class containing options common to the ToWord functions.
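
A minimal sketch of building an options object and chaining setters, assuming `PDFNet` has already been loaded and initialized by the host application. The helper name `buildWordOptions` and its `password` parameter are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helper. `PDFNet` is passed in so the sketch makes no
// assumption about how the library is loaded or initialized.
function buildWordOptions(PDFNet, password) {
  const options = new PDFNet.Convert.WordOutputOptions();
  // Every setter returns the options object itself, so calls can be chained.
  return options
    .setPDFPassword(password) // pass '' when the PDF requires no password
    .setConnectHyphens(true); // only works with English words
}
```

The returned object is the same instance that was constructed, so it can be configured further before being handed to a ToWord function.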

Members


<static> BookmarkConversionMethod

Type:
  • number
Properties:
Name Type Description
e_bm_none number Indicates that no bookmarks are created.
e_bm_page number Indicates that a bookmark is created for each page (default).
e_bm_extract number Indicates that bookmarks are converted from PDF to Word.

<static> SearchableImageSetting

Type:
  • number
Properties:
Name Type Description
e_ocr_image_text number Deprecated. OCR will be performed.
e_ocr_image number Deprecated. OCR will not be performed.
e_ocr_text number Indicates that OCR will be performed and the recognized text replaces the image pixels underneath (default).
e_ocr_off number Indicates that OCR will not be performed.

<static> WordOutputFormat

Type:
  • number
Properties:
Name Type Description
e_wof_docx number
e_wof_doc number Deprecated
e_wof_rtf number
e_wof_txt number

Methods


setBookmarkConversionMethod(method)

Specifies if and how PDF bookmarks should be converted into Word. Default is e_bm_extract. Deprecated. PDF bookmarks are now automatically converted to Word bookmarks.
Parameters:
Name Type Description
method number
PDFNet.Convert.WordOutputOptions.BookmarkConversionMethod = {
	e_bm_none: 0,
	e_bm_page: 1,
	e_bm_extract: 2
}
the bookmark conversion method.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setConnectHyphens(connect)

Specifies whether hyphens in the PDF should be connected. This only works with English words. Default is false.
Parameters:
Name Type Description
connect boolean if true, hyphens in the PDF will be connected.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setDisableVerticalSplit(disable)

Specifies whether to disable the detection of section columns. Default is false. Enable this if your tables are coming out as section columns. Deprecated. Columns are now detected automatically.
Parameters:
Name Type Description
disable boolean if true, the detection of section columns is disabled.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setDoNotAdjustFonts(do_not_adjust)

Specifies whether to disable font adjustments during conversion. Default is false. Deprecated. Font sizes are now detected automatically.
Parameters:
Name Type Description
do_not_adjust boolean if true, font adjustments are disabled during conversion.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setFileConversionTimeoutSeconds(seconds)

Specifies the amount of time in seconds after which the conversion fails. Default is 300. Very long files need more time to convert. Deprecated. The timeout feature is no longer necessary.
Parameters:
Name Type Description
seconds number the timeout in seconds.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setImageDPI(dpi)

Specifies the output image resolution, from 8 to 600, in Pixels Per Inch (PPI). The higher the PPI, the larger the image. Default is 192. Deprecated. The optimal image resolution is now chosen automatically for the best balance between size and quality.
Parameters:
Name Type Description
dpi number the resolution in Pixels Per Inch.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setJPGQuality(quality)

Specifies the compression quality to use when generating JPEG images. Deprecated. The optimal JPEG quality is now chosen automatically for the best balance between size and quality.
Parameters:
Name Type Description
quality number the JPEG compression quality, from 0 (highest compression) to 100 (best quality). Default is 75.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setLanguage(language)

Specifies the OCR language. Default is automatic language detection. Note: This option is only available for e_reflow_paragraphs mode.
Parameters:
Name Type Description
language number
PDFNet.Convert.OutputOptionsOCR.LanguageChoice = {
	e_lang_auto: 0,
	e_lang_catalan: 1,
	e_lang_danish: 2,
	e_lang_german: 3,
	e_lang_english: 4,
	e_lang_spanish: 5,
	e_lang_finnish: 6,
	e_lang_french: 7,
	e_lang_italian: 8,
	e_lang_dutch: 9,
	e_lang_norwegian: 10,
	e_lang_portuguese: 11,
	e_lang_polish: 12,
	e_lang_romanian: 13,
	e_lang_russian: 14,
	e_lang_slovenian: 15,
	e_lang_swedish: 16,
	e_lang_turkish: 17
}
the OCR language.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions
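
A minimal sketch of selecting the OCR language from a short language code, assuming `PDFNet` is already loaded. The `LANGUAGE_MEMBERS` table, the use of ISO 639-1 codes, and the helper name `setOcrLanguage` are hypothetical; only the `LanguageChoice` member names come from the enum listed above.

```javascript
// Hypothetical mapping from ISO 639-1 codes to LanguageChoice member names.
// Only a few entries are shown; extend it with other members as needed.
const LANGUAGE_MEMBERS = new Map([
  ['de', 'e_lang_german'],
  ['en', 'e_lang_english'],
  ['es', 'e_lang_spanish'],
  ['fr', 'e_lang_french'],
]);

// Hypothetical helper: sets the OCR language on an existing options object.
function setOcrLanguage(PDFNet, options, isoCode) {
  const choices = PDFNet.Convert.OutputOptionsOCR.LanguageChoice;
  const member = LANGUAGE_MEMBERS.get(isoCode);
  // Fall back to automatic detection for codes the table does not cover.
  const language = member ? choices[member] : choices.e_lang_auto;
  return options.setLanguage(language);
}
```

Falling back to `e_lang_auto` matches the documented default, so an unrecognized code behaves as if `setLanguage` had not been called.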

setMatchPDFLineBreaks(match)

Specifies whether PDF line breaks should come out as line breaks in the Word output. This causes each line of text to become a separate paragraph. Default is false. Deprecated. Line breaks are now detected automatically.
Parameters:
Name Type Description
match boolean if true, line breaks will come out as line breaks in the Word output.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setPages(page_from, page_to)

Specifies a range of pages to be converted. By default all pages are converted. The first page has the page number of 1.
Parameters:
Name Type Description
page_from number the first page to be converted.
page_to number the last page to be converted (inclusive). Use a negative value to specify the last page in the PDF.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions
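
A minimal sketch of converting from a given page through the end of the document, assuming `PDFNet` is already loaded. The helper name `optionsFromPage` and its input check are hypothetical additions; the negative `page_to` value follows the parameter description above.

```javascript
// Hypothetical helper: options that convert from `firstPage` to the last page.
function optionsFromPage(PDFNet, firstPage) {
  // Page numbers start at 1, so reject anything smaller or non-integer.
  if (!Number.isInteger(firstPage) || firstPage < 1) {
    throw new RangeError('firstPage must be an integer >= 1');
  }
  // A negative page_to stands for the last page in the PDF.
  return new PDFNet.Convert.WordOutputOptions().setPages(firstPage, -1);
}
```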

setPDFPassword(password)

Specifies the password if the PDF requires one.
Parameters:
Name Type Description
password string the PDF password, if required; an empty string otherwise.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setPrioritizeVisualAppearance(replica)

Specifies whether to prefer an exact visual replica of the PDF at the expense of preventing reflow of document paragraphs. Default is false.
Parameters:
Name Type Description
replica boolean False is preferred for most documents that contain paragraphs. Consider using true for documents that don't flow, such as CAD drawings or Illustrator-generated files.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions
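
A minimal sketch of choosing between reflow and an exact visual replica, assuming `PDFNet` is already loaded and that the caller knows whether the document contains flowing paragraphs. The helper name `optionsForLayout` and the `hasFlowingText` flag are hypothetical.

```javascript
// Hypothetical helper: picks the replica setting from a caller-supplied flag.
function optionsForLayout(PDFNet, hasFlowingText) {
  const options = new PDFNet.Convert.WordOutputOptions();
  // true  = exact visual replica, paragraphs will not reflow
  // false = reflowable paragraphs (the documented default)
  return options.setPrioritizeVisualAppearance(!hasFlowingText);
}
```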

setSearchableImageSetting(setting)

Specifies how scanned image pages should be converted. Default is e_ocr_text.
Parameters:
Name Type Description
setting number
PDFNet.Convert.WordOutputOptions.SearchableImageSetting = {
	e_ocr_text: 2,
	e_ocr_off: 3
}
the searchable image setting.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions
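
A minimal sketch of switching OCR on or off for scanned pages, assuming `PDFNet` is already loaded. The helper name `setOcrEnabled` is hypothetical; it uses only the two non-deprecated `SearchableImageSetting` members.

```javascript
// Hypothetical helper: toggles OCR on an existing options object.
function setOcrEnabled(PDFNet, options, enabled) {
  const settings = PDFNet.Convert.WordOutputOptions.SearchableImageSetting;
  // e_ocr_text: recognized text replaces the image pixels underneath.
  // e_ocr_off:  no OCR is performed.
  return options.setSearchableImageSetting(
    enabled ? settings.e_ocr_text : settings.e_ocr_off
  );
}
```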

setShrinkCharacterSpacingToPreventWrap(shrink)

Specifies whether to shrink character spaces in order to prevent word wraps. Default is true. Deprecated. Character spacing is now detected automatically.
Parameters:
Name Type Description
shrink boolean if true, character spaces are shrunk in order to prevent word wraps.
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions

setWordOutputFormat(format)

Specifies the output document format (DOCX, RTF, TXT). It is most useful when the output file extension is not .docx, .rtf or .txt. Note: The DOC file format is now deprecated; DOCX is used automatically instead.
Parameters:
Name Type Description
format number
PDFNet.Convert.WordOutputOptions.WordOutputFormat = {
	e_wof_docx: 0,
	e_wof_rtf: 2,
	e_wof_txt: 3
}
the output document format (DOCX, RTF, TXT).
Returns:
this object, for call chaining
Type
PDFNet.Convert.WordOutputOptions
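
A minimal sketch of selecting the output format explicitly, for example when the output file name carries a non-standard extension. It assumes `PDFNet` is already loaded; the helper name `optionsForFormat` and the string names it accepts are hypothetical.

```javascript
// Hypothetical helper: maps a format name to the WordOutputFormat enum.
function optionsForFormat(PDFNet, formatName) {
  const formats = PDFNet.Convert.WordOutputOptions.WordOutputFormat;
  const byName = new Map([
    ['docx', formats.e_wof_docx],
    ['rtf', formats.e_wof_rtf],
    ['txt', formats.e_wof_txt],
  ]);
  const key = String(formatName).toLowerCase();
  // Reject anything outside the three non-deprecated formats.
  if (!byName.has(key)) {
    throw new RangeError('Unsupported output format: ' + formatName);
  }
  return new PDFNet.Convert.WordOutputOptions().setWordOutputFormat(byName.get(key));
}
```

The deprecated `e_wof_doc` member is deliberately left out of the table, so a request for `'doc'` fails loudly instead of silently producing DOCX.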