Class: TextExtractorWord

PDFNet.TextExtractorWord


new TextExtractorWord()

A TextExtractor::Word object represents a word on a PDF page. Each word contains a sequence of characters in one or more styles (see TextExtractor::Style).

Methods


<static> create()

Constructor

Returns:

A promise that resolves to an object of type: "PDFNet.TextExtractorWord"

Type
PDFNet.TextExtractorWord

getCharStyle(char_idx)

Parameters:
Name Type Description
char_idx number The index of a character in this word.

Returns:

A promise that resolves to the style associated with a given character.

Type
PDFNet.TextExtractorStyle
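
The following is a minimal sketch of how getCharStyle(char_idx) might be combined with getStringLen() to gather the style of each character in a word. The helper name collectCharStyles is hypothetical and not part of PDFNet, and the sketch assumes that valid indices run from 0 to getStringLen() - 1, which this page does not state explicitly.

```javascript
// Hypothetical helper (not part of PDFNet): gather the style of every
// character in a word. `word` is assumed to be a PDFNet.TextExtractorWord,
// or any object exposing the same promise-returning methods.
async function collectCharStyles(word) {
  const styles = [];
  // Assumption: valid character indices are 0 .. getStringLen() - 1.
  const length = await word.getStringLen();
  for (let i = 0; i < length; i++) {
    // Each call resolves to the style of the character at index i.
    styles.push(await word.getCharStyle(i));
  }
  return styles;
}
```

Because every method returns a promise, each call is awaited in turn; the resulting array holds one style object per character, in order.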

getCurrentNum()

Returns:

A promise that resolves to the index of this word in the current line. The word that starts the line returns 0, and the last word in the line returns (line.GetNumWords()-1).

Type
number

getNextWord()

Returns:

A promise that resolves to the next word on the current line.

Type
PDFNet.TextExtractorWord

getNumGlyphs()

Returns:

A promise that resolves to the number of glyphs in this word.

Type
number

getStringLen()

Returns:

A promise that resolves to the number of characters in this word.

Type
number

getStyle()

Returns:

A promise that resolves to the predominant style for this word.

Type
PDFNet.TextExtractorStyle

isValid()

Returns:

A promise that resolves to true if this is a valid word, false otherwise.

Type
boolean
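
As an illustration of how getNextWord() and isValid() can work together, the sketch below walks from a given word to the end of its line. The helper name forEachWordFrom is hypothetical and not part of PDFNet. The sketch assumes that getNextWord() on the last word of a line resolves to a word for which isValid() resolves to false; this page does not state that behavior, so treat it as an assumption to verify.

```javascript
// Hypothetical helper (not part of PDFNet): visit a word and every word
// that follows it on the same line.
async function forEachWordFrom(firstWord, visit) {
  // Assumption: past the end of the line, getNextWord() resolves to a
  // word whose isValid() resolves to false, which ends the loop.
  for (
    let word = firstWord;
    await word.isValid();
    word = await word.getNextWord()
  ) {
    await visit(word);
  }
}
```

A caller might pass a callback that awaits word.getCurrentNum() and word.getStringLen() to record the position and length of each word as the line is traversed.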