Class: TextExtractorWord

PDFNet.TextExtractorWord


new TextExtractorWord( [line] [, word] [, uni] [, num] [, cur_num] [, mp_bld])

A TextExtractor::Word object represents a word on a PDF page. Each word contains a sequence of characters in one or more styles (see TextExtractor::Style).
Parameters:
Name Type Argument Description
line number <optional>
word number <optional>
uni number <optional>
num number <optional>
cur_num number <optional>
mp_bld <optional>
Properties:
Name Type Description
line number
word number
uni number
num number
cur_num number
mp_bld

Methods


<static> create()

Constructor
Returns:
A promise that resolves to an object of type: "PDFNet.TextExtractorWord"
Type
Promise.<PDFNet.TextExtractorWord>

compare(word)

Comparison function. Determines whether the given word is equal to this word.
Parameters:
Name Type Description
word PDFNet.TextExtractorWord
Returns:
A promise that resolves to true if the two objects are equivalent, false otherwise.
Type
Promise.<boolean>
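As a rough illustration of compare(), the sketch below searches an array of words for the first one equal to a target. The helper name indexOfWord and the idea of keeping words in a plain array are assumptions made for this example and are not part of PDFNet; only compare() comes from this page.

```javascript
// Hypothetical helper (not a PDFNet API): returns the index of the first
// word in `words` that compare() reports as equal to `target`, or -1.
async function indexOfWord(words, target) {
  for (let i = 0; i < words.length; i++) {
    // compare() resolves to a boolean, so the result must be awaited.
    if (await words[i].compare(target)) {
      return i;
    }
  }
  return -1;
}
```

Because every method on this class returns a promise, the comparison has to be awaited inside the loop rather than passed directly to Array.prototype.findIndex.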

getBBox()

Returns:
A promise that resolves to the bounding box for this word (in unrotated page coordinates).
Type
Promise.<PDFNet.Rect>
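A minimal sketch of reading a word's size from getBBox(). It assumes the resolved PDFNet.Rect exposes numeric x1, y1, x2 and y2 properties; that shape is not stated on this page, so treat it as an assumption. The helper name wordSize is hypothetical.

```javascript
// Hypothetical helper (not a PDFNet API): width and height of a word's
// bounding box, in unrotated page coordinates.
// Assumption: the Rect returned by getBBox() has x1, y1, x2, y2 fields.
async function wordSize(word) {
  const rect = await word.getBBox();
  return {
    // Math.abs guards against a rectangle whose corners are not normalized.
    width: Math.abs(rect.x2 - rect.x1),
    height: Math.abs(rect.y2 - rect.y1),
  };
}
```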

getCharStyle(char_idx)

Parameters:
Name Type Description
char_idx number The index of a character in this word.
Returns:
A promise that resolves to the style associated with a given character.
Type
Promise.<PDFNet.TextExtractorStyle>
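The following sketch gathers the style of every character in a word by pairing getCharStyle() with getStringLen(). It assumes character indices are zero-based and run up to getStringLen() - 1, which this page does not state explicitly. The helper name collectCharStyles is hypothetical.

```javascript
// Hypothetical helper (not a PDFNet API): returns one style object per
// character in the word.
// Assumption: valid char_idx values are 0 .. getStringLen() - 1.
async function collectCharStyles(word) {
  const length = await word.getStringLen();
  const styles = [];
  for (let i = 0; i < length; i++) {
    styles.push(await word.getCharStyle(i));
  }
  return styles;
}
```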

getCurrentNum()

Returns:
A promise that resolves to the index of this word in the current line. The first word in the line returns 0, and the last word in the line returns (line.getNumWords() - 1).
Type
Promise.<number>

getNextWord()

Returns:
A promise that resolves to the next word on the current line.
Type
Promise.<PDFNet.TextExtractorWord>

getNumGlyphs()

Returns:
A promise that resolves to the number of glyphs in this word.
Type
Promise.<number>

getQuad()

Returns:
A promise that resolves to the quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).
Type
Promise.<PDFNet.QuadPoint>

getString()

Returns:
A promise that resolves to the content of this word represented as a string.
Type
Promise.<string>

getStringLen()

Returns:
A promise that resolves to the number of characters in this word.
Type
Promise.<number>

getStyle()

Returns:
A promise that resolves to the predominant style for this word.
Type
Promise.<PDFNet.TextExtractorStyle>

isValid()

Returns:
A promise that resolves to true if this is a valid word, false otherwise.
Type
Promise.<boolean>
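A sketch of how getNextWord(), isValid() and getString() might be combined to walk the words on a line. It assumes that getNextWord() called on the last word of a line resolves to a word for which isValid() is false, and that the caller already holds the first word of the line (for example from a line object, which is outside the scope of this page). The helper name collectLineWords is hypothetical.

```javascript
// Hypothetical helper (not a PDFNet API): starting from `firstWord`,
// follows getNextWord() until an invalid word is reached and returns
// the text of every valid word visited.
// Assumption: the word after the last one on a line reports isValid() === false.
async function collectLineWords(firstWord) {
  const texts = [];
  let word = firstWord;
  while (await word.isValid()) {
    texts.push(await word.getString());
    word = await word.getNextWord();
  }
  return texts;
}
```

Each call is awaited in sequence because the next word cannot be requested until the current one has been resolved.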