pdftron::PDF::Word Class Reference

#include <TextExtractor.h>

Public Member Functions

int GetNumGlyphs ()
 
Rect GetBBox ()
 
void GetBBox (double out_bbox[4])
 
std::vector< double > GetQuad ()
 
void GetQuad (double out_quad[8])
 
std::vector< double > GetGlyphQuad (int glyph_idx)
 
void GetGlyphQuad (int glyph_idx, double out_quad[8])
 
Style GetCharStyle (int char_idx)
 
Style GetStyle ()
 
int GetStringLen ()
 
const Unicode * GetString ()
 
Word GetNextWord ()
 
int GetCurrentNum ()
 
bool IsValid ()
 
bool operator== (const Word &) const
 
bool operator!= (const Word &) const
 
 Word ()
 

Detailed Description

A TextExtractor::Word object represents a word on a PDF page. Each word contains a sequence of characters in one or more styles (see TextExtractor::Style).

Definition at line 400 of file TextExtractor.h.

Constructor & Destructor Documentation

pdftron::PDF::Word::Word ( )

Member Function Documentation

Rect pdftron::PDF::Word::GetBBox ( )
Parameters
out_bbox  The bounding box for this word (in unrotated page coordinates).
Note
To account for the effect of the page's '/Rotate' attribute, transform all points using page.GetDefaultMatrix().
void pdftron::PDF::Word::GetBBox ( double  out_bbox[4])
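The note on GetBBox means the values written to out_bbox are not yet in the coordinates of the rotated page view. The following is a minimal sketch of the transformation step, not the library's own implementation. It assumes out_bbox is laid out as {x1, y1, x2, y2}, and it uses a hypothetical stand-in struct, Affine, in place of the matrix returned by page.GetDefaultMatrix(), because the real matrix type and its members are not shown in this reference.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a 2D affine matrix (NOT the library's type):
//   x' = a*x + c*y + h
//   y' = b*x + d*y + v
struct Affine {
    double a, b, c, d, h, v;
};

// Transforms all four corners of the box and writes back the axis-aligned
// box that encloses the transformed corners.
// Assumption: bbox is laid out as {x1, y1, x2, y2}.
void TransformBBox(const Affine& m, double bbox[4]) {
    assert(bbox != nullptr);
    const double xs[2] = {bbox[0], bbox[2]};
    const double ys[2] = {bbox[1], bbox[3]};

    // Seed the running extents with the first transformed corner.
    double min_x = m.a * xs[0] + m.c * ys[0] + m.h;
    double min_y = m.b * xs[0] + m.d * ys[0] + m.v;
    double max_x = min_x;
    double max_y = min_y;

    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            const double tx = m.a * xs[i] + m.c * ys[j] + m.h;
            const double ty = m.b * xs[i] + m.d * ys[j] + m.v;
            min_x = std::min(min_x, tx);
            min_y = std::min(min_y, ty);
            max_x = std::max(max_x, tx);
            max_y = std::max(max_y, ty);
        }
    }

    bbox[0] = min_x;
    bbox[1] = min_y;
    bbox[2] = max_x;
    bbox[3] = max_y;
}
```

All four corners are transformed, rather than only the two stored ones, because a rotation can swap which corners are the extremes of the box.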
Style pdftron::PDF::Word::GetCharStyle ( int  char_idx)
Parameters
char_idx  The index of a character in this word.
Returns
The style associated with a given character.
int pdftron::PDF::Word::GetCurrentNum ( )
Returns
The index of this word in the current line. The first word of a line returns 0, and the last word returns (line.GetNumWords()-1).
std::vector<double> pdftron::PDF::Word::GetGlyphQuad ( int  glyph_idx)
Parameters
glyph_idx  The index of a glyph in this word.
out_quad  The quadrilateral representing a tight bounding box for a given glyph in the word (in unrotated page coordinates).
void pdftron::PDF::Word::GetGlyphQuad ( int  glyph_idx,
double  out_quad[8] 
)
Word pdftron::PDF::Word::GetNextWord ( )
Returns
The next word on the current line.
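GetNextWord(), IsValid() and GetCurrentNum() together suggest a simple traversal loop over the words of a line. The sketch below illustrates that loop only. MockWord is a hypothetical stand-in that mirrors the three documented signatures so the example compiles without the library, and it assumes that calling GetNextWord() on the last word of a line yields a word for which IsValid() is false.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Word (NOT the library class). It models a word
// as a position within a vector of strings that plays the role of a line.
class MockWord {
public:
    MockWord() : line_(nullptr), idx_(0) {}  // default-constructed: invalid
    MockWord(const std::vector<std::string>* line, std::size_t idx)
        : line_(line), idx_(idx) {}

    bool IsValid() const { return line_ != nullptr && idx_ < line_->size(); }

    int GetCurrentNum() const {
        assert(IsValid());
        return static_cast<int>(idx_);
    }

    // Assumption: stepping past the last word returns an invalid word.
    MockWord GetNextWord() const {
        return IsValid() ? MockWord(line_, idx_ + 1) : MockWord();
    }

private:
    const std::vector<std::string>* line_;
    std::size_t idx_;
};

// Walks from `word` to the end of its line and records each word's index.
// Works with any type that offers IsValid(), GetNextWord(), GetCurrentNum().
template <typename W>
std::vector<int> CollectWordIndices(W word) {
    std::vector<int> indices;
    for (; word.IsValid(); word = word.GetNextWord()) {
        indices.push_back(word.GetCurrentNum());
    }
    return indices;
}
```

Because CollectWordIndices is a template, the same loop body could in principle be used with the real Word type, provided the assumption about the end-of-line behaviour holds.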
int pdftron::PDF::Word::GetNumGlyphs ( )
Returns
The number of glyphs in this word.
std::vector<double> pdftron::PDF::Word::GetQuad ( )
Parameters
out_quad  The quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).
void pdftron::PDF::Word::GetQuad ( double  out_quad[8])
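A quadrilateral can describe rotated or skewed text more tightly than a rectangle, but some callers only need an axis-aligned box. The following sketch shows one way to reduce the eight values of out_quad to such a box. It assumes the array holds four points as {x1, y1, x2, y2, x3, y3, x4, y4}; QuadToBBox is a hypothetical helper, not part of the library.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Returns {min_x, min_y, max_x, max_y} for the four points of a quad.
// Assumption: quad is laid out as {x1, y1, x2, y2, x3, y3, x4, y4}.
std::array<double, 4> QuadToBBox(const double quad[8]) {
    assert(quad != nullptr);
    double min_x = quad[0], max_x = quad[0];
    double min_y = quad[1], max_y = quad[1];

    // Fold the remaining three points into the running extents.
    for (int i = 1; i < 4; ++i) {
        min_x = std::min(min_x, quad[2 * i]);
        max_x = std::max(max_x, quad[2 * i]);
        min_y = std::min(min_y, quad[2 * i + 1]);
        max_y = std::max(max_y, quad[2 * i + 1]);
    }

    std::array<double, 4> bbox = {{min_x, min_y, max_x, max_y}};
    return bbox;
}
```

The helper makes no assumption about the order in which the corners are listed, so it works for rotated text as well.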
const Unicode* pdftron::PDF::Word::GetString ( )
Returns
The content of this word, represented as a Unicode string.
int pdftron::PDF::Word::GetStringLen ( )
Returns
The number of characters in this word.
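GetString() returns a bare pointer and the length is reported separately by GetStringLen(), so it is safest to treat the pair as a (pointer, length) buffer rather than rely on a terminator. The sketch below converts such a buffer to UTF-8. It assumes that Unicode is a 16-bit code unit type holding UTF-16, which this reference does not state; Utf16ToUtf8 is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Converts `len` UTF-16 code units starting at `s` to a UTF-8 std::string.
// Unpaired surrogates are replaced with U+FFFD.
// Assumption: the library's Unicode type is a 16-bit UTF-16 code unit.
std::string Utf16ToUtf8(const std::uint16_t* s, int len) {
    assert(len >= 0);
    assert(s != nullptr || len == 0);
    std::string out;
    for (int i = 0; i < len; ++i) {
        std::uint32_t cp = s[i];
        const bool is_high = cp >= 0xD800 && cp <= 0xDBFF;
        if (is_high && i + 1 < len && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With the real class, the two arguments would come from word.GetString() and word.GetStringLen(), with a cast if the library's Unicode type is not std::uint16_t.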
Style pdftron::PDF::Word::GetStyle ( )
Returns
The predominant style for this word.
bool pdftron::PDF::Word::IsValid ( )
Returns
True if this is a valid word, false otherwise.
bool pdftron::PDF::Word::operator!= ( const Word & ) const
bool pdftron::PDF::Word::operator== ( const Word & ) const

The documentation for this class was generated from the following file: