pdftron::PDF::Word Class Reference

#include <TextExtractor.h>

Public Member Functions

int GetNumGlyphs ()
Rect GetBBox ()
void GetBBox (double out_bbox[4])
std::vector< double > GetQuad ()
void GetQuad (double out_quad[8])
std::vector< double > GetGlyphQuad (int glyph_idx)
void GetGlyphQuad (int glyph_idx, double out_quad[8])
Style GetCharStyle (int char_idx)
Style GetStyle ()
int GetStringLen ()
const Unicode * GetString ()
Word GetNextWord ()
int GetCurrentNum ()
bool IsValid ()
bool operator== (const Word &) const
bool operator!= (const Word &) const
 Word ()

Detailed Description

A TextExtractor::Word object represents a word on a PDF page. Each word contains a sequence of characters in one or more styles (see TextExtractor::Style).

Definition at line 400 of file TextExtractor.h.

Constructor & Destructor Documentation

pdftron::PDF::Word::Word ( )

Member Function Documentation

Rect pdftron::PDF::Word::GetBBox ( )
out_bbox: The bounding box for this word (in unrotated page coordinates).
Note: To account for the effect of the page '/Rotate' attribute, transform all points using page.GetDefaultMatrix().
void pdftron::PDF::Word::GetBBox ( double  out_bbox[4])
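The note about '/Rotate' can be illustrated with a minimal, library-independent sketch of applying a 2D affine matrix to a bounding box. Affine2D and TransformBBox are hypothetical names introduced here and are not part of the SDK; the sketch also assumes the four values are ordered x1, y1, x2, y2 and that the matrix follows the usual PDF convention x' = a*x + c*y + h, y' = b*x + d*y + v. In real code the matrix would come from page.GetDefaultMatrix().

```cpp
#include <algorithm>
#include <array>

// Hypothetical stand-in for a 2D affine matrix (not the SDK type).
// Maps (x, y) to (a*x + c*y + h, b*x + d*y + v).
struct Affine2D {
    double a, b, c, d, h, v;
};

// Transforms all four corners of bbox = {x1, y1, x2, y2} (assumed layout)
// and returns the axis-aligned box that encloses the transformed corners.
std::array<double, 4> TransformBBox(const Affine2D& m,
                                    const std::array<double, 4>& bbox) {
    const double xs[2] = {bbox[0], bbox[2]};
    const double ys[2] = {bbox[1], bbox[3]};
    std::array<double, 4> out = {{0.0, 0.0, 0.0, 0.0}};
    bool first = true;
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            const double x = m.a * xs[i] + m.c * ys[j] + m.h;
            const double y = m.b * xs[i] + m.d * ys[j] + m.v;
            if (first) {
                out[0] = out[2] = x;
                out[1] = out[3] = y;
                first = false;
            } else {
                out[0] = std::min(out[0], x);
                out[1] = std::min(out[1], y);
                out[2] = std::max(out[2], x);
                out[3] = std::max(out[3], y);
            }
        }
    }
    return out;
}
```

All four corners are transformed, rather than only two opposite ones, because a rotation can change which corners are the extremes.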
Style pdftron::PDF::Word::GetCharStyle ( int  char_idx)
char_idx: The index of a character in this word.
Returns the style associated with the given character.
int pdftron::PDF::Word::GetCurrentNum ( )
Returns the index of this word on the current line. A word that starts the line returns 0, whereas the last word in the line returns (line.GetNumWords()-1).
std::vector<double> pdftron::PDF::Word::GetGlyphQuad ( int  glyph_idx)
glyph_idx: The index of a glyph in this word.
out_quad: The quadrilateral representing a tight bounding box for a given glyph in the word (in unrotated page coordinates).
void pdftron::PDF::Word::GetGlyphQuad ( int  glyph_idx, double  out_quad[8] )
Word pdftron::PDF::Word::GetNextWord ( )
Returns the next word on the current line.
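GetNextWord(), IsValid() and GetCurrentNum() suggest a simple traversal pattern, sketched below against a mock type so that it compiles without the SDK. MockWord and CollectWordIndices are hypothetical names, not part of the library, and the sketch assumes that calling GetNextWord() on the last word of a line yields a word for which IsValid() is false.

```cpp
#include <vector>

// Hypothetical stand-in for a word on a line; NOT the SDK class.
struct MockWord {
    int index;  // position of this word on its line
    int count;  // total number of words on the line

    bool IsValid() const { return index >= 0 && index < count; }
    MockWord GetNextWord() const {
        MockWord next = {index + 1, count};
        return next;
    }
    int GetCurrentNum() const { return index; }
};

// Walks from `word` to the end of the line and records each word's index.
// Works with any type exposing IsValid(), GetNextWord() and GetCurrentNum().
template <typename WordT>
std::vector<int> CollectWordIndices(WordT word) {
    std::vector<int> indices;
    for (; word.IsValid(); word = word.GetNextWord()) {
        indices.push_back(word.GetCurrentNum());
    }
    return indices;
}
```

Because the helper is a template, the same loop could be applied to a real Word object, provided the assumption about the end-of-line behaviour holds.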
int pdftron::PDF::Word::GetNumGlyphs ( )
Returns the number of glyphs in this word.
std::vector<double> pdftron::PDF::Word::GetQuad ( )
out_quad: The quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).
void pdftron::PDF::Word::GetQuad ( double  out_quad[8])
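A quadrilateral is returned as eight doubles. As a rough sketch, assuming the values are four interleaved (x, y) points, the helper below reduces such a quad to an axis-aligned box; QuadBounds is a hypothetical name, not an SDK function. The result does not depend on the order in which the four corners are stored, so the same helper could also be applied to the output of GetGlyphQuad().

```cpp
#include <algorithm>
#include <array>

// Returns {min_x, min_y, max_x, max_y} for a quad stored as
// {x1, y1, x2, y2, x3, y3, x4, y4} (assumed layout).
std::array<double, 4> QuadBounds(const double quad[8]) {
    double min_x = quad[0], max_x = quad[0];
    double min_y = quad[1], max_y = quad[1];
    for (int i = 1; i < 4; ++i) {
        min_x = std::min(min_x, quad[2 * i]);
        max_x = std::max(max_x, quad[2 * i]);
        min_y = std::min(min_y, quad[2 * i + 1]);
        max_y = std::max(max_y, quad[2 * i + 1]);
    }
    std::array<double, 4> bounds = {{min_x, min_y, max_x, max_y}};
    return bounds;
}
```

For rotated or skewed text the quad is tighter than this box, so the reduction is only appropriate when an axis-aligned rectangle is good enough.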
const Unicode* pdftron::PDF::Word::GetString ( )
Returns the content of this word represented as a Unicode string.
int pdftron::PDF::Word::GetStringLen ( )
Returns the number of characters in this word.
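GetString() returns a pointer and GetStringLen() a length, so the two are meant to be used together. The sketch below converts such a buffer to a UTF-8 std::string. It assumes that each element is a 16-bit UTF-16 code unit, which this page does not state; Utf16Unit and Utf16ToUtf8 are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Assumption: the buffer holds 16-bit UTF-16 code units.
typedef unsigned short Utf16Unit;

// Converts `len` UTF-16 code units starting at `text` to UTF-8.
// Unpaired surrogates are passed through as 3-byte sequences, not rejected.
std::string Utf16ToUtf8(const Utf16Unit* text, int len) {
    assert(text != nullptr || len == 0);
    std::string out;
    for (int i = 0; i < len; ++i) {
        unsigned long cp = text[i];
        // Combine a high surrogate with the low surrogate that follows it.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len) {
            const unsigned long lo = text[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With a real Word object, and if the assumption about the element type holds, the call would take the form Utf16ToUtf8(word.GetString(), word.GetStringLen()).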
Style pdftron::PDF::Word::GetStyle ( )
Returns the predominant style for this word.
bool pdftron::PDF::Word::IsValid ( )
Returns true if this is a valid word, false otherwise.
bool pdftron::PDF::Word::operator!= ( const Word & ) const
bool pdftron::PDF::Word::operator== ( const Word & ) const

The documentation for this class was generated from the following file: