public class HTML2PDF
extends Object
java.lang.Object
   ↳ com.pdftron.pdf.HTML2PDF

Class Overview

'com.pdftron.pdf.HTML2PDF' is an optional PDFNet Add-On utility class that converts HTML web pages into PDF documents by using an external module (html2pdf). The html2pdf modules can be downloaded from http://www.pdftron.com/pdfnet/downloads.html.

Users can convert HTML pages to PDF using the following operations:
  • Simple one-line static method to convert a single web page to PDF.
  • Convert HTML pages from a URL or string, plus an optional table of contents, in user-defined order.
  • Optionally configure settings for proxy, images, JavaScript, and more for each HTML page.
  • Optionally configure the PDF output, including page size, margins, orientation, and more.
  • Optionally add a table of contents, including setting the depth and appearance.

The following code converts a single web page to PDF:

    import com.pdftron.pdf.*;
    import com.pdftron.sdf.*;

    PDFDoc pdfdoc = new PDFDoc();
    // convert returns a boolean, so test it directly.
    if (HTML2PDF.convert(pdfdoc, "http://www.gutenberg.org/wiki/Main_Page"))
        pdfdoc.save(outputFile, SDFDoc.e_remove_unused, null);

The following code demonstrates how to convert multiple web pages into one PDF, excluding the background, and with lowered image quality to save space:

    import com.pdftron.pdf.*;
    import com.pdftron.sdf.*;

    HTML2PDF converter = new HTML2PDF();
    converter.setImageQuality(25);

    HTML2PDF.WebPageSettings settings = new HTML2PDF.WebPageSettings();
    settings.setPrintBackground(false);

    // Call insertFromURL once for each page to be included.
    converter.insertFromURL("http://www.gutenberg.org/wiki/Main_Page", settings);

    PDFDoc pdfdoc = new PDFDoc();
    if (converter.convert(pdfdoc))
        pdfdoc.save(outputFile, SDFDoc.e_remove_unused, null);

Summary

Nested Classes
class HTML2PDF.Proxy Proxy settings to be used when loading content from web pages. 
class HTML2PDF.TOCSettings Settings for table of contents. 
class HTML2PDF.WebPageSettings Settings that control how a web page is opened and converted to PDF. 
Public Constructors
HTML2PDF()
Default constructor.
Public Methods
static boolean convert(Doc doc, String url, HTML2PDF.WebPageSettings settings)
Convert the HTML document at url and append the results to doc.
boolean convert(Doc doc)
Convert HTML documents and append the results to doc.
static boolean convert(Doc doc, String url)
Convert the HTML document at url and append the results to doc.
void destroy()
Frees the native memory of the object.
void dumpOutline(String xml_file)
Save outline to an XML file.
int getHTTPErrorCode()
Return the largest HTTP error code encountered during conversion.

Note: This function will only return a useful result after convert has been called.

String getLog()
Get results of conversion, including errors and warnings, in human readable form.
void insertFromHtmlString(String html)
Convert HTML encoded in a string.
void insertFromHtmlString(String html, HTML2PDF.WebPageSettings settings)
Convert HTML encoded in a string.
void insertFromURL(String url)
Add a web page to be converted.
void insertFromURL(String url, HTML2PDF.WebPageSettings settings)
Add a web page to be converted.
void insertTOC(HTML2PDF.TOCSettings settings)
Add a table of contents to the produced PDF.
void insertTOC()
Add a table of contents to the produced PDF.
void setCookieJar(String path)
Path of file used for loading and storing cookies.
void setDPI(int dpi)
Change the DPI explicitly for the output PDF.
void setImageDPI(int dpi)
Maximum DPI to use for images in the generated PDF.
void setImageQuality(int quality)
JPEG compression factor to use when generating PDF.
void setLandscape(boolean enable)
Set page orientation for output PDF.
void setMargins(String top, String bottom, String left, String right)
Set margins of generated PDF.
static void setModulePath(String path)
Set the first location that PDFNet will look for the html2pdf module.
void setOutline(boolean enable, int depth)
Add bookmarks to the PDF.
void setPDFCompression(boolean enable)
Use lossless compression to create PDF.
void setPaperSize(int size_type)
Set paper size of output PDF.
void setPaperSize(String width, String height)
Manually set the paper dimensions of the produced PDF.
void setQuiet(boolean quiet)
Control whether HTML to PDF conversion progress, warnings, and errors are written to stdout.
Inherited Methods
From class java.lang.Object

Public Constructors

public HTML2PDF ()

Default constructor.

Public Methods

public static boolean convert (Doc doc, String url, HTML2PDF.WebPageSettings settings)

Convert the HTML document at url and append the results to doc. The html2pdf module must be located in the working directory or alongside the PDFNetC library.

Note: If you wish to convert more than one web page, or to set up callback handlers, you need to use an instance of HTML2PDF.

Parameters
doc - Target PDF to which the converted HTML pages will be appended.
url - HTML page, or relative path to local HTML page, that will be converted to PDF format.
settings - Modify how the web page is loaded and converted.
Returns
  • true if successful, otherwise false. Use getHTTPErrorCode to check for possible HTTP errors.

public boolean convert (Doc doc)

Convert HTML documents and append the results to doc. The html2pdf module must be located in the working directory or alongside the PDFNetC library.

Note: Use insertFromURL and insertFromHtmlString to add HTML documents to be converted.

Parameters
doc - Target PDF to which the converted HTML pages will be appended.
Returns
  • true if successful, otherwise false. Use getHTTPErrorCode to check for possible HTTP errors.
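
The check-then-diagnose pattern suggested by the return value can be sketched as follows. Because the PDFNet classes are not available in a standalone snippet, `Converter` is a hypothetical stand-in interface that mirrors only the `convert`, `getHTTPErrorCode`, and `getLog` methods documented on this page (with `Object` in place of `Doc`). `ConvertCheck`, `describeResult`, and the message format are illustrative assumptions, not part of the library.

```java
public class ConvertCheck {
    /** Hypothetical stand-in mirroring three documented HTML2PDF methods. */
    public interface Converter {
        boolean convert(Object doc);   // the documented method takes a Doc
        int getHTTPErrorCode();
        String getLog();
    }

    /**
     * Runs the conversion and returns "ok" on success, or a short diagnostic
     * built from the HTTP error code and the conversion log on failure.
     */
    public static String describeResult(Converter converter, Object doc) {
        if (converter.convert(doc)) {
            return "ok";
        }
        // The error code is only meaningful once convert has been called.
        int code = converter.getHTTPErrorCode();
        String http = (code == 0) ? "no HTTP error recorded" : "HTTP " + code;
        return "conversion failed (" + http + "): " + converter.getLog();
    }
}
```

With the real library, the same three calls would presumably be made directly on an HTML2PDF instance after the pages have been added with insertFromURL or insertFromHtmlString.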

public static boolean convert (Doc doc, String url)

Convert the HTML document at url and append the results to doc. The html2pdf module must be located in the working directory or alongside the PDFNetC library.

Note: If you wish to convert more than one web page, or to set up callback handlers, you need to use an instance of HTML2PDF.

Parameters
doc - Target PDF to which the converted HTML pages will be appended.
url - HTML page, or relative path to local HTML page, that will be converted to PDF format.
Returns
  • true if successful, otherwise false. Use getHTTPErrorCode to check for possible HTTP errors.

public void destroy ()

Frees the native memory of the object. This can be called explicitly to control the deallocation of native memory and avoid situations where the garbage collector does not free the object in a timely manner.
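
One common way to make sure an explicit destroy call is never skipped is a try/finally block. The sketch below is not library code: `NativeBacked` is a hypothetical stand-in for any object exposing `destroy()`, and `DestroyPattern` and `useAndDestroy` are illustrative names.

```java
public class DestroyPattern {
    /** Hypothetical stand-in for an object that owns native memory. */
    public interface NativeBacked {
        void destroy();
    }

    /** Runs work and always calls destroy() afterwards, even if work throws. */
    public static void useAndDestroy(NativeBacked resource, Runnable work) {
        try {
            work.run();
        } finally {
            // Release native memory deterministically instead of waiting
            // for the garbage collector.
            resource.destroy();
        }
    }
}
```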

public void dumpOutline (String xml_file)

Save outline to an XML file.

Parameters
xml_file - Path where the XML data representing the outline of the produced PDF should be saved.

public int getHTTPErrorCode ()

Return the largest HTTP error code encountered during conversion.

Note: This function will only return a useful result after convert has been called.

Returns
  • the largest HTTP code greater than or equal to 300 encountered while loading any of the supplied objects, or 0 if no such code was encountered.
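
To make the return rule concrete, here is a minimal sketch of the same selection logic applied to a plain list of status codes. It only illustrates the documented rule and is not the module's implementation; `HttpErrorSummary` and `largestErrorCode` are hypothetical names.

```java
public class HttpErrorSummary {
    /**
     * Returns the largest status code that is >= 300,
     * or 0 if no code in the input qualifies.
     */
    public static int largestErrorCode(int... statusCodes) {
        int largest = 0;
        for (int code : statusCodes) {
            if (code >= 300 && code > largest) {
                largest = code;
            }
        }
        return largest;
    }
}
```

Under this rule, loading pages that answered 200, 404, and 301 would report 404, while a run where every page answered 200 would report 0.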

public String getLog ()

Get results of conversion, including errors and warnings, in human readable form.

Returns
  • String containing results of conversion.

public void insertFromHtmlString (String html)

Convert HTML encoded in a string.

Parameters
html - String containing HTML code.

public void insertFromHtmlString (String html, HTML2PDF.WebPageSettings settings)

Convert HTML encoded in a string.

Parameters
html - String containing HTML code.
settings - How the HTML content described in html is loaded.

public void insertFromURL (String url)

Add a web page to be converted. A single URL typically results in many PDF pages.

Parameters
url - HTML page, or relative path to local HTML page.

public void insertFromURL (String url, HTML2PDF.WebPageSettings settings)

Add a web page to be converted. A single URL typically results in many PDF pages.

Parameters
url - HTML page, or relative path to local HTML page.
settings - How the web page should be loaded and converted.

public void insertTOC (HTML2PDF.TOCSettings settings)

Add a table of contents to the produced PDF.

Parameters
settings - Settings for the table of contents.

public void insertTOC ()

Add a table of contents to the produced PDF.

public void setCookieJar (String path)

Path of file used for loading and storing cookies.

Parameters
path - Path to file used for loading and storing cookies.

public void setDPI (int dpi)

Change the DPI explicitly for the output PDF.

Note: This has no effect on X11 based systems.

Note: Results also depend on setSmartShrinking.

Parameters
dpi - Dots per inch, e.g. 80.

public void setImageDPI (int dpi)

Maximum DPI to use for images in the generated PDF.

Parameters
dpi - Maximum dpi of images in produced PDF, e.g. 80.

public void setImageQuality (int quality)

JPEG compression factor to use when generating PDF.

Parameters
quality - Compression factor, e.g. 92.

public void setLandscape (boolean enable)

Set page orientation for output PDF.

Parameters
enable - If true, generated PDF pages will be oriented in landscape; otherwise orientation will be portrait.

public void setMargins (String top, String bottom, String left, String right)

Set margins of generated PDF.

Note: Supported units are mm, cm, m, in, pica(pc), pixel(px) and point(pt).

Parameters
top - Size of the top margin, e.g. "2cm".
bottom - Size of the bottom margin, e.g. "2cm".
left - Size of the left margin, e.g. "2cm".
right - Size of the right margin, e.g. "2cm".
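
Since margins are passed as strings, it can help to check them before calling setMargins. The sketch below validates a length against the unit list in the note above. Whether the module accepts decimal values or whitespace is not stated here, so the accepted syntax (digits, optional decimal part, unit suffix, no spaces) is an assumption, and `LengthSpec`, `isValid`, and `unitOf` are hypothetical helper names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LengthSpec {
    // A number followed by one of the documented unit suffixes.
    // Assumption: decimals are allowed and no whitespace is permitted.
    private static final Pattern LENGTH =
            Pattern.compile("(\\d+(?:\\.\\d+)?)(mm|cm|m|in|pc|px|pt)");

    /** Returns true if spec looks like a length such as "2cm" or "12in". */
    public static boolean isValid(String spec) {
        return spec != null && LENGTH.matcher(spec).matches();
    }

    /** Returns the unit suffix of a valid length, e.g. "cm" for "2cm". */
    public static String unitOf(String spec) {
        if (spec == null) {
            throw new IllegalArgumentException("Length must not be null");
        }
        Matcher m = LENGTH.matcher(spec);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported length: " + spec);
        }
        return m.group(2);
    }
}
```

A caller could run all four margin strings through `isValid` and fail early with a clear message rather than discovering a bad unit during conversion.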

public static void setModulePath (String path)

Set the first location that PDFNet will look for the html2pdf module.

Parameters
path - A folder or file path. If non-empty, PDFNet will only look in path for the html2pdf module, otherwise it will search in the default locations for the module.

public void setOutline (boolean enable, int depth)

Add bookmarks to the PDF.

Parameters
enable - If true, bookmarks will be generated for the produced PDF.
depth - Maximum depth of the outline (e.g. 4).

public void setPDFCompression (boolean enable)

Use lossless compression to create PDF.

Parameters
enable - If true, lossless compression will be used to create PDF.

public void setPaperSize (int size_type)

Set paper size of output PDF.

Parameters
size_type - Paper size to use for produced PDF. See pdftron.PDF.PrinterMode for available types.

public void setPaperSize (String width, String height)

Manually set the paper dimensions of the produced PDF.

Note: Supported units are mm, cm, m, in, pica(pc), pixel(px) and point(pt).

Parameters
width - Width of the page, e.g. "4cm".
height - Height of the page, e.g. "12in".

public void setQuiet (boolean quiet)

Control whether HTML to PDF conversion progress, warnings, and errors are written to stdout.

Note: You can get the final results using getLog.

Parameters
quiet - If false, progress information is sent to stdout during conversion.