PDFNet SDK

FAQ

General Questions

+What PDF functionality is provided in PDFNet?

PDFNet can be used to access a wide range of PDF functionality. The most commonly used include:

PDF Creation. Using PDFNet, PDF documents can be created dynamically and delivered on the fly (e.g. on web and database servers). The library can also be integrated with client-based applications in order to provide enhanced PDF output that can’t be produced using generic PDF writers (such as PDF print drivers or PostScript converters).

Editing. PDFNet provides a comprehensive API that can be used to edit all aspects of a PDF document. Using PDFNet it is easy to:

  • Append and assemble PDF documents
  • Merge specific PDF pages from multiple documents
  • Delete or rearrange pages
  • Edit page contents
  • Add/remove/edit images, text, and vector graphics.
  • Edit fonts, color spaces, line styles, and other attributes.
  • Edit document metadata
  • Crop and rotate pages
  • Edit bookmarks and page annotations.
  • Edit security settings
  • Edit every aspect of a document using powerful SDF API.

Content extraction. PDFNet can be used to extract text, images, fonts, ICC color profiles, embedded files, etc. Complete content extraction features make PDFNet a solid foundation for PDF viewers, editors, document converters, and software RIPs (Raster Image Processors).

PDF rasterization, viewing, and printing. Using PDFNet, client applications can take advantage of interactive PDF display, while server-based applications can generate images and thumbnails on the fly.

Fonts. PDFNet provides a unified, easy-to-use API that can be used to embed, extract, and process all font formats supported by PDF (i.e. Type1, TrueType, Type3, Multiple Master, CFF, CIDType 0, and CID Type1).

Forms. PDFNet can be used to read, write, and edit PDF forms.

Prepress workflows. PDFNet supports prepress workflows by providing solid infrastructure and utilities for color conversion, separation, preflighting, and imposition operations.

Optimization and linearization

Compression

Security and encryption. Set new security handlers and edit or remove existing security.

For a more comprehensive feature listing, please take a look at the PDFNet Feature Chart.

+Does PDFNet require any third-party components?

All PDFTron software applications and components are stand-alone products. Therefore, there are no dependencies on third-party components or software.

+Can I use PDFNet in a server environment?

Yes. PDFNet is an ideal solution for integrating PDF capabilities into server applications. Using PDFNet, your applications can dynamically generate, manipulate, and print PDF documents in server environments.

+Does PDFNet support multi-threaded applications?

PDFNet is fully safe for multithreading and can be used in applications that run multiple concurrent PDFNet threads.

Support for multithreading is document based. You can spawn a separate thread for each document; however, only one thread can operate on a given document at a time.
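
One way to honor the one-thread-per-document rule is to serialize access per document. The following is a minimal Java sketch under stated assumptions: DocumentLocks, withDocument, and the use of a file path as the document key are hypothetical names introduced here for illustration and are not part of the PDFNet API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper (not part of PDFNet): one lock per document key. */
final class DocumentLocks {
    // Maps a document key (assumed here to be a file path) to its lock.
    // Locks are never evicted in this sketch; a long-running server
    // would need a cleanup policy.
    private static final ConcurrentHashMap<String, ReentrantLock> LOCKS =
        new ConcurrentHashMap<>();

    /** Runs 'work' while holding the lock that belongs to 'docKey'. */
    static <T> T withDocument(String docKey, Supplier<T> work) {
        ReentrantLock lock = LOCKS.computeIfAbsent(docKey, k -> new ReentrantLock());
        lock.lock();
        try {
            return work.get();
        } finally {
            lock.unlock();
        }
    }
}
```

With this arrangement, threads working on different documents proceed in parallel, while two threads that name the same document key take turns.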

+Does PDFNet work well on hyper-threaded and multi-processor machines?

PDFNet works on hyper-threaded and multi-processor machines. The only difference is that if you plan to run PDFNet on a multi-processor machine you need to purchase a separate license for each physical CPU.

+Is PDFNet available on Linux?

Yes. PDFNet SDK Java/C/C++ library is available on Linux, Mac OSX, Solaris, and Windows. On Windows PDFNet is also available as a .Net component.

+Is PDFNet available for Java?

PDFNet SDK is available for Java on all supported platforms. As a starting point you may want to browse the online Javadocs or the Java sample projects.

+What platforms does PDFNet support?

PDFNet is available on Windows (.NET /Java/C/C++), Mac OSX (Java/C/C++), Solaris (Java/C/C++), and Linux (Java/C/C++). PDFNet SDK is also available on Windows Mobile and other embedded systems.

+Does PDFNet support linearization (fast web view)?

Using PDFNet you can save any existing or newly created PDF document in linearized (fast web view) format.

In order to provide good performance over relatively slow communication links, PDFNet can generate PDF documents with linearized objects and hint tables that can allow a PDF viewer application to download and view one page of a PDF file at a time, rather than requiring the entire file (including fonts and images) to be downloaded before any of it can be viewed.

The only thing required to save a document in linearized (fast web view) format is to pass 'Doc.SaveOptions.e_linearized' flag in the save method.

+Does PDFNet support PDF/X or PDF/E?

Standards such as PDF/X and PDF/E define a subset of the PDF specification designed for a specific industry (e.g. publishing, engineering, etc). PDFNet allows processing and generation of any PDF document, and it does not prevent the user from creating a valid document that includes features that are not listed in one of PDF subsets. To make your application PDF/X (or PDF/E) compliant, you need to make sure that you are using only PDF features allowed by a given subset.

+Is PDFNet for .NET a true .Net component or a COM wrapper?

PDFNet is not a single SDK, but a family of SDKs that are available on different platforms and programming environments. PDFNet for .NET is a true .NET component written in managed C++ that can be used from any .NET language such as C#, VB.NET, and managed C++. PDFNet for .NET is not a wrapper around another COM component. Because PDFNet for .NET is written in managed C++, applications can take advantage of significant performance gains.

+Can I use PDFNet to generate PDF output from my application?

If you have layout and data for output pages, you can output PDF files using PDFNet. As a starting point, you may want to take a look at the ElementBuilder sample project.

+How much does PDFNet cost?

See PDFNet licensing page for details.

+Is there developer support available?

PDFNet is fully supported by PDFTron Systems and we have a dedicated support team. In addition, we have a regular maintenance release cycle. We ship new maintenance releases to licensees every four to six weeks. We also have major feature releases that are synchronized with major revisions of the PDF specification. As with license fees, support and maintenance fees are also based on how much functionality you use.

+How does PDFNet compare to PDF PageMaster/PDF Secure?

PDF PageMaster and PDF Secure are implemented using PDFNet. PDFNet has a more complex API that can be used to read, write, and edit every aspect of a PDF document.

+Do you have an API Reference for .NET version?

The API for the C++ and .NET versions is identical, so the same reference is used for both. Because the API is identical, it is very easy to port PDFNet code between managed and unmanaged languages. If you are a developer with a Java background, you may also want to refer to the Javadoc version of the API Reference.

API Questions

Common Problems / Errors / Exceptions

+Exception: PDFNet is not initialized; Unknown Exception;

Make sure that you initialized PDFNet using PDFNet.Initialize(). PDFNet.Initialize()/Terminate() should be called only once per process session. You should not call PDFNet.Initialize()/Terminate() for each PDFNet thread.

+Exception (.Net & W2K): GDIPLUS.dll can't be found; Unknown Exception;

Although GDIPLUS.dll is a standard part of .NET framework, on some W2K systems it is not located in DLL search path. You can use Dependency Walker (www.dependencywalker.com) to check whether PDFNET.DLL can locate GDIPLUS.dll on your machine. If there are any missing dependencies, they will be highlighted in red. To solve the problem you can add the folder containing GDIPLUS.dll to the 'path' environment variable, copy GDIPLUS.dll to 'Windows/System32' folder, etc.

+Exception: A parsing error or a Filter exception encountered while reading or editing any part of the document.

The most likely cause of an error that occurs at the start of content extraction is an uninitialized security handler. To initialize the security handler, call doc.InitSecurityHandler() just after opening a document. This method has no side effects on documents that are not encrypted, so you can adopt the convention of always invoking doc.InitSecurityHandler() after constructing a document. You can use the doc.IsEncrypted() method if you would like to verify that the document is encrypted. If the error occurs even after the security handler is initialized, please contact support for further assistance.

+Net Error: "This application cannot run using the active version of the Microsoft .NET Runtime"

This error may occur if you run PDFNet for Microsoft.Net 1.1x or higher on a machine running Microsoft.Net version 1.0x. If this is the case you need to download PDFNet for Microsoft.Net 1.0x or you need to upgrade the virtual machine.

In Microsoft.NET version 1.0x there were some critical bugs that were fixed in versions 1.1x and higher. We highly recommend upgrading to versions 1.1x or higher, but in cases where this is not an option, use PDFNet for Microsoft.NET versions 1.0x.

+C++ Error: unexpected token or syntax error in headers

The most likely cause of an unexpected token or syntax error in the headers is a macro, declared in a header file included before the PDFNet headers, whose name collides with a PDFNet identifier.

For example, windows.h declares a macro called GetMessage, which conflicts with the PDFNet Exception::GetMessage() method. To avoid this error, either #include the affected PDFNet headers before the header(s) introducing the offending macro(s), or use an #undef directive before the PDFNet headers.

PDF

+How do I get document's Title/Keywords/Authors property?

The simplest way to access document's metadata (e.g. author, title, keywords, etc.) is using PDFDocInfo class. For example,

PDFDocInfo info = mydoc.GetDocInfo();
String title = info.GetTitle();
info.SetTitle("My Title");
etc...

Alternatively you can access document's metadata using SDF/Cos API:

PDFDoc doc = new PDFDoc(...);
doc.InitSecurityHandler();
Obj trailer = doc.GetTrailer(); // Get the trailer
Obj info = trailer.FindObj("Info");
if (info != null) {
  // Get 'Title'/'Author'/'Keywords'/'Subject'... 
  // entry, if available
  Obj title_obj = info.FindObj("Title");
  if (title_obj != null) 
  { // Note: In some documents these strings are encoded
    // using PDF text encoding.
    String title = title_obj.GetString();
    title_obj.SetString("My Title...");
  }
  else {
    info.PutString("Title", "My Title...");
  }
}

+How do I add a text-link or a hyperlink to a pdf page?

In the PDF format, hyperlinks are represented as a special type of annotation. You can create a new link annotation as follows (assuming C#):

// Create a 'goto' link to page #3 in the same document.
Action link_action = Action.CreateGoto(
  Destination.CreateFitH(doc.GetPage(3), 0));
Annot link = Annot.CreateLink(doc.GetSDFDoc(), 
  new Rect(85, 458, 503, 502), link_action);
page.AnnotPushBack(link);

The first parameter is the document where the link should be created, the second parameter is the link region (in the PDF user coordinate system), and the last parameter is the link action. The last line adds the annotation to the given page.

The above code shows how to create an 'intra-document' link. To create
a hyperlink that can open a URL in the default web browser you could use the following code snippet:

// Create a hyperlink... 
Annot hyperlink = Annot.CreateLink(doc,
  new Rect(85, 570, 503, 524),
  Action.CreateURI(doc, "http://www.pdftron.com"));
page.AnnotPushBack(hyperlink);

For information on how to add a text link that includes text, images, or vector art, please search the PDFNet Knowledge Base (http://groups.google.com/group/pdfnet-sdk/) using "How do I add a text-link" as a keyword.

+How do I get a page number for a given page?

To get the page number from a given page, use Page.GetIndex() method.

+How do I get the paper size for a given page?

You can use Page.GetMediaBox() to obtain the dimensions of the media box for the page. For example,

Rect bbox = page.GetMediaBox();
bbox.Normalize();

// The width and height of the page in page units
double width = bbox.Width();
double height = bbox.Height();

One page unit is 1/72 of an inch. For a 'letter' size page (8.5 x 11 inches) the dimensions will be:
width = 612 units = 612 * 1/72 = 8.5 inches
height = 792 units = 792 * 1/72 = 11 inches
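
The unit arithmetic above can be sketched in plain Java, independent of PDFNet. The class and method names (PageUnits, toInches, fromInches) are illustrative assumptions and not part of the PDFNet API.

```java
/** Illustrative helper, not part of PDFNet: converts PDF page units to inches and back. */
final class PageUnits {
    /** One page unit is 1/72 of an inch. */
    static final double UNITS_PER_INCH = 72.0;

    /** Converts a length in page units (e.g. a media box width) to inches. */
    static double toInches(double units) {
        return units / UNITS_PER_INCH;
    }

    /** Converts a length in inches to page units. */
    static double fromInches(double inches) {
        return inches * UNITS_PER_INCH;
    }
}
```

For a letter-size page, toInches(612) gives the 8.5 inch width and toInches(792) gives the 11 inch height.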

+How do I create a new page that is the same size as an existing page?

You can create a new page that is the same size as an existing page as follows:

Rect media_box = existing_page.GetMediaBox();
Page new_page = doc.PageCreate(media_box);

A slightly harder but more powerful technique to accomplish the same task is using Cos/SDF API:

Page new_page = doc.PageCreate();
Obj pg = existing_page.GetSDFObj();
new_page.GetSDFObj().Put("MediaBox", pg.Get("MediaBox").Value());

This example can be trivially extended so you can copy arbitrary entries from an existing page (i.e. you can copy over the crop/bleed/art box, page resources, annotations, etc.)

+Why does GetFontSize() always return 1.0?

GetFontSize() returns the correct value. You can use CosEdit to check this; somewhere in the page content stream you will find a '/Fn 1 Tf' operator. If you want to get the font size as it appears on the page, you need to scale GetFontSize() by the text matrix (Element.GetTextMatrix()) as well as the current transformation matrix (CTM):

// mtx is the CTM concatenated with the text matrix (ctm * text_mtx)
double scale_factor = Math.sqrt(mtx.m_b*mtx.m_b + mtx.m_d*mtx.m_d);
double page_font_sz = gs.GetFontSize() * scale_factor;

For a complete example on how to get font size in the user space please refer to ElementReaderAdv test project in Samples folder.
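
To make the scaling step concrete, here is a self-contained Java sketch that does not use PDFNet. It assumes a matrix is stored as six numbers in the order (a, b, c, d, h, v), mirroring the m_a ... m_v naming above, and that a point maps as x' = a*x + c*y + h, y' = b*x + d*y + v. EffectiveFontSize, concat, and onPage are hypothetical names; the scale factor is the same sqrt(b*b + d*d) formula quoted above.

```java
/** Hypothetical stand-in for the matrix arithmetic; not PDFNet's Matrix2D. */
final class EffectiveFontSize {
    /**
     * Returns outer x inner: the matrix that applies 'inner' first and
     * 'outer' second. Each matrix is {a, b, c, d, h, v}.
     */
    static double[] concat(double[] outer, double[] inner) {
        return new double[] {
            outer[0] * inner[0] + outer[2] * inner[1],            // a
            outer[1] * inner[0] + outer[3] * inner[1],            // b
            outer[0] * inner[2] + outer[2] * inner[3],            // c
            outer[1] * inner[2] + outer[3] * inner[3],            // d
            outer[0] * inner[4] + outer[2] * inner[5] + outer[4], // h
            outer[1] * inner[4] + outer[3] * inner[5] + outer[5]  // v
        };
    }

    /** Scales the nominal font size by the combined CTM and text matrix. */
    static double onPage(double fontSize, double[] ctm, double[] textMatrix) {
        double[] m = concat(ctm, textMatrix);
        double scale = Math.sqrt(m[1] * m[1] + m[3] * m[3]);
        return fontSize * scale;
    }
}
```

For example, a nominal font size of 1 drawn with a text matrix that scales by 5 under a CTM that scales by 2 appears on the page at size 10.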

+How do I extract words from text-runs?

Text runs (e.g. elements of type e_text) represent a stream of text, but text runs do not directly correspond to words. For example, you may have a single word that consists of letters in various fonts and styles. In this case each letter would correspond to a separate text run. You may also encounter text runs that contain multiple words separated by spaces.

The most straightforward approach to extract words from text-runs is using pdftron.PDF.TextExtractor class (as shown in TextExtract sample project).

In case TextExtractor does not meet all of your requirements you can also implement your own word recognizer using the low-level text APIs.

+How do I extract content in the reading order?

Q: While extracting content from a PDF document, the sequence represents the painting order for content and not the order as it is seen on the screen. How do I extract text in the reading order, and not in the sequence given in PDF?

A: Unfortunately, most PDF documents do not include enough logical structure to extract the reading order. As a result, it is usually necessary to reconstruct the reading order based on the content positioning on the page. To obtain the positioning information for every graphical element on the page, you could use element.GetBBox(rect) method. Using this information it is possible to build a structure that can be used to extract the content in a specific reading order. Also starting with version 4, PDFNet SDK includes high-level APIs (TextExtractor, SElement, STree, etc) that can be used to automatically reconstruct logical structure for any PDF document.

+How do I get absolute/relative text and character positioning?

Relative text positioning coordinates can be accessed using CharIterator.

Absolute text positioning is a function of Element.GetCTM(), Element.GetTextMatrix(), and the relative character positioning information (i.e. char_itr.Current().x, char_itr.Current().y).

The simplest approach to obtain the bounding box (in the absolute, or PDF user, coordinate system) for a given text run is using the element.GetBBox(rect) method.

To obtain absolute text positioning information for each character in the text run, you need to concatenate the current transformation matrix (Element.GetCTM()) with the current text matrix (Element.GetTextMatrix()). To get the absolute character position, multiply the resulting matrix by the relative character position (char_itr.Current().x, char_itr.Current().y) from CharIterator.

Please refer to section '5.3.3 Text Space Details' in the PDF Reference
Manual for more details on how text coordinates are transformed into
PDF user space.

// CTM (current transformation matrix).
Matrix2D ctm = element.GetCTM();
Matrix2D text_mtx = element.GetTextMatrix();
double x, y;
int char_code;

CharIterator end = element.CharEnd();
for (CharIterator itr = element.CharBegin(); itr.HasNext(); itr.Next())
{
  x = itr.Current().x; // relative character positioning information
  y = itr.Current().y;

  // To get the absolute character coordinate, concatenate the
  // current transformation matrix (CTM) with the current text matrix
  // and then multiply by the relative character positioning coordinate.
  // Matrix2D mtx = ctm * text_mtx;
  // mtx.Mult(x, y); // (x, y) is now the absolute coordinate.
}

For a complete example on how to get text and character positioning information please refer to ElementReaderAdv test project in Samples folder.
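
The arithmetic behind the commented-out lines can be checked with a small stand-alone Java sketch. It does not use PDFNet: matrices are plain arrays in the assumed order (a, b, c, d, h, v), a point maps as x' = a*x + c*y + h, y' = b*x + d*y + v, and CharPosition, apply, and absolute are hypothetical names. Applying the text matrix first and the CTM second is equivalent to multiplying the point by the concatenated matrix.

```java
/** Hypothetical stand-in for the matrix arithmetic; not PDFNet's Matrix2D. */
final class CharPosition {
    /** Transforms the point (x, y) by the matrix m = {a, b, c, d, h, v}. */
    static double[] apply(double[] m, double x, double y) {
        return new double[] {
            m[0] * x + m[2] * y + m[4],
            m[1] * x + m[3] * y + m[5]
        };
    }

    /**
     * Maps a relative character position into the PDF user coordinate system:
     * first through the text matrix, then through the CTM.
     */
    static double[] absolute(double[] ctm, double[] textMatrix,
                             double relX, double relY) {
        double[] inTextSpace = apply(textMatrix, relX, relY);
        return apply(ctm, inTextSpace[0], inTextSpace[1]);
    }
}
```

With a text matrix of (5, 0, 0, 5, 0, 600) and a CTM that translates by (10, 20), the relative position (2, 0) lands at (20, 620).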

+How do I remove embedded fonts?

Given a font object you can remove embedded font streams as follows:

// Using C#:
Obj fd = myfont.GetDescriptor();
if (fd == null) return; // If null, the font is not embedded
fd.Erase("FontFile");
fd.Erase("FontFile2");
fd.Erase("FontFile3");
...
doc.Save(..., Doc.SaveOptions.e_linearized);

To find all fonts in the document, you can either traverse all page resources (i.e. 'Font' entry in the page resource dictionary), or iterate over all document objects. For example:

... Init PDFNet ...
PDFDoc doc = new PDFDoc("in.pdf");
doc.InitSecurityHandler();
SDFDoc cos_doc = doc.GetSDFDoc();
int num_objs = cos_doc.XRefSize();
for (int i=1; i<num_objs; ++i) {
  Obj obj = cos_doc.GetObj(i);
  if (obj!=null && !obj.IsFree() && obj.IsDict())
  { // Process only Fonts
    DictIterator itr = obj.Find("Type");
    if (itr.HasNext() == false ||
        itr.Value().GetName() != "Font")
      continue;
    itr = obj.Find("FontDescriptor");
    if (itr.HasNext() == false) continue;
    if (!itr.Value().IsDict()) continue;
    Obj fd = itr.Value();
    fd.Erase("FontFile");
    fd.Erase("FontFile2");
    fd.Erase("FontFile3");
  }
}
doc.Save(...);
doc.Close();

+How do I find if a given font is bold or italic?

Given an element with 'e_text' type, you can obtain its high-level font object as follows:

Font font = element.GetGState().GetFont();

To check if the font is italic, you could use font.IsItalic().

You can also obtain all other properties from font descriptor dictionary (see section 5.7 'Font Descriptors' in PDF Reference Manual).

For example (using C#),

Obj fd = font.GetDescriptor();
if (fd != null) {
  double italic_angle = 0, weight = 400;
  Obj obj = fd.FindObj("ItalicAngle");
  if (obj != null) {
    italic_angle = obj.GetNumber();
  }

  obj = fd.FindObj("FontWeight");
  if (obj != null) {
    // A value of 400 indicates a normal weight; 700 indicates bold.
    weight = obj.GetNumber();
  }
}

+How do I create 'PDF Searchable Images'?

You can use PDFNet to generate 'PDF Searchable Images'. PDF Searchable Images are created using invisible text drawn on top of scanned images. To make invisible text that can be highlighted or searched, you need to set the text rendering mode in the graphics state of the text element (i.e. Element.GetGState().SetTextRenderMode(GState.TextRenderingMode.e_invisible_text)).

+How do I stamp a page?

Using PDFNet you can place watermarks or append new content (such as text, logos, or images) using ElementWriter and ElementBuilder, as illustrated in the following snippet:

PDFNet.Initialize();
try 
{
  PDFDoc doc = new PDFDoc("my.pdf");
  doc.InitSecurityHandler();

  ElementBuilder eb = new ElementBuilder(); 
  ElementWriter writer = new ElementWriter(); 

  // Get the first page
  Page page = doc.GetPage(1);

  // Begin writing to the page
  writer.Begin(page); 

  // Begin writing a block of text
  Element element = eb.CreateTextBegin(
     Font.Create(doc, 
       Font.StandardType1Font.e_times_roman), 12);
  writer.WriteElement(element);

  string txt = "Hello World!";
  element = eb.CreateTextRun(txt);
  // Scale-up text 5 times and shift it by (0,600)
  element.SetTextMatrix(5, 0, 0, 5, 0, 600);
  writer.WriteElement(element);

  // Set the spacing between lines
  element.GetGState().SetLeading(15);
  writer.WriteElement(eb.CreateTextNewLine());

  // Draw the same text string; this time stroked.
  element = eb.CreateTextRun(txt);
  GState gstate = element.GetGState(); 
  gstate.SetTextRenderMode(
    GState.TextRenderingMode.e_stroke_text);
  gstate.SetCharSpacing(-1.25);
  gstate.SetWordSpacing(-1.25);
  writer.WriteElement(element);

  // Finish the block of text
  writer.WriteElement(eb.CreateTextEnd());
  writer.End(); 

  doc.Save("out.pdf", 0);
  doc.Close();
}
catch (PDFNetException e) {
  Console.WriteLine(e.Message);
}

The following code snippet illustrates how to stamp all pages in the document with a "Hello World!" string.

PDFDoc doc = new PDFDoc("in.pdf");
doc.InitSecurityHandler();

ElementBuilder eb = new ElementBuilder();
ElementWriter writer = new ElementWriter();

PageIterator itr=doc.GetPageIterator();
for (; itr.HasNext(); itr.Next())
{
  writer.Begin(itr.Current());
  Element element = eb.CreateTextBegin(
   Font.Create(doc,     
    Font.StandardType1Font.e_times_roman),64);
  writer.WriteElement(element);
  element = eb.CreateTextRun("Hello World!");
  // Position the text run
  element.SetTextMatrix(1, 0, 0, 1, 20, 20);
  writer.WriteElement(element);
  writer.WriteElement(eb.CreateTextEnd());
  writer.End(); // Save the changes
}

doc.Save("out.pdf", 0);
doc.Close();

For a longer code example, illustrating the use of ElementBuilder and ElementWriter, please take a look at ElementBuilder sample project.

Using PDFNet it is also possible to create watermark annotations using a procedure similar to the one outlined above. You would use ElementBuilder/ElementWriter to create a new appearance stream and the Annot class to create the annotation object.

+How do I add a watermark to a page?

The procedure is the same as for stamping a page: use ElementWriter and ElementBuilder to write the watermark content, as shown under 'How do I stamp a page?' above.

+How do I embed raster images such as TIFF, PNG, GIF, or JPEG?

PDFNet allows direct embedding of various raster image formats as well as GDI+ Bitmaps. For concrete sample code please take a look at the AddImage sample project.

+How do I get the image resolution and DPI?

You can get the image resolution using Element/Image.GetImageWidth() and Element/Image.GetImageHeight() methods. If you want to calculate DPI of the image as it appears on the target medium (i.e. paper) you need to take into account the current transformation matrix (CTM). Use Element.GetCTM() method in order to get the current transformation matrix (CTM).

If the CTM does not include rotation or skew the image will be positioned at (GetCTM().m_h, GetCTM().m_v) and will be GetCTM().m_a units wide and GetCTM().m_d units high. Note that one unit in the user space is equal to 1/72 of an inch.
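
As a worked sketch of the DPI calculation for the simple case described above (no rotation or skew), the following plain Java helper divides the pixel count by the placed size in inches. ImageDpi and dpi are hypothetical names, not PDFNet calls; the pixel count would come from GetImageWidth()/GetImageHeight() and the placed extent from GetCTM().m_a or GetCTM().m_d.

```java
/** Illustrative helper, not part of PDFNet. Valid only when the CTM has no rotation or skew. */
final class ImageDpi {
    /**
     * @param pixels        image width (or height) in pixels
     * @param extentInUnits placed width (or height) in user-space units,
     *                      i.e. the m_a (or m_d) entry of the CTM
     * @return resolution in dots per inch along that axis
     */
    static double dpi(int pixels, double extentInUnits) {
        double inches = extentInUnits / 72.0; // one unit is 1/72 of an inch
        return pixels / inches;
    }
}
```

For example, an image 600 pixels wide that is placed 144 units (2 inches) wide prints at 300 DPI horizontally.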

+How do I find image rotation/position?

In PDF the image can be rotated by any degree. The image can also be stretched, skewed, etc. The transformation is specified using the Current Transformation Matrix (CTM) which can be accessed using the Element.GetCTM() method.

Use the following code snippet (pseudocode) to calculate image rotation angle (in radians):

double GetRotation(Matrix2D& mtx) {
  double x1=0, y1=0, x2=1, y2=0;
  mtx.Mult(x1, y1);
  mtx.Mult(x2, y2);
  return atan2(y2-y1, x2-x1);
}

The position of the image on the page is given by the translation component of the matrix (i.e. mtx.m_h, mtx.m_v).

+How do I place one half of an image on one PDF page and the other half on a different page?

You can place one half of the image on one PDF page and the other half of the image on a second PDF page as follows (using C# pseudo-code):

Image img = Image.Create(doc.GetSDFDoc(), data, width, 
  height, 8, ColorSpace.CreateDeviceRGB(), 
  Image.InputFilter.e_jpeg);
ElementBuilder eb = new ElementBuilder();
ElementWriter writer = new ElementWriter();

// Create page #1 -------------
Page page = doc.PageCreate();
writer.Begin(page);

// Use a clipping path in order to show only a portion of the image.
// Save the graphics state so that the clipping path does not affect
// other graphics on the page.
writer.WriteElement(eb.CreateGroupBegin());

// Create a clipping path.
eb.PathBegin();
eb.CreateRect(0, 0, 200, 100);
Element element = eb.PathEnd();

// This is a clipping path.
element.SetPathClip(true);
element.SetPathStroke(false);
element.SetPathFill(false);

// Write clip path
writer.WriteElement(element);

// Place the first half of the image behind the clip path.
element = eb.CreateImage(img, new Matrix2D(200, 0, 0, 200, 0, 0));
writer.WritePlacedElement(element);

// Restore the graphics state.
writer.WriteElement(eb.CreateGroupEnd());
writer.End();
doc.PagePushBack(page);

// Create page #2 -------------
page = doc.PageCreate();
writer.Begin(page);
writer.WriteElement(eb.CreateGroupBegin());

// Create a clipping path.
eb.PathBegin();
eb.CreateRect(0, 100, 200, 100);
element = eb.PathEnd();
element.SetPathClip(true);
element.SetPathStroke(false);
element.SetPathFill(false);
writer.WriteElement(element);

// Place the second half of the image behind the clip path.
element = eb.CreateImage(img, new Matrix2D(200, 0, 0, 200, 0, 0));
writer.WritePlacedElement(element);
writer.WriteElement(eb.CreateGroupEnd());
writer.End();
doc.PagePushBack(page);

+How do I replace an existing image in a PDF with another image?

Using PDFNet you can replace (swap) an image in an existing PDF document as follows:

  1. Find the image that should be replaced (source image). You can do this by enumerating page contents using ElementReader and looking for Elements with type e_image. Another option is to enumerate page image resources directly using SDF/Cos API (e.g. page.GetResourceDict(). Get("XObject").Value() ...).
  2. Create a replacement image using Image img = Image.Create??() methods as illustrated in AddImage sample project.
  3. Swap the two images as follows:
    SDFDoc doc = pdfdoc.GetSDFDoc();
    int img1_objnum = img1.GetSDFObj().GetObjNum();
    int img2_objnum = img2.GetSDFObj().GetObjNum();
    doc.Swap(img1_objnum, img2_objnum);

+How do I rotate/transform an Element?

The following sample code illustrates how to set a transformation matrix on an Image element:

Element* element = eb.CreateImage(Image(...));
double deg2rad = 3.1415926535 / 180.0;
// Translate
Matrix2D mtx = Matrix2D(1, 0, 0, 1, 0, 200);
// Scale
mtx *= Matrix2D(300, 0, 0, 200, 0, 0); 
// Rotate
mtx *= Matrix2D::RotationMatrix( 90 * deg2rad ); 

element->GetGState()->SetTransform(mtx);
writer.WritePlacedElement(element);

The RotationMatrix accepts an angle in radians. Please note that the order of transformations (i.e. matrix multiplications) is stack based. The same convention is used in PostScript, PDF, and OpenGL.
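
The stack-based ordering can be verified with a short stand-alone Java sketch. It does not use PDFNet: matrices are arrays in the assumed order (a, b, c, d, h, v), a point maps as x' = a*x + c*y + h, y' = b*x + d*y + v, and the sketch assumes that '*=' post-multiplies, so the matrix multiplied in last is applied to a point first. TransformOrder and its methods are hypothetical names.

```java
/** Hypothetical stand-in for the matrix arithmetic; not PDFNet's Matrix2D. */
final class TransformOrder {
    /** Returns left x right: 'right' is applied to a point first, then 'left'. */
    static double[] mul(double[] left, double[] right) {
        return new double[] {
            left[0] * right[0] + left[2] * right[1],
            left[1] * right[0] + left[3] * right[1],
            left[0] * right[2] + left[2] * right[3],
            left[1] * right[2] + left[3] * right[3],
            left[0] * right[4] + left[2] * right[5] + left[4],
            left[1] * right[4] + left[3] * right[5] + left[5]
        };
    }

    static double[] translate(double tx, double ty) {
        return new double[] {1, 0, 0, 1, tx, ty};
    }

    static double[] scale(double sx, double sy) {
        return new double[] {sx, 0, 0, sy, 0, 0};
    }

    /** Counter-clockwise rotation; the angle is in radians. */
    static double[] rotate(double radians) {
        double c = Math.cos(radians), s = Math.sin(radians);
        return new double[] {c, s, -s, c, 0, 0};
    }

    /** Transforms the point (x, y) by m. */
    static double[] apply(double[] m, double x, double y) {
        return new double[] {
            m[0] * x + m[2] * y + m[4],
            m[1] * x + m[3] * y + m[5]
        };
    }

    /** Mirrors the sample above: translate, then scale, then rotate by 90 degrees. */
    static double[] example() {
        double[] m = translate(0, 200);
        m = mul(m, scale(300, 200));
        m = mul(m, rotate(Math.toRadians(90)));
        return m;
    }
}
```

Under these assumptions the unit-square corner (1, 0) is rotated to (0, 1), then scaled to (0, 200), and finally translated to (0, 400), which is the reverse of the order in which the matrices were multiplied in.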

+Does PDFNet support extracting table and list info from PDFs?

PDFNet supports extraction of all content available in a PDF document. On the other hand, the PDF standard does not directly support abstract constructs such as paragraphs, columns, tables, etc. Because the logical structure is missing in a PDF document, the target application would need to analyze and generate logical structure based on the underlying content that is available through PDFNet.

Note that the PDF standard supports marked content and so-called 'tagged PDF'. PDFNet can be used to extract marked content and any existing logical structure. Unfortunately, many PDF files are missing tags and logical structure.

+Why is ElementReader reporting that there are no more Elements on the Page when I can see extra graphics in PDF viewer?

The most likely cause of this behavior is that the missing Elements are annotation objects. Annotations are not part of the content stream. Although it is bad practice, some PDF generators produce PDF content in the form of annotations.

With the PDFNet library it is possible to read the appearances of existing annotations in the same way as reading the page content. To process annotation appearances, first obtain the annotation array from the Page and initialize ElementReader with the annotation's appearance stream (the /AP dictionary entry). You can then extract the annotation's Elements in the same way as when reading page content.

+How do I embed JavaScript inside a PDF?

You can embed/associate JavaScript with any type of PDF annotation or with the PDF document using the SDF/Cos API. For example, the following code snippet creates an additional-actions dictionary and associates it with an existing annotation:

// Create a JavaScript 'Additional Action'
// (see section 8.4.1 'Annotation Dictionaries', 8.5 'Actions',
// and 'JavaScript Actions' on page 668 in PDF Reference Manual
// for details).

Obj js_action = doc.CreateIndirectDict();
js_action.PutName("S", "JavaScript");
js_action.PutString("JS", "alert('Hello World');");

Obj aa_dict = my_annot.GetSDFObj().PutDict("AA");
aa_dict.Put("F", js_action);

The CosEdit utility can be very useful while you work with the SDF/Cos API.

Here is another example of adding JavaScript as a document level action:

Obj root = pdfdoc.GetRoot();
Obj aa = root.PutDict("AA");

Obj ds = pdfdoc.CreateIndirectDict();
Obj ws = pdfdoc.CreateIndirectDict();
Obj dc = pdfdoc.CreateIndirectDict();

aa.Put("DS", ds); // Did Save Action
aa.Put("WS", ws); // Will Save Action
aa.Put("DC", dc); // Document Close Action

ds.PutName("S", "JavaScript");
ds.PutString("JS", "... DidSave JavaScript ....");

ws.PutName("S", "JavaScript");
ws.PutString("JS", "... OnSave JavaScript ....");

dc.PutName("S", "JavaScript");
dc.PutString("JS", "...OnClose JavaScript ....");

For lengthy JavaScript code segments you can also embed JavaScript as SDF stream objects instead of strings. For example:

StdFile embed_file = new StdFile("code.js", StdFile.OpenMode.e_read_mode);
FilterReader mystm = new FilterReader(embed_file);
Obj js_stream = doc.CreateIndirectStream(mystm);
...
js_action.Put("JS", js_stream);

+How do I insert PostScript XObjects into my PDFs?

You can embed a PostScript stream in a PDF as follows (C# sample):

// Embed a custom stream (file postscript.ps).
StdFile embed_file = new StdFile("postscript.ps",
  StdFile.OpenMode.e_read_mode); 
FilterReader mystm = new FilterReader(embed_file);
Obj ps_stm = doc.CreateIndirectStream(mystm);
ps_stm.PutName("Subtype", "PS");

// ...
// Then use ElementBuilder and ElementWriter to 
// reference the PostScript stream from a given page:
Element element = builder.CreateForm(ps_stm);
writer.WriteElement(element);

Bookmarks

+How do I insert a new top level bookmark?

To create a root bookmark in a document that doesn't have any bookmarks/outlines, use PDFDoc.AddRootBookmark(mybookmark). To insert a new root bookmark before an existing bookmark, use mybookmark.AddPrev( "Upper Sibling" ). To insert a new root bookmark after an existing bookmark, use mybookmark.AddNext( "Lower Sibling" ).

+How do I split a PDF based on bookmarks?

The following short sample illustrates how to split a document based on bookmarks: PDFBookmarkSplit.cs. You may want to use PDFBookmarkSplit as a starting point for your project or for further customization of the splitting process.

PDF Split and Merge

+What is the most efficient way to merge PDF pages?

Customers using the .NET version of PDFNet and working with large documents can dramatically increase performance by saving to a temporary file instead of to a memory buffer. The real performance bottleneck is .NET data marshaling, not PDF merging.

Merging performance can also be increased by working on the original documents directly: instead of copying all pages to a new document, you can simply append or delete pages in the source document. Note that PDFDoc.Save(...) does not alter the original document unless the filename matches the original filename.

Another optimization tip is to use PDFDoc.ImportPages() to efficiently copy a page set from one document to another. See Copying/Merging Pages for details.

+How do I reduce the file size of a merged PDF document?

Q: I am using the PDFDoc.PagePushBack() (or PageInsert) method to combine two or more PDFs into one. The problem is that the file size of the resulting PDF is too big compared to the size of the input PDF documents.

A: If you encounter this problem, please refer to the Copying/Merging Pages section in the PDFNet User Manual. The file size can be dramatically reduced by first importing the page set into the target document using PDFDoc.ImportPages() and then using PDFDoc.PagePushBack() (or PageInsert) to position the pages within the document's page sequence.

+How do I merge two (or more) PDF pages into one?

Q: Is it possible to merge two pages stored in two separate PDF files (i.e. a data file and a background file) into one file with the text overlaid on the image?

A: Using PDFNet toolkit it is very simple to merge content from several pages into one.

The first step is to import the overlay page into the background document. You can then merge page content in two ways.

A) You can read Elements from the overlay page using ElementReader and write them on the background page using ElementWriter.
B) You can create a Form XObject Element out of the overlay page using ElementBuilder and write it on the background page using ElementWriter.

Technique A is illustrated in the following pseudo-code:

PDFDoc over = new PDFDoc("overly.pdf");
PDFDoc back = new PDFDoc("background.pdf");

// Import the overlay page into the background doc
PageIterator op_itr = over.PageFind(1);
back.PagePushBack(op_itr.Current());

// Background page
Page bp = back.GetPage(1);
// Overlay page
Page op = back.GetPage(2);

// Copy Elements from the overlay page to
// the background page
ElementReader reader = new ElementReader();
reader.Begin(op);
ElementWriter writer = new ElementWriter();
writer.Begin(bp);
Element element;
while ((element = reader.Next()) != null)
  writer.WriteElement(element);

writer.End();
reader.End();

// You can now optionally remove the overlay page
// back.PageRemove(back.PageFind(2));
back.Save("merged.pdf", 0);

The above code snippet assumes that the overlay is the first page in "overly.pdf" and that "background.pdf" has a single page. The same steps extend to documents with other page layouts.

+How do I add multiple pages from existing documents to a single page in a new document?

Page imposition is a process of combining pages onto larger sheets to make books, booklets, pamphlets, etc.

Page imposition can be used to arrange/order pages prior to printing or to assemble a 'master' page from several 'source' pages. Using PDFNet API it is possible to write applications that can re-order the pages such that they will display in the correct order when the hard copy pages are compiled and folded correctly.

For an example of how multiple pages can be combined/imposed using PDFNet, please take a look at the ImpositionTest sample project.
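
The page re-ordering mentioned above can be sketched independently of PDFNet. The following is a minimal illustration of one common scheme (a 2-up, saddle-stitched booklet), which is an assumption and not necessarily what the ImpositionTest sample does. `BookletOrder` is a hypothetical helper name; it only computes page numbers and leaves the actual placement of pages on sheets to the PDF library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of PDFNet. Returns 1-based source page numbers
// in the order they are placed on 2-up booklet sheets: front-left, front-right,
// back-left, back-right for each sheet. A value of 0 marks a blank slot.
std::vector<int> BookletOrder(int page_count) {
    assert(page_count >= 0);
    // Each folded sheet holds 4 pages, so round up to a multiple of 4.
    const int padded = (page_count + 3) / 4 * 4;
    auto slot = [page_count](int page) { return page <= page_count ? page : 0; };

    std::vector<int> order;
    for (int lo = 1, hi = padded; lo < hi; lo += 2, hi -= 2) {
        order.push_back(slot(hi));      // front, left
        order.push_back(slot(lo));      // front, right
        order.push_back(slot(lo + 1));  // back, left
        order.push_back(slot(hi - 1));  // back, right
    }
    return order;
}
```

For an 8-page document the helper yields 8, 1, 2, 7, 6, 3, 4, 5, so the outer sheet carries pages 8 and 1 on the front and 2 and 7 on the back. Each group of four numbers then corresponds to one physical sheet to assemble.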

+How do I impose/combine several pages using PDFNet?

Yes, definitely! At PDFTron, we are committed to providing you with all the support and dedicated technical resources you need, even during the evaluation stage of your project, so that you can be 100% sure that PDFTron can provide you with the best solution for the job before you make any financial commitment. This is also one of the reasons why we do not offer refunds of our software.

Forms

+Why are PDF form fields blank (or unchanged) in Acrobat after I populate the form using PDFNet forms API?

In PDF, a field's value is separate from its annotation (i.e. how the field appears on the page). After you modify a field's value, you need to refresh the field's appearance as follows:

field.SetValue("My value");

// Regenerate appearance stream.
field.RefreshAppearance();

Alternatively, you can delete the "AP" entry from the Widget annotation and set the "NeedAppearances" flag in the AcroForm dictionary:

doc.GetAcroForm()
   .PutBool("NeedAppearances", true);

This forces the viewer application to auto-generate new field appearances every time the document is opened.

Yet another option is to generate a custom annotation appearance using ElementBuilder and ElementWriter and then set the "AP" entry in the widget dictionary to the new appearance stream. This functionality is useful in applications that need very advanced control over 'look and feel' of the document.

+Does PDFNet support form flattening?

Form 'flattening' refers to the operation that changes active form fields into a static area that is part of the PDF document, just like the other text and images in the document. A completely flattened PDF form does not have any widget annotations or interactive fields.

Using Field.Flatten() or Page.FlattenField() method it is possible to merge individual field appearances with the page content. PDFNet also allows you to flatten all forms in the document in a single function call (PDFDoc.FlattenFields()).

Note that it is not possible to undo a Field.Flatten() operation. An alternative that can be reversed programmatically is to mark the field as read-only using the Field.SetFlag(Field::e_read_only, true) method.

+How do I remove all JavaScript from a document/form?

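The following code snippet removes the actions (including JavaScript actions) attached to every form field. Document-level scripts, such as those stored in the catalog's "AA" dictionary, are not affected and must be removed separately: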

FieldIterator itr = doc.GetFieldIterator();
for( ; itr.HasNext(); itr.Next()) {
  Obj dict = itr.Current().GetSDFObj();
  dict.Erase("A");
  dict.Erase("AA");
}

Printing

+How do I print a document?

It is possible to use PDFNet printing functionality in both client and server applications.

For an example of client integration, please take a look at the PDFView sample project (PrintPage method).

If you are interested in server-side printing, please take a look at the PDFPrint sample project. The PDFPrint sample does not require any user intervention and can automatically print on the default printer.

SDF

+What do "SDF" and "COS" stand for?

SDF (Structured Document Format) and COS (Carousel Object System; Carousel was a codename for Acrobat 1.0) are synonyms for the PDF low-level object model. SDF is the acronym used in PDFNet, whereas COS is used in the Acrobat SDK.

In many ways, SDF is to PDF what XML is to SVG (Scalable Vector Graphics). The Cos object system provides the low-level object types and file structure used in PDF files. PDF documents are graphs of Cos objects. Cos objects can represent document components such as bookmarks, pages, fonts, and annotations.

PDF is not the only document format built on top of SDF/Cos. FDF (Form Data Format) and PJTF (Portable Job Ticket Format) are also built on top of Cos.

The SDF/Cos layer deals directly with the data that is in a PDF (or Cos based) file. The data types are referred to as SDF/Cos Objects. There are eight data types found in PDF files. They are arrays, dictionaries, numbers, Boolean values, names, strings, streams, and a null object. In order to retrieve or modify PDF (or other Cos based) content, you need to understand these objects. You can create new objects and delete or modify existing objects.

For a detailed description of the Cos layer, refer to Chapter 3 (Syntax) of the PDF Reference Manual.
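
As a rough mental model of the eight data types listed above, the following self-contained sketch represents a Cos object as a tagged value and builds a small JavaScript action dictionary out of it. Everything here (`CosKind`, `CosValue`, `MakeName`, `MakeString`, `MakeDict`, `Put`, `Find`, `MakeJavaScriptAction`) is hypothetical and only for illustration. It is not PDFNet's Obj class and it ignores indirect references and file structure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The eight SDF/Cos data types named in the text.
enum class CosKind { Null, Boolean, Number, Name, String, Array, Dictionary, Stream };

// Hypothetical tagged value; which fields are meaningful depends on 'kind'.
struct CosValue {
    CosKind kind = CosKind::Null;
    bool boolean = false;            // Boolean payload
    double number = 0.0;             // Number payload
    std::string text;                // Name or String payload, or raw Stream data
    std::vector<std::string> keys;   // Dictionary keys (a Stream also has a dictionary)
    std::vector<CosValue> items;     // Array elements, or values parallel to 'keys'
};

CosValue MakeName(const std::string& name) {
    CosValue v;
    v.kind = CosKind::Name;
    v.text = name;
    return v;
}

CosValue MakeString(const std::string& s) {
    CosValue v;
    v.kind = CosKind::String;
    v.text = s;
    return v;
}

CosValue MakeDict() {
    CosValue v;
    v.kind = CosKind::Dictionary;
    return v;
}

// Adds the entry, or replaces the value if the key is already present.
void Put(CosValue& dict, const std::string& key, const CosValue& value) {
    assert(dict.kind == CosKind::Dictionary);
    for (std::size_t i = 0; i < dict.keys.size(); ++i) {
        if (dict.keys[i] == key) {
            dict.items[i] = value;
            return;
        }
    }
    dict.keys.push_back(key);
    dict.items.push_back(value);
}

// Returns the value stored under 'key', or nullptr if there is none.
const CosValue* Find(const CosValue& dict, const std::string& key) {
    for (std::size_t i = 0; i < dict.keys.size(); ++i) {
        if (dict.keys[i] == key) return &dict.items[i];
    }
    return nullptr;
}

// A JavaScript action is a dictionary with a name ("S") and a string ("JS").
CosValue MakeJavaScriptAction(const std::string& script) {
    CosValue action = MakeDict();
    Put(action, "S", MakeName("JavaScript"));
    Put(action, "JS", MakeString(script));
    return action;
}
```

The point of the sketch is that higher-level PDF constructs such as actions, pages, and annotations are dictionaries whose entries are themselves one of the eight types, which is why a document can be viewed as a graph of Cos objects.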

Security

+How do I remove PDF security?

Simply call pdfdoc.RemoveSecurity(), then save the document using pdfdoc.Save(...).

2011 PDFTRON SYSTEMS, INC, ALL RIGHTS RESERVED