C# PDF Extraction Library for .NET

Sample C# .NET code for using PDFTron SDK to convert Word documents to PDF without any external dependencies or Microsoft Office licenses. Perform .docx to PDF .NET conversion on a Linux or Windows server to automate Word-centric workflows, or entirely in the user's client (web browser, mobile device). Combine the .docx to PDF .NET conversion functionality with our Viewer to display or annotate Word files on all major platforms, including Web, Android, iOS, Xamarin, UWP, and Windows.

Extract text from a PDF

Use TextExtractor to read the text of a PDF document one page at a time.
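As a minimal sketch of text extraction with pdftron.PDF.TextExtractor (the class named in the feature list on this page), the snippet below reads the text of the first page. The file name "input.pdf" and the license key string are placeholders, and the overall structure is an assumption about typical usage rather than the official sample; check the API reference for the SDK version you use.

```csharp
using System;
using pdftron;
using pdftron.PDF;

class ExtractTextSketch
{
    static void Main(string[] args)
    {
        // Initialize the SDK once per process.
        // "YOUR_LICENSE_KEY" is a placeholder, not a real key.
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // "input.pdf" is a hypothetical path used for illustration.
        using (PDFDoc doc = new PDFDoc("input.pdf"))
        {
            // Required before reading content from a secured document.
            doc.InitSecurityHandler();

            // Pages are 1-indexed.
            Page page = doc.GetPage(1);
            if (page == null)
            {
                Console.WriteLine("The document has no pages.");
                return;
            }

            using (TextExtractor txt = new TextExtractor())
            {
                // Analyze the page, then read its text as one string.
                txt.Begin(page);
                Console.WriteLine("Word count: {0}", txt.GetWordCount());
                Console.WriteLine(txt.GetAsText());
            }
        }
    }
}
```

To process a whole document, the same Begin/GetAsText pair can be repeated for each page number from 1 to doc.GetPageCount().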

Extract embedded fonts from a PDF

Extract the font programs embedded in a PDF document.

Benefits / Features

  • Extract digital signatures (timestamps, etc.)
  • Intuitive page content extraction based on a concept of graphical elements
  • High-quality and efficient text recognition engine (pdftron.PDF.TextExtractor). Use TextExtractor to extract structured Unicode text including style and positioning information from any PDF document. The API is simple to use and has advanced options related to hidden or duplicated text, ligature expansion, etc.
  • Low-level text extraction (including positioning information for text runs and individual characters)
  • Complete access to the graphics state (for color spaces and colorants, dash properties, etc.)
  • Full access to fonts, including glyph outlines
  • Image extraction. All compression filters allowed in PDF are supported, and images can optionally be extracted in RAW format
  • Image color-conversion and normalization filters
  • Full access to marked content (e.g., used in tagged PDF documents to preserve logical structure or to mark transparency groups)
  • Full access to page form fields and annotations
  • Extraction of embedded fonts, ICC color profiles, U3D streams, embedded files, etc.
  • Access to a document's metadata
  • High-level Logical Structure API and support for 'Tagged' PDF documents
  • Extract and render PDF layers (also known as Optional Content Groups, or OCGs)

Tools and Utilities

PDF2Text

A utility for text extraction from PDF documents.

PDFTron.AI

A tool for text and table data extraction.
