Search for text in a PDF in Ruby

This example searches a PDF for text using a regular expression and then applies a link annotation to each highlighted result.

The example adds a link annotation, but any other annotation type can be applied in the same way, such as redaction annotations for a search-and-redact workflow.
# assumes the PDFNet Ruby bindings are loaded and PDFNet.Init has been called
doc = PDFDoc.new(filename)
txt_search = TextSearch.new

# use a regular expression to find credit card numbers, and request
# highlight information so that each match can be annotated
mode = TextSearch::E_whole_word | TextSearch::E_page_stop |
       TextSearch::E_reg_expression | TextSearch::E_highlight
pattern = "\\d{4}-\\d{4}-\\d{4}-\\d{4}"  # or "(\\d{4}-){3}\\d{4}"

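# call Begin to initialize the text search, then Run to fetch the next result.
# With E_page_stop set, Run also returns at the end of each page, so call it
# repeatedly to scan the whole document.
txt_search.Begin(doc, pattern, mode)
searchResult = txt_search.Run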

if searchResult.IsFound
  # add a link annotation based on the location of the found instance
  hlts = searchResult.GetHighlights
  hlts.Begin(doc)
  
  while hlts.HasNext do
    cur_page = doc.GetPage(hlts.GetCurrentPageNumber)
    quadsInfo = hlts.GetCurrentQuads

    i = 0
    while i < quadsInfo.size do
      q = quadsInfo[i]
      # assume each quad is an axis-aligned rectangle
      xs = [q.p1.x, q.p2.x, q.p3.x, q.p4.x]
      ys = [q.p1.y, q.p2.y, q.p3.y, q.p4.y]
      rect = Rect.new(xs.min, ys.min, xs.max, ys.max)
      action = Action.CreateURI(doc.GetSDFDoc, "http://www.pdftron.com")
      hyper_link = Link.Create(doc.GetSDFDoc, rect, action)
      cur_page.AnnotPushBack(hyper_link)
      i = i + 1
    end         
    hlts.Next
  end
end
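
The quad-to-rectangle step in the loop above can be tried out without the SDK. The following is a minimal sketch in plain Ruby: `Point`, `Quad`, and `bounding_box` are hypothetical stand-ins introduced here for illustration and are not part of the PDFTron API. It shows that taking the minimum and maximum of the four corner coordinates yields the exact rectangle for an axis-aligned quad, and a larger enclosing box for a rotated one.

```ruby
# Hypothetical stand-ins for the SDK's point and quad types (illustration only).
Point = Struct.new(:x, :y)
Quad  = Struct.new(:p1, :p2, :p3, :p4)

# Returns [x1, y1, x2, y2], the smallest axis-aligned box that contains
# all four corners of the quad.
def bounding_box(quad)
  xs = [quad.p1.x, quad.p2.x, quad.p3.x, quad.p4.x]
  ys = [quad.p1.y, quad.p2.y, quad.p3.y, quad.p4.y]
  [xs.min, ys.min, xs.max, ys.max]
end

# An axis-aligned quad: the box is the quad itself.
aligned = Quad.new(Point.new(10, 20), Point.new(110, 20),
                   Point.new(110, 32), Point.new(10, 32))
aligned_box = bounding_box(aligned)

# A rotated (diamond-shaped) quad: the box encloses it but covers extra area.
rotated = Quad.new(Point.new(5, 0), Point.new(10, 5),
                   Point.new(5, 10), Point.new(0, 5))
rotated_box = bounding_box(rotated)

puts aligned_box.inspect
puts rotated_box.inspect
```

For rotated text, the resulting link area would be larger than the visible match, which is why the original code comments that each quad is assumed to be axis-aligned.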

Search PDF files for text
A full code sample that shows how to use TextSearch to search text on PDF pages using regular expressions.
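
The credit card pattern from the example above, and the shorter alternative noted in its comment, can be checked in plain Ruby before running a search. This is only a sketch: it uses Ruby's own `Regexp` engine, which may differ in details from the engine behind TextSearch, and it does not reproduce search modes such as `E_whole_word`. The sample text and the `all_matches` helper are hypothetical and exist only for this illustration.

```ruby
# The same pattern strings as in the search example. In a double-quoted
# Ruby string, "\\d" is the two characters \d.
pattern = "\\d{4}-\\d{4}-\\d{4}-\\d{4}"
compact = "(\\d{4}-){3}\\d{4}"

# Hypothetical helper: returns the full text of every match. The block form
# of scan is used so that capture groups do not change the result.
def all_matches(text, pattern)
  matches = []
  text.scan(Regexp.new(pattern)) { matches << Regexp.last_match[0] }
  matches
end

# Made-up sample text, not taken from any real document.
text = "Card: 1234-5678-9012-3456, ref 12-34, alt 0000-1111-2222-3333."

long_hits  = all_matches(text, pattern)
short_hits = all_matches(text, compact)

puts long_hits.inspect
puts short_hits.inspect
```

Both forms find the same two card-style numbers in the sample text and skip the short `12-34` fragment.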
