
Custom diffing using pixel-by-pixel comparison

If you would prefer to implement your own diffing algorithm, we provide APIs to retrieve image data from documents. You can use these images to compare the pixels between two documents.

Please note that the following code snippets are generic and assume that both documents have the same page size and the same number of pages. Make sure you handle these cases yourself if you plan to use this approach in your own project.

See this link for a working sample.

The setup is similar to the previous example, except this time we don't need to enable the full API. We can rewrite our getDocument function to look like this:

const getDocument = async (url) => {
  const newDoc = new CoreControls.Document('my-file-name', 'pdf');
  const backendType = await CoreControls.getDefaultBackendType();
  const options = {
    workerTransportPromise: CoreControls.initPDFWorkerTransports(backendType, {} /* , license key here */),
    extension: 'pdf',
  };
  const partRetriever = new CoreControls.PartRetrievers.ExternalPdfPartRetriever(url);
  // Wrap the callback-based loadAsync in a promise so callers can await it
  return new Promise((resolve, reject) => {
    newDoc.loadAsync(partRetriever, (err) => {
      if (err) {
        reject(err); // surface load failures instead of silently ignoring them
        return;
      }
      resolve(newDoc); // unlike the previous example, no `getPDFDoc` call is needed here
    }, options);
  });
};

// `await` must run inside an async function (or a module with top-level await)
const [doc1, doc2] = await Promise.all([
  getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_1.pdf'),
  getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_2.pdf')
]);

Now we can write a function to get image data from these documents.

const getImageData = (doc, pageIndex = 0) => {
  return new Promise(resolve => {
    doc.loadCanvasAsync({
      pageIndex,
      drawComplete: (pageCanvas) => {
        const ctx = pageCanvas.getContext('2d');
        const imageData = ctx.getImageData(0, 0, pageCanvas.width, pageCanvas.height);
        resolve(imageData);
      }
    })
  })
}

// get image data for the first page of both our documents
const [imageData1, imageData2] = await Promise.all([
  getImageData(doc1, 0),
  getImageData(doc2, 0)
]);
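
The snippets here assume both pages render at the same size. A minimal guard for that assumption could look like the sketch below. The helper names `haveSameDimensions` and `assertComparable` are hypothetical and not part of the WebViewer API; they only rely on the `width` and `height` properties that every `ImageData` object exposes.

```javascript
// Hypothetical helper (not part of the WebViewer API): returns true when two
// ImageData-like objects have the same width and height, so their pixel
// arrays can be compared index by index.
const haveSameDimensions = (a, b) =>
  a.width === b.width && a.height === b.height;

// Hypothetical guard: throws a descriptive error when the sizes differ.
const assertComparable = (a, b) => {
  if (!haveSameDimensions(a, b)) {
    throw new Error(
      `Cannot diff pages of different sizes: ${a.width}x${a.height} vs ${b.width}x${b.height}`
    );
  }
};
```

Calling `assertComparable(imageData1, imageData2)` before comparing pixels fails early instead of producing a misaligned diff. How to handle a mismatch (scaling, padding, or skipping the page) depends on your project.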

Now we can loop over these pixels and compare them however we wish.

// Get the actual pixels from the ImageData object
const pixelData1 = imageData1.data;
const pixelData2 = imageData2.data;

const newImageData = new Uint8ClampedArray(pixelData1.length);

for (let i = 0; i < pixelData1.length; i += 4) {
  // rgba values for each pixel in imageData1 (document 1)
  const r1 = pixelData1[i];
  const g1 = pixelData1[i + 1];
  const b1 = pixelData1[i + 2];
  const a1 = pixelData1[i + 3];

  // rgba values for each pixel in imageData2 (document 2)
  const r2 = pixelData2[i];
  const g2 = pixelData2[i + 1];
  const b2 = pixelData2[i + 2];
  const a2 = pixelData2[i + 3];

  // Implement your own diffing algorithm here
  // (`someDiffFunction` is a placeholder that you need to define yourself)
  newImageData[i] = someDiffFunction(r1, r2);
  newImageData[i+1] = someDiffFunction(g1, g2);
  newImageData[i+2] = someDiffFunction(b1, b2);
  newImageData[i+3] = someDiffFunction(a1, a2);
}

// Here you could create a new canvas with your diffed pixels
// and open it with WebViewer
const canvas = document.createElement('canvas');
canvas.width = imageData1.width;
canvas.height = imageData1.height;
canvas.getContext('2d').putImageData(new ImageData(newImageData, imageData1.width), 0, 0);
canvas.toBlob((blob) => {
  readerControl.loadDocument(blob, { filename: 'image.png' });
});
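
The diffing step itself is left open above. As one possible starting point, the sketch below compares whole pixels instead of individual channels: a pixel that differs in any channel by more than a threshold is painted opaque red, and a matching pixel keeps its color but is faded. The function name `diffPixels`, the `threshold` parameter, and the red/faded color scheme are all assumptions made for illustration, not something the library prescribes.

```javascript
// Hypothetical diff function (an illustration, not part of the WebViewer API).
// data1 and data2 are RGBA pixel arrays of equal length, e.g. imageData.data.
const diffPixels = (data1, data2, threshold = 0) => {
  const out = new Uint8ClampedArray(data1.length);
  for (let i = 0; i < data1.length; i += 4) {
    // Largest per-channel difference for this pixel (r, g, b and a)
    const delta = Math.max(
      Math.abs(data1[i] - data2[i]),
      Math.abs(data1[i + 1] - data2[i + 1]),
      Math.abs(data1[i + 2] - data2[i + 2]),
      Math.abs(data1[i + 3] - data2[i + 3])
    );
    if (delta > threshold) {
      // Changed pixel: opaque red
      out[i] = 255;
      out[i + 1] = 0;
      out[i + 2] = 0;
      out[i + 3] = 255;
    } else {
      // Unchanged pixel: keep the color, reduce the alpha to a quarter
      out[i] = data1[i];
      out[i + 1] = data1[i + 1];
      out[i + 2] = data1[i + 2];
      out[i + 3] = Math.round(data1[i + 3] / 4);
    }
  }
  return out;
};
```

With this in place, `diffPixels(pixelData1, pixelData2)` could stand in for the whole comparison loop, and its result can be passed to `new ImageData(...)` in the same way as `newImageData`. A small nonzero `threshold` may help ignore anti-aliasing noise, though a suitable value depends on your documents.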

A demo of a similar approach can be found here.
