Diffing

Contents

Diffing API
Custom Diffing

WebViewer can take two PDF files and output the visual difference between them. This is useful when you want to see what changed between two versions of a document (a blueprint, for example). A live demo can be viewed here.

Diffing API

The full API is required for this functionality. Click here for more details.
The following code snippets are written using ES6+ syntax, which will only work in modern browsers. You may, however, transpile this code down to ES5 to ensure proper browser support. See this guide for more details.

In our config.js file (see this guide for more information on config files), we first wait for the viewerLoaded event to fire, which signals that WebViewer has fully initialized. Once it has, we can initialize the full API and load the documents into memory.

We'll start by writing a function that takes a URL and resolves with a document, and then use that function to load two sample documents.

$(document).on('viewerLoaded', async () => {
  // initialize PDFNet
  await PDFNet.initialize();

  const getDocument = async (url) => {
    const newDoc = new CoreControls.Document('my-file-name', 'pdf');
    const backendType = await CoreControls.getDefaultBackendType();
    const options = {
      workerTransportPromise: CoreControls.initPDFWorkerTransports(backendType, {}/* , license key here */),
      extension: 'pdf',
    };
    const partRetriever = new CoreControls.PartRetrievers.ExternalPdfPartRetriever(url);
    // loadAsync is callback-based, so wrap it in a Promise and surface any error
    await new Promise((resolve, reject) => {
      newDoc.loadAsync(partRetriever, (err) => (err ? reject(err) : resolve()), options);
    });
    // Full API is required for this function to work.
    return newDoc.getPDFDoc();
  };

  const [doc1, doc2] = await Promise.all([
    getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_1.pdf'),
    getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_2.pdf')
  ]);
});

Now we need to get the pages that we want to diff. In this example, we will diff all pages. We'll write a helper function that collects a document's pages into an array, and then use it to get the pages for both of our documents.

// inside `viewerLoaded`
const getPageArray = async (doc) => {
  const arr = [];
  const itr = await doc.getPageIterator(1);

  // await each iterator call so the advance completes before the next check
  for (; await itr.hasNext(); await itr.next()) {
    const page = await itr.current();
    arr.push(page);
  }

  return arr;
}

const [doc1Pages, doc2Pages] = await Promise.all([
  getPageArray(doc1),
  getPageArray(doc2)
]);

Now we can create a new blank document, and fill it with the diffed images from our two documents. Once that is done, we can tell WebViewer to display this new diffed document.

// inside `viewerLoaded`
const newDoc = await PDFNet.PDFDoc.create();
await newDoc.lock();

// we'll loop over the doc with the most pages
const biggestLength = Math.max(doc1Pages.length, doc2Pages.length);

// we need to do the pages in order, so let's create a Promise chain
let chain = Promise.resolve();

for (let i = 0; i < biggestLength; i++) {
  // reassign `chain` so each step waits for the previous one
  chain = chain.then(async () => {
    let page1 = doc1Pages[i];
    let page2 = doc2Pages[i];

    // handle the case where one document has more pages than the other
    if (!page1) {
      page1 = new PDFNet.Page(0); // create a blank page
    }
    if (!page2) {
      page2 = new PDFNet.Page(0); // create a blank page
    }
    return newDoc.appendVisualDiff(page1, page2, null);
  });
}

await chain; // wait for our chain to resolve
await newDoc.unlock();

// display the document!
// readerControl is a global variable that's automatically defined inside the config file.
readerControl.loadDocument(newDoc);
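
For a Promise chain to run its steps in order, each step has to be attached to the promise returned by the previous `then` call. The pattern can be sketched in isolation as follows; `runInOrder`, `makeTask`, and the sample task names are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: runs async task functions strictly one after another
// and resolves with their results in the order the tasks were given.
const runInOrder = (tasks) => {
  const results = [];
  let chain = Promise.resolve();

  for (const task of tasks) {
    // attach each task to the promise returned by the previous `then`
    chain = chain.then(async () => {
      results.push(await task());
    });
  }

  return chain.then(() => results);
};

// Sample tasks: the first one is slower, but it still completes first
// because the second task is not started until the first has resolved.
const completionLog = [];
const makeTask = (name, delayMs) => () =>
  new Promise((resolve) => {
    setTimeout(() => {
      completionLog.push(name);
      resolve(name);
    }, delayMs);
  });

const allDone = runInOrder([makeTask('page 1', 20), makeTask('page 2', 0)]);
```

Inside an async function, a plain `for` loop that awaits each step would give the same ordering.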

WebViewer should now display the diffed document, like the image below.

In this example, the color blue represents content that is in document one and not document two, and red represents content in document two that is not in document one. Black represents overlap.

Behind the scenes, WebViewer blends the two documents using the Porter/Duff 'darken' operator and displays the output.
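
To illustrate why overlapping content ends up black, here is a minimal sketch of a per-channel 'darken' blend, which keeps the smaller (darker) value of each color channel. The function name `darkenBlend` and the pure blue and red tints are assumptions for illustration; this is not WebViewer's internal implementation.

```javascript
// Hypothetical helper: blends two RGB colors with the 'darken' rule,
// i.e. the per-channel minimum of the two inputs.
const darkenBlend = ([r1, g1, b1], [r2, g2, b2]) => [
  Math.min(r1, r2),
  Math.min(g1, g2),
  Math.min(b1, b2),
];

const WHITE = [255, 255, 255]; // empty page background
const BLUE = [0, 0, 255];      // assumed tint for document one
const RED = [255, 0, 0];       // assumed tint for document two

const onlyInDocOne = darkenBlend(BLUE, WHITE); // [0, 0, 255], stays blue
const onlyInDocTwo = darkenBlend(WHITE, RED);  // [255, 0, 0], stays red
const overlap = darkenBlend(BLUE, RED);        // [0, 0, 0], becomes black
```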

Custom Diffing

If you would prefer to implement your own diffing algorithm, we provide APIs to retrieve image data from documents. You can use these images to compare the pixels between two documents.

The setup is similar to the previous example, except this time we don't need to enable the full API.

Please note that the following code snippets are very generic and assume both documents are the same size and have the same number of pages. Make sure you handle these cases yourself if you plan to use this approach in your own project.

We can rewrite our getDocument function to look like this:

const getDocument = async (url) => {
  const newDoc = new CoreControls.Document('my-file-name', 'pdf');
  const backendType = await CoreControls.getDefaultBackendType();
  const options = {
    workerTransportPromise: CoreControls.initPDFWorkerTransports(backendType, {}/* , license key here */),
    extension: 'pdf',
  };
  const partRetriever = new CoreControls.PartRetrievers.ExternalPdfPartRetriever(url);
  // loadAsync is callback-based, so wrap it in a Promise and surface any error
  await new Promise((resolve, reject) => {
    newDoc.loadAsync(partRetriever, (err) => (err ? reject(err) : resolve()), options);
  });
  return newDoc; // no `getPDFDoc` call here, since the full API isn't needed
};

const [doc1, doc2] = await Promise.all([
  getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_1.pdf'),
  getDocument('https://s3.amazonaws.com/pdftron/pdftron/example/test_doc_2.pdf')
]);

Now we can write a function to get image data from these documents.

const getImageData = (doc, pageIndex = 0) => {
  return new Promise(resolve => {
    doc.loadCanvasAsync({
      pageIndex,
      drawComplete: (pageCanvas) => {
        const ctx = pageCanvas.getContext('2d');
        const imageData = ctx.getImageData(0, 0, pageCanvas.width, pageCanvas.height);
        resolve(imageData);
      }
    })
  })
}

// get image data for the first page of both our documents
const [imageData1, imageData2] = await Promise.all([
  getImageData(doc1, 0),
  getImageData(doc2, 0)
]);
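
Because the pixel comparison that follows walks both buffers with the same index, it only makes sense when the two pages render to the same size. A minimal sketch of a guard for that case is shown here; `haveSameDimensions` and `assertComparable` are hypothetical helper names, and they accept any object with numeric `width` and `height` properties, such as an ImageData.

```javascript
// Hypothetical guard: true only when both ImageData-like objects
// have identical width and height.
const haveSameDimensions = (a, b) =>
  a.width === b.width && a.height === b.height;

// Hypothetical helper that throws early instead of comparing mismatched pixels.
const assertComparable = (a, b) => {
  if (!haveSameDimensions(a, b)) {
    throw new Error(
      `Page sizes differ: ${a.width}x${a.height} vs ${b.width}x${b.height}`
    );
  }
};
```

Calling `assertComparable(imageData1, imageData2)` before looping over the pixels would fail fast on mismatched pages; how to actually reconcile different sizes (scaling, cropping, padding) is left to your project.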

Now, we can loop over these pixels, and compare them however we wish.

// Get the actual pixels from the ImageData object
const pixelData1 = imageData1.data;
const pixelData2 = imageData2.data;

const newImageData = new Uint8ClampedArray(pixelData1.length);

for (let i = 0; i < pixelData1.length; i += 4) {
  // rgba values for each pixel in imageData1 (document 1)
  const r1 = pixelData1[i];
  const g1 = pixelData1[i + 1];
  const b1 = pixelData1[i + 2];
  const a1 = pixelData1[i + 3];

  // rgba values for each pixel in imageData2 (document 2)
  const r2 = pixelData2[i];
  const g2 = pixelData2[i + 1];
  const b2 = pixelData2[i + 2];
  const a2 = pixelData2[i + 3];

  // Implement your own diffing algorithm here
  // (`someDiffFunction` is a placeholder that you need to define)
  newImageData[i] = someDiffFunction(r1, r2);
  newImageData[i+1] = someDiffFunction(g1, g2);
  newImageData[i+2] = someDiffFunction(b1, b2);
  newImageData[i+3] = someDiffFunction(a1, a2);
}

// Here you could create a new canvas with your diffed pixels,
// and open it with WebViewer
const canvas = document.createElement('canvas');
canvas.width = imageData1.width;
canvas.height = imageData1.height;
canvas.getContext('2d').putImageData(new ImageData(newImageData, imageData1.width), 0, 0);
canvas.toBlob((blob) => {
  readerControl.loadDocument(blob, { filename: 'image.png' });
})
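
The `someDiffFunction` placeholder above is left for you to define. As one possible starting point, here is a minimal sketch that works on whole pixels rather than single channels: pixels that match within a tolerance are copied through, and pixels that differ are painted opaque red. The names `diffPixels` and `tolerance`, and the choice of red, are assumptions for illustration and not part of the WebViewer API.

```javascript
// Hypothetical helper: compares two RGBA buffers of equal length and
// returns a new buffer that highlights the pixels that differ.
const diffPixels = (pixelData1, pixelData2, tolerance = 0) => {
  if (pixelData1.length !== pixelData2.length) {
    throw new Error('Pixel buffers must be the same length');
  }
  const out = new Uint8ClampedArray(pixelData1.length);

  for (let i = 0; i < pixelData1.length; i += 4) {
    // a pixel differs if any of its four channels is off by more than `tolerance`
    let differs = false;
    for (let c = 0; c < 4; c++) {
      if (Math.abs(pixelData1[i + c] - pixelData2[i + c]) > tolerance) {
        differs = true;
        break;
      }
    }

    if (differs) {
      // paint differing pixels opaque red
      out[i] = 255;
      out[i + 1] = 0;
      out[i + 2] = 0;
      out[i + 3] = 255;
    } else {
      // copy matching pixels through unchanged
      for (let c = 0; c < 4; c++) {
        out[i + c] = pixelData1[i + c];
      }
    }
  }
  return out;
};
```

The result could stand in for `newImageData` in the canvas step, for example `const newImageData = diffPixels(pixelData1, pixelData2, 8);`. A small non-zero tolerance helps ignore anti-aliasing noise between renders.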

A demo of a similar approach can be found here. You can also find the source code for the demo here.
