Hardware and network setup

Hardware configuration

WebViewer Server manages its own work queues and opportunistically caches a large amount of intermediate data locally, both in RAM and on disk. As a result, access to more system resources lets the Server operate more responsively and efficiently: using cached data improves response time and also conserves CPU resources.

If your use case calls for multiple backend nodes, then a smaller number of more capable nodes is a better choice than a large number of smaller nodes: a 4 core/8GB server will have a higher peak user capacity than two 2 core/4GB servers.

Minimum hardware requirements

To maintain efficient operation, WebViewer Server requires access to at least 2 CPU cores, at least 2GB of RAM, and 50GB of storage space. With fewer than 2 cores, internal work queues will start to behave serially, which will drastically raise server response times.

Access to insufficient RAM limits the amount of data that can be held in the short-term cache, and it also limits the ability of the server to process particularly difficult documents.

If there is insufficient storage space, then the server will be unable to generate data without first pushing out existing cached data that is still in use by clients.
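
As a rough illustration, the minimums above can be turned into a small pre-flight check. This is only a sketch: the function name findShortfalls and the shape of the host object are hypothetical and are not part of WebViewer Server.

```javascript
// Minimums quoted in this section: 2 CPU cores, 2GB of RAM, 50GB of storage.
const MINIMUMS = { cores: 2, ramGb: 2, storageGb: 50 };

// Hypothetical helper: returns a list of shortfalls for a host description.
// An empty list means the host meets every minimum.
function findShortfalls(host, minimums = MINIMUMS) {
  const shortfalls = [];
  if (host.cores < minimums.cores) {
    shortfalls.push(`cores: ${host.cores} < ${minimums.cores} (work queues may behave serially)`);
  }
  if (host.ramGb < minimums.ramGb) {
    shortfalls.push(`ramGb: ${host.ramGb} < ${minimums.ramGb} (smaller short-term cache)`);
  }
  if (host.storageGb < minimums.storageGb) {
    shortfalls.push(`storageGb: ${host.storageGb} < ${minimums.storageGb} (cached data may be pushed out early)`);
  }
  return shortfalls;
}

console.log(findShortfalls({ cores: 1, ramGb: 4, storageGb: 20 }));
```

In a Node.js script, the core count and RAM could be read from os.cpus().length and os.totalmem(); the storage figure would have to come from your own tooling.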

Network configuration

SSL Config

The WebViewer server comes with a self-signed certificate, usable for SSL debugging purposes only.

In order to have SSL work correctly on your own domain you must provide a certificate chain file. This certificate chain file should:

  • Contain a public certificate, an optional intermediary certificate, and a private key, all in PEM format.
  • Use a private key that has no associated password.
Excluding the intermediary certificate
If you do not include an intermediary certificate in your certificate chain file, SSL may not work correctly for users on Firefox.
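
The requirements above can be checked with a few string tests before the file is mounted. The sketch below is an assumption-laden illustration, not a WebViewer Server tool: inspectChainFile is a hypothetical name, and the check only looks at PEM block headers rather than validating the certificates themselves.

```javascript
// Hypothetical helper: inspects the text of a PEM bundle by its block headers.
function inspectChainFile(pemText) {
  const certificateCount = (pemText.match(/-----BEGIN CERTIFICATE-----/g) || []).length;
  // Matches PKCS#8 keys as well as the traditional RSA and EC key headers.
  const hasPrivateKey = /-----BEGIN (?:RSA |EC |ENCRYPTED )?PRIVATE KEY-----/.test(pemText);
  // Password-protected keys are marked either by an ENCRYPTED block header
  // or by a "Proc-Type: 4,ENCRYPTED" line inside a traditional key block.
  const keyIsEncrypted =
    pemText.includes("-----BEGIN ENCRYPTED PRIVATE KEY-----") ||
    /Proc-Type:\s*4,ENCRYPTED/.test(pemText);
  return {
    certificateCount,
    hasIntermediate: certificateCount >= 2,
    hasPrivateKey,
    keyIsEncrypted,
    usable: certificateCount >= 1 && hasPrivateKey && !keyIsEncrypted,
  };
}

const sample =
  "-----BEGIN CERTIFICATE-----\nAAA\n-----END CERTIFICATE-----\n" +
  "-----BEGIN PRIVATE KEY-----\nBBB\n-----END PRIVATE KEY-----\n";
console.log(inspectChainFile(sample));
```

A result with hasIntermediate set to false is still usable, but it corresponds to the Firefox caveat described above.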

To use the SSL keys, WebViewer Server needs them passed from the host into the container. We do this by mounting the key within the Docker container.

Once the key is prepared you should:

  1. Name the key combined_key.pem
  2. Create a directory and mount it to the ssl directory under loadbalancer in your docker-compose file like so:

    loadbalancer:
      volumes:
        - ./ha_ssl:/etc/default/ssl
  3. Place the keys in the external directory, in this case, ha_ssl.
  4. Restart the container.
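
Preparing the key usually means concatenating the separate PEM files into one. A minimal sketch of that step follows; buildCombinedPem is a hypothetical helper, and the ordering (certificate, then intermediary, then private key) simply follows the order in which this page lists the parts, which is an assumption rather than a documented requirement.

```javascript
// Hypothetical helper: joins PEM parts into the text of combined_key.pem.
// The intermediate certificate is optional and is skipped when empty.
function buildCombinedPem({ certificate, intermediate = "", privateKey }) {
  return (
    [certificate, intermediate, privateKey]
      .map((part) => part.trim())
      .filter((part) => part.length > 0)
      .join("\n") + "\n"
  );
}

const combined = buildCombinedPem({
  certificate: "-----BEGIN CERTIFICATE-----\nAAA\n-----END CERTIFICATE-----\n",
  privateKey: "-----BEGIN PRIVATE KEY-----\nBBB\n-----END PRIVATE KEY-----\n",
});
console.log(combined);
```

The resulting text could then be written to ha_ssl/combined_key.pem, for example with Node's fs.writeFileSync.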

SSL Config without HAProxy

We also allow you to use the SSL built in directly to Tomcat. This is useful if you wish to use WebViewer Server without its load balancer.

To use these keys, we must pass them from outside the container to inside it. We do this by mounting the keys within the Docker container.

  1. Name the public certificate cert.crt
  2. Name the private key private_key.pem
  3. Create a directory and mount it to the ssl directory under pdfd-tomcat in your docker-compose file as a volume:

    pdfd-tomcat:
      volumes:
        - ./tc_ssl:/usr/local/apache-tomcat-9.0.6/conf/ssl
  4. Place the keys in the external directory, in this case, tc_ssl.
  5. Restart the container.

Adding self-signed SSL certificates to WebViewer Server for file servers

Your network may use self-signed certificates on the file servers that WebViewer Server fetches files from. WebViewer Server will require those certificates in order to fetch files. You can add these certificates by placing your public certificates in a directory called external_certs at the same level as the docker-compose.yml, adding it as a volume to pdfd-tomcat, and rebuilding.

In order to use these certificates, we must pass them from outside the container to inside of it. We will mount a volume to do this:

  1. Create a directory called external_certs, in the root of the project.
  2. Place your self-signed certificates into this folder.
  3. Mount this folder as a volume to the /certs directory:

    pdfd-tomcat:
      volumes:
        - ./external_certs:/certs

The certificates in this directory will be directly imported into WebViewer Server's Java certificates on first run.

Scaling to multiple backend nodes with HAProxy

The container (along with WebViewer) now has built-in support for using multiple backends behind a load balancer.

As the container is not entirely stateless, the balancer needs to fulfill a few requirements:

  • operates at layer 7 (HTTP)
  • supports instance affinity ("stickiness") via cookies
  • supports HTTP health checks at a specific path

There is a sample configuration included in the download archive which demonstrates a fully working load balancer setup. Running docker-compose -f docker-compose_load_balance.yml up will launch a container composed of two WebViewer Server nodes with an HAProxy load balancer front end.

In the sample setup, incoming connections are directed to the least occupied backend node, and will remain attached to that node for the remainder of the session, or until the node starts to become overloaded.

If there are no available healthy nodes, then WebViewer will attempt to continue in client-only rendering mode.
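
The routing behaviour described for the sample setup can be pictured with a short sketch. This is not HAProxy's implementation: pickBackend and the backend object shape are hypothetical, and the sketch leaves out the "until the node starts to become overloaded" condition for brevity.

```javascript
// Hypothetical model of least-connection routing with cookie stickiness.
// backends: [{ name, healthy, connections }]; stickyName: value of the affinity cookie, if any.
function pickBackend(backends, stickyName) {
  const healthy = backends.filter((backend) => backend.healthy);
  // No healthy nodes: the caller would fall back to client-only rendering.
  if (healthy.length === 0) return null;
  // An existing session stays on its node while that node is healthy.
  const stuck = healthy.find((backend) => backend.name === stickyName);
  if (stuck) return stuck.name;
  // New sessions go to the least occupied healthy node.
  return healthy.reduce((best, backend) =>
    backend.connections < best.connections ? backend : best
  ).name;
}

const nodes = [
  { name: "node1", healthy: true, connections: 5 },
  { name: "node2", healthy: true, connections: 2 },
];
console.log(pickBackend(nodes, undefined)); // new session
console.log(pickBackend(nodes, "node1")); // returning session
```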

Security

Fetching files and Authorization

WebViewer Server does not handle authorization. If a file server requires authentication, the credentials must be passed to WebViewer Server on a per-request basis. We offer several options for passing authentication data to WebViewer Server so it can fetch documents that require authorization.

In the WebViewer loadDocument call you are able to specify custom headers - these can contain things such as authorization tokens. When the WebViewer Server requests the URL specified in loadDocument, it will append these customHeaders.
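
As a hedged example of the custom header option, the options object could be built as below. The helper name buildLoadOptions, the Authorization header with a Bearer scheme, and the example URL are all assumptions; use whatever header your file server actually expects.

```javascript
// Hypothetical helper: builds loadDocument options carrying an auth token.
// The "Authorization: Bearer ..." form is an assumption about the file server.
function buildLoadOptions(token) {
  return {
    customHeaders: {
      Authorization: `Bearer ${token}`,
    },
  };
}

// In the browser this would be passed to the viewer, for example:
//   viewerInstance.loadDocument("https://files.example.com/report.pdf", buildLoadOptions(token));
console.log(buildLoadOptions("example-token"));
```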

WebViewer accepts signed links as an authorization method - the server will use these same links to successfully fetch files.

You may pass session cookies. This can be enabled as specified here, but only works when WebViewer Server and the file server in question share a domain.

In addition we have several options that allow users to better control the security of the WebViewer Server:

Caching on WebViewer Server

The cache on WebViewer Server works with a simple policy: if the link or name passed for the document is the same, the document cache is the same. This means that if you provide the URL http://pdftron.com/mydoc.pdf, a cache will be created for this link. If you request this same link, you will receive this cached item. If you change this link, the cache will be remade even if the document is the same. For example, these two links will return different cached items: http://pdftron.com/mydoc.pdf and http://pdftron.com/mydoc.pdf?revision=1. This is also the case when uploading files to WebViewer Server; however, these files will only cache based on the filename.
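
The policy can be modelled as a map keyed by the exact link string. The sketch below is a toy model under that assumption, with hypothetical names (getOrCreate, conversions); it does not reflect how the server stores its cache internally.

```javascript
// Toy model: one cache entry per distinct link string.
const cache = new Map();
let conversions = 0;

// Hypothetical helper: returns the cached entry for a link, creating it on first use.
function getOrCreate(link) {
  if (!cache.has(link)) {
    conversions += 1; // stands in for fetching and rendering the document
    cache.set(link, `rendered#${conversions}`);
  }
  return cache.get(link);
}

const first = getOrCreate("http://pdftron.com/mydoc.pdf");
const again = getOrCreate("http://pdftron.com/mydoc.pdf");
const revised = getOrCreate("http://pdftron.com/mydoc.pdf?revision=1");
console.log(first === again); // same link, same cache entry
console.log(first === revised); // different link, separate cache entry
```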

If you need to get around these cache limitations because of the nature of your file server, we provide several ways to do so.

Using the on disk cache

The cache is controlled by two variables, TRN_MAX_CACHED_MB (MB) and TRN_MAX_CACHE_AGE_MINUTES (minutes). By default these values are both set to 0. In this default case the conditions for clearing data from the cache are as follows:

  • If cache is larger than the total disk size - 1 GB > clear
  • If cache items are younger than 30 minutes > do not clear

If either or both of the options are set, the conditions apply in order of precedence like so:

  • If cache is larger than the total disk size - 1 GB > clear
  • If cache size exceeds the TRN_MAX_CACHED_MB > clear
  • If cache item age exceeds TRN_MAX_CACHE_AGE_MINUTES > clear
  • If cache items are younger than 30 minutes > do not clear

When we delete cache items, we start with the oldest first and will continue deleting items until the necessary conditions have been met or we hit this 30 minute item age limit.

In summary:

TRN_MAX_CACHED_MB has a minimum of 1 GB and a maximum of the disk size minus 1 GB. TRN_MAX_CACHE_AGE_MINUTES has a minimum of 30 minutes and no maximum. Ensuring the disk does not completely fill holds precedence over all.
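
The rules above can be sketched as a small eviction routine. This is an interpretation, not the server's code. It rests on several assumptions: 1 GB is taken as 1024 MB, a value of 0 means "not set", the 30 minute floor is treated as absolute, and the names effectiveLimits, selectEvictions and the item shape { name, sizeMb, ageMinutes } are hypothetical.

```javascript
const MIN_AGE_MINUTES = 30;
const ONE_GB_MB = 1024; // assumption: 1 GB = 1024 MB

// Clamps the two settings to the bounds given in the summary above.
function effectiveLimits({ diskTotalMb, maxCachedMb = 0, maxAgeMinutes = 0 }) {
  const diskCeilingMb = diskTotalMb - ONE_GB_MB;
  const sizeLimitMb =
    maxCachedMb > 0
      ? Math.min(Math.max(maxCachedMb, ONE_GB_MB), diskCeilingMb)
      : diskCeilingMb;
  const ageLimitMinutes =
    maxAgeMinutes > 0 ? Math.max(maxAgeMinutes, MIN_AGE_MINUTES) : Infinity;
  return { sizeLimitMb, ageLimitMinutes };
}

// Returns the names of the items to delete, oldest first.
function selectEvictions(items, settings) {
  const { sizeLimitMb, ageLimitMinutes } = effectiveLimits(settings);
  const oldestFirst = [...items].sort((a, b) => b.ageMinutes - a.ageMinutes);
  let totalMb = items.reduce((sum, item) => sum + item.sizeMb, 0);
  const evicted = [];
  for (const item of oldestFirst) {
    // Items younger than 30 minutes are never cleared in this sketch.
    if (item.ageMinutes < MIN_AGE_MINUTES) break;
    const overSize = totalMb > sizeLimitMb;
    const tooOld = item.ageMinutes > ageLimitMinutes;
    // Items are sorted by age, so once neither condition holds nothing newer qualifies.
    if (!overSize && !tooOld) break;
    evicted.push(item.name);
    totalMb -= item.sizeMb;
  }
  return evicted;
}

const items = [
  { name: "a", sizeMb: 2000, ageMinutes: 120 },
  { name: "b", sizeMb: 1500, ageMinutes: 45 },
  { name: "c", sizeMb: 800, ageMinutes: 10 },
];
console.log(selectEvictions(items, { diskTotalMb: 102400, maxCachedMb: 2048 }));
```

With a 2048 MB size limit, the two oldest items are selected and the 10 minute old item is kept even though it is the only one left.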

Using customHeaders when caching

The first way to alter this behaviour is to use customHeaders which can be passed in your loadDocument code on WebViewer. All fields added to custom headers will be appended to the link when fetching the document, and not taken into consideration when creating the cache entry.

viewerInstance.loadDocument("http://pdftron.com/mydoc.pdf", { customHeaders: {
  revisionId: '1234',
  documentId: '12345'
}});

// in this instance, the fetched url would become:
//   http://pdftron.com/mydoc.pdf?revisionId=1234&documentId=12345
// but the cached link would become:
//   http://pdftron.com/mydoc.pdf

Using cacheKey when caching

You may also choose to specify a cacheKey in the loadDocument call. When setting a cacheKey, you are defining the name the item is cached against. This means that if you request a specific cacheKey, you will always receive the cache for that specific cacheKey.

viewerInstance.loadDocument("http://pdftron.com/mydoc.pdf", { cacheKey: "revisionId123" });

// in this instance, the fetched url would become:
//   http://pdftron.com/mydoc.pdf
// but the cached link would become:
//   revisionId123
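
The two options can be compared side by side with a small model of what is fetched and what is cached. The function describeRequest is hypothetical and only mirrors the behaviour described in the comments of the examples on this page; it is not part of the WebViewer API.

```javascript
// Hypothetical model: derives the fetched URL and the cache key for a loadDocument call.
function describeRequest(url, options = {}) {
  const fetched = new URL(url);
  // Per the description above, customHeaders fields are appended to the fetched link...
  for (const [name, value] of Object.entries(options.customHeaders || {})) {
    fetched.searchParams.append(name, value);
  }
  // ...but the cache entry is named by cacheKey if given, otherwise by the original link.
  return {
    fetchedUrl: fetched.toString(),
    cacheKey: options.cacheKey || url,
  };
}

console.log(
  describeRequest("http://pdftron.com/mydoc.pdf", {
    customHeaders: { revisionId: "1234", documentId: "12345" },
  })
);
console.log(describeRequest("http://pdftron.com/mydoc.pdf", { cacheKey: "revisionId123" }));
```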

WebViewer Server interactions within a network

WebViewer Server was designed to work alongside the WebViewer client. Document requests are made through WebViewer to the server; the server then fetches the requested document, renders it, and returns the completed document links to WebViewer. The WebViewer client then fetches these documents directly from the server's /data directory. The diagram below depicts this process.

WebViewer Server Process

In addition, when WebViewer is working in conjunction with WebViewer Server, it will choose to use the fonts from the server instead of our publicly hosted fonts for WebViewer.

Outside of the file server and the WebViewer client, WebViewer Server has no interactions with other systems.

WebViewer Server in a distributed environment

A distributed environment is one such as Kubernetes or the AWS Elastic Container Service. In these environments you may have more than one copy of the server running at once.

Maintaining user state across multiple servers

In a distributed environment, WebViewer Server requires each connecting user to stay on the same server. This is because the WebViewer Server container has a stateful cache. Whenever a user interacts with the server for a document conversion, they must continue communicating with the server they started with until they request a new document. At this point, the user may be redirected to another server.

WebViewer Server achieves this in the AWS Auto-Scaling template with an HAProxy container that comes as part of the compose file. It manages user stickiness for each server until the user's current server forces a reset of their stickiness cookie, which occurs when a new document is requested. The HAProxy configuration code here depicts how we handle the cookie settings. This can be found in haproxy/haproxy.cfg of our WebViewer Server package.

# balance mode, fill the available servers equally
balance leastconn
# haproxy will either use this cookie to select a backend, or will set it once one is chosen
# preserve means that it will leave it alone if the server sets it
cookie haproxid nocache insert preserve

# a server is healthy as long as /blackbox/health returns a 2xx or 3xx response
option httpchk GET /blackbox/health
http-check expect rstatus (2|3)[0-9][0-9]

# keep sessions stuck, even to "unhealthy" servers, until the connection fails once
option persist
option redispatch 1

You may also run any sticky session solution you want with WebViewer Server, as long as it maintains a session with the server for the duration of a document or client connection.

Health Checks

You likely require a health check for your distributed environment. We offer one on your running WebViewer Server at http://your-address/blackbox/HealthCheck. You can learn more about it in our usage section.

Cloud services

We detail some setup options for simple and distributed servers in the cloud in our guides on Azure and AWS.
