| title | Reference Store |
|---|---|
| section | Features |
| order | 1 |
| description | Cache and reuse assets |
texd has the ability to reuse previously sent material. This allows you to reduce the amount of data you need to transmit with each render request. Here is a back-of-the-envelope calculation:
- If you want to generate 1000 documents, each including a font with 400 kB in size, and a logo file with 100 kB in size, you will need to transmit 500 MB of the same two files in total.
- If you can reuse those two assets, you would only need to transmit them once, and use a reference hash for each subsequent request. The total then reduces to 1×500 kB (complete assets for the first request) + 999×100 Byte (50 Byte per reference hash for subsequent requests) = 599.9 kB.
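The calculation above can be reproduced in a few lines. This is only a sketch of the arithmetic, assuming decimal units (1 kB = 1000 Byte); the variable names are made up for illustration.

```python
# Sizes from the example above, in bytes (assuming 1 kB = 1000 Byte).
FONT = 400_000
LOGO = 100_000
REF_HASH = 50      # "sha256:" (7 characters) + 43 characters of unpadded Base64
DOCUMENTS = 1000

assets = FONT + LOGO

# Without the reference store, both files travel with every request.
without_refs = DOCUMENTS * assets

# With it, the first request carries the files; the rest carry two hashes each.
with_refs = assets + (DOCUMENTS - 1) * 2 * REF_HASH

print(without_refs / 1_000_000, "MB")
print(with_refs / 1_000, "kB")
```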
The feature in texd parlance is called "reference store", and you may think of it as a cache. It saves files server-side (e.g. on disk) and retrieves them on-demand, if you request such a file reference.
A reference hash is simply the Base64-encoded SHA256 checksum of the file contents, prefixed with "sha256:". (Canonically, we use the URL-safe alphabet without padding for the Base64 encoder, but texd also accepts the standard alphabet, and padding characters are ignored in both cases.)
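A minimal sketch of how a client might compute such a reference hash, using only the Python standard library. The function name `reference_hash` is hypothetical and not part of texd.

```python
import base64
import hashlib


def reference_hash(contents: bytes) -> str:
    """Return "sha256:" followed by the URL-safe, unpadded Base64 digest."""
    digest = hashlib.sha256(contents).digest()
    # urlsafe_b64encode uses the "-" and "_" alphabet; strip the "=" padding
    # to obtain the canonical form described above.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256:" + encoded


if __name__ == "__main__":
    print(reference_hash(b"example file contents"))
```

A SHA256 digest is 32 bytes long, which encodes to 43 Base64 characters without padding, so the resulting string is always 50 characters long.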
To use a file reference, you need to set a special content type in the request, and include the
reference hash instead of the file contents. The content type must be application/x.texd; ref=use.
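As a rough illustration, a client could assemble such a multipart part by hand as sketched below. The helper names (`part`, `reference_part`) and the fixed `BOUNDARY` value are assumptions made for this example; in practice an HTTP library would usually build the multipart body for you.

```python
BOUNDARY = "boundary"  # assumed fixed here; must not occur in the payload


def part(name: str, content_type: str, body: bytes) -> bytes:
    """Encode a single multipart/form-data part."""
    head = (
        f"--{BOUNDARY}\r\n"
        f'Content-Disposition: form-data; name="{name}"; filename="{name}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"  # an empty line separates the part headers from the part body
    )
    return head.encode("ascii") + body + b"\r\n"


def reference_part(name: str, ref_hash: str) -> bytes:
    """Send only the reference hash in place of the file contents."""
    return part(name, "application/x.texd; ref=use", ref_hash.encode("ascii"))


def closing_delimiter() -> bytes:
    """Terminate the multipart body."""
    return f"--{BOUNDARY}--\r\n".encode("ascii")
```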
The resulting HTTP request should then look something like this:
POST /render HTTP/1.1
Content-Type: multipart/form-data; boundary=boundary
--boundary
Content-Disposition: form-data; name=input.tex; filename=input.tex
Content-Type: application/octet-stream
[content of input.tex omitted]
--boundary
Content-Disposition: form-data; name=logo.pdf; filename=logo.pdf
Content-Type: application/x.texd; ref=use
sha256:p5w-x0VQUh2kXyYbbv1ubkc-oZ0z7aZYNjSKVVzaZuo=
--boundary--
For unknown reference hashes, texd will respond with an error, and list all unknown references:
HTTP/1.1 422 Unprocessable Entity
Content-Type: application/json
{
  "category": "reference",
  "error": "unknown file references",
  "reference": [
    "sha256:p5w-x0VQUh2kXyYbbv1ubkc-oZ0z7aZYNjSKVVzaZuo="
  ]
}
In such a case, you can repeat your HTTP request, and change the ref=use to ref=store for
matching documents:
POST /render HTTP/1.1
Content-Type: multipart/form-data; boundary=boundary
--boundary
Content-Disposition: form-data; name=input.tex; filename=input.tex
Content-Type: application/octet-stream
[content of input.tex omitted]
--boundary
Content-Disposition: form-data; name=logo.pdf; filename=logo.pdf
Content-Type: application/x.texd; ref=store
[content of logo.pdf omitted]
--boundary--
By default, the reference store is not enabled. You must enable it explicitly, by providing
a command line flag. Assuming you have a local directory ./refs, you instruct texd to use
this directory for references:
$ texd --reference-store=dir://./refs
The actual syntax is --reference-store=DSN, where storage adapters are identified through and
configured with a DSN (data source name, a URL). Currently there are only a handful of implementations:
- The dir:// adapter (docs), which stores reference files on disk in a specified directory. Coincidentally, this adapter also provides an in-memory adapter (memory://), courtesy of the spf13/afero package.
- The memcached:// adapter (docs), which stores, you may have guessed it, reference files in a Memcached instance or cluster.
- The nop:// adapter (docs), which, for completeness, implements a no-op store (i.e. attempts to store reference files into it, or load files from it, fail silently). This adapter is used as fallback if you don't configure any other adapter.
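To illustrate how a DSN selects an adapter, here is a small sketch that splits a DSN into its scheme and the remainder. It is not texd's actual parsing code; the function name `split_dsn` and the `KNOWN_ADAPTERS` set are assumptions for this example, and texd may interpret the part after the scheme differently.

```python
from urllib.parse import urlsplit

# Adapter schemes named in the list above.
KNOWN_ADAPTERS = {"dir", "memory", "memcached", "nop"}


def split_dsn(dsn: str) -> tuple[str, str]:
    """Return (adapter scheme, adapter-specific location) for a DSN."""
    parts = urlsplit(dsn)
    if parts.scheme not in KNOWN_ADAPTERS:
        raise ValueError(f"unknown storage adapter: {parts.scheme!r}")
    # For "dir://./refs", urlsplit puts "." into netloc and "/refs" into path,
    # so joining both recovers the relative directory "./refs".
    return parts.scheme, parts.netloc + parts.path


if __name__ == "__main__":
    print(split_dsn("dir://./refs"))
```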
Further adapters are conceivable in the future, such as additional
key/value stores (redis://), object storages (s3://, minio://), or even an RDBMS (postgresql://,
mariadb://).
texd supports three different retention policies:
- keep (or none) will keep all file references forever. This is the default setting.
- purge-on-start (or just purge) will delete file references once on startup.
- access will keep an access list with LRU semantics, and delete file references, either if a max. number of items is reached, or if the total size of items exceeds a threshold, or both.
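The access policy can be pictured as a least-recently-used list with two quotas. The following is a simplified sketch of that idea, not texd's implementation; the class `AccessList`, its method `touch`, and the convention that 0 disables a limit are assumptions for illustration.

```python
from collections import OrderedDict


class AccessList:
    """LRU bookkeeping with an item quota and a total-size quota."""

    def __init__(self, max_items: int, max_size: int):
        # In this sketch, 0 means "no limit"; one limit must stay active.
        if max_items == 0 and max_size == 0:
            raise ValueError("at least one quota must be set")
        self.max_items = max_items
        self.max_size = max_size
        self.sizes = OrderedDict()  # reference hash -> size, oldest first
        self.total = 0

    def _over_quota(self) -> bool:
        too_many = self.max_items > 0 and len(self.sizes) > self.max_items
        too_big = self.max_size > 0 and self.total > self.max_size
        return too_many or too_big

    def touch(self, ref: str, size: int) -> list[str]:
        """Record an access and return the references to delete."""
        if ref in self.sizes:
            self.sizes.move_to_end(ref)  # mark as most recently used
        else:
            self.sizes[ref] = size
            self.total += size
        evicted = []
        while self._over_quota():
            old_ref, old_size = self.sizes.popitem(last=False)  # oldest entry
            self.total -= old_size
            evicted.append(old_ref)
        return evicted
```

Note that in this sketch a single file larger than the size quota is evicted immediately after being recorded; a real store may handle that case differently.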
To select a specific retention policy, use the --retention-policy CLI flag:
$ texd --reference-store=dir://./refs --retention-policy=purge
To configure the access list (--retention-policy=access), you can adjust the quota to your needs:
$ texd --reference-store=dir://./refs \
--retention-policy=access \
--rp-access-items=1000 \
--rp-access-size=100MB
Notes:
- The default quota for the max. number of items (--rp-access-items) is 1000.
- The default quota for the max. total file size (--rp-access-size) is 100MB.
- Total file size is measured in bytes, common suffixes (100KB, 2MiB, 1.3GB) work as expected.
- To disable either limit, set the value to 0 (e.g. --rp-access-items=0).
- It is an error to disable both limits (in this case just use --retention-policy=keep).
- Currently, only the dir:// (and memory://) adapters support a retention policy; the memcached:// adapter delegates this responsibility to the Memcached server.
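As an illustration of how size values with suffixes might translate to bytes, here is a small parser sketch. It assumes that decimal suffixes (KB, MB, GB) are powers of 1000 and binary suffixes (KiB, MiB, GiB) are powers of 1024; whether texd interprets them exactly this way is an assumption, and `parse_size` is a hypothetical helper, not part of texd.

```python
import re
from decimal import Decimal

# Assumed multipliers: decimal suffixes use 1000, binary suffixes use 1024.
UNITS = {
    "": 1, "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30,
}


def parse_size(text: str) -> int:
    """Convert strings such as "100KB" or "2MiB" to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None or match.group(2).upper() not in UNITS:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    # Decimal avoids floating-point rounding for inputs like "1.3GB".
    return int(Decimal(number) * UNITS[unit.upper()])


if __name__ == "__main__":
    for example in ("100KB", "2MiB", "1.3GB", "0"):
        print(example, parse_size(example))
```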