Skip to content

Implement disk-based cache for remote documents #253

@anatoly-scherbakov

Description

@anatoly-scherbakov

W3C JSON-LD Best Practices §Cache JSON-LD Contexts:

Services providing a JSON-LD Context SHOULD set HTTP cache-control headers to allow liberal caching of such contexts, and clients SHOULD attempt to use a locally cached version of these documents.

The bundled function loaders re-fetch every URL on every call, and _resolved_context_cache is in-process only — so short-lived processes (CLIs, workers, serverless) re-download every context on every run. See #70.

Proposal

Add a DocumentLoader subclass (the ABC introduced in #250) that persists fetched documents to a configurable on-disk cache directory and honors Cache-Control / Expires / ETag / Last-Modified for revalidation.

@dataclass
class CachingDocumentLoader(DocumentLoader):
    cache_dir: Path
    default_max_age: int | None = None

Opt-in; existing loaders unchanged. Async sibling can follow.

Related: #70, #85, #250.

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions