Merged
21 commits
6d202c5
Add `DocumentLoader` ABC and `RemoteDocument` `TypedDict`
anatoly-scherbakov Apr 26, 2026
166d527
Add `FrozenDocumentLoader` and `BUNDLED_CONTEXTS`
anatoly-scherbakov Apr 26, 2026
8bcc516
Vendor `https://www.w3.org/ns/activitystreams` context
anatoly-scherbakov Apr 26, 2026
a56be1f
Vendor `https://www.w3.org/ns/did/v1` context
anatoly-scherbakov Apr 26, 2026
4d278b0
Vendor `https://www.w3.org/2018/credentials/v1` context
anatoly-scherbakov Apr 26, 2026
ea8c083
Vendor `https://www.w3.org/ns/credentials/v2` context
anatoly-scherbakov Apr 26, 2026
fcd3889
Vendor `https://w3id.org/security/v1` context
anatoly-scherbakov Apr 26, 2026
d491d0e
Vendor `https://w3id.org/security/v2` context
anatoly-scherbakov Apr 26, 2026
de4e3ae
Vendor `https://w3id.org/security/suites/ed25519-2020/v1` context
anatoly-scherbakov Apr 26, 2026
6d70d58
Vendor `https://w3id.org/security/suites/jws-2020/v1` context
anatoly-scherbakov Apr 26, 2026
646dc13
Add `scripts/download_contexts.py` to refresh bundled contexts
anatoly-scherbakov Apr 26, 2026
dd782e1
Add `download-bundled-contexts` Makefile target
anatoly-scherbakov Apr 26, 2026
f5c543f
Add tests for `FrozenDocumentLoader` and `BUNDLED_CONTEXTS`
anatoly-scherbakov Apr 26, 2026
694ab5b
Re-export `DocumentLoader`, `FrozenDocumentLoader`, `BUNDLED_CONTEXTS…
anatoly-scherbakov Apr 26, 2026
5416db4
Ship bundled `*.jsonld` contexts in `pyld.documentloader.frozen` package
anatoly-scherbakov Apr 26, 2026
eb84a1d
Include bundled `*.jsonld` contexts in source distributions
anatoly-scherbakov Apr 26, 2026
128b404
Document `FrozenDocumentLoader` and `BUNDLED_CONTEXTS` in README
anatoly-scherbakov Apr 26, 2026
5b9aae9
Add Sphinx autoclass entries for `DocumentLoader` and `FrozenDocument…
anatoly-scherbakov Apr 26, 2026
a99447f
Note `FrozenDocumentLoader` and `BUNDLED_CONTEXTS` in 3.1.0 changelog
anatoly-scherbakov Apr 26, 2026
005e0f8
Replace FrozenDocumentLoader dataclass initialization
anatoly-scherbakov May 4, 2026
ffc27a7
Add FrozenDocumentLoader test docstrings
anatoly-scherbakov May 4, 2026
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,21 @@
# pyld ChangeLog

## 3.1.0 - unreleased

### Added
- `pyld.DocumentLoader` abstract base class for class-based document loaders,
with a `RemoteDocument` `TypedDict` describing the expected return shape.
- `pyld.FrozenDocumentLoader`: a class-based loader that serves only URLs in
its `documents` allowlist and refuses everything else with
`JsonLdError(code='loading document failed')`. Instantiating with no
arguments serves the curated `pyld.BUNDLED_CONTEXTS` set; instantiating
with `dict(BUNDLED_CONTEXTS, **extras)` extends the bundle. Suitable for
air-gapped, reproducible-build, and security-hardened deployments.
- `pyld.BUNDLED_CONTEXTS`: curated mapping of high-traffic public W3C / W3ID
JSON-LD contexts (ActivityStreams, DID v1, VC v1/v2, Linked Data Security
v1/v2, Ed25519-2020, JWS-2020) to vendored on-disk copies. Refresh with
`make download-bundled-contexts`.

## 3.0.0 - 2026-04-02

### Changed
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -1 +1,2 @@
include README.rst README.txt LICENSE CHANGELOG.md
recursive-include lib/pyld/documentloader/frozen/bundled *.jsonld
5 changes: 4 additions & 1 deletion Makefile
@@ -1,4 +1,4 @@
.PHONY: install test upgrade-submodules
.PHONY: install test upgrade-submodules download-bundled-contexts

install:
	pip install -e .
@@ -9,6 +9,9 @@ test:
upgrade-submodules:
	git submodule update --remote --init --recursive

download-bundled-contexts:
	python scripts/download_contexts.py

RUFF_TARGET = lib/pyld/*.py tests/*.py

lint:
29 changes: 29 additions & 0 deletions README.rst
@@ -170,6 +170,35 @@ If Requests_ is not available, the loader is set to aiohttp_. The fallback
document loader is a dummy document loader that raises an exception on every
invocation.

Frozen Document Loader
~~~~~~~~~~~~~~~~~~~~~~

For air-gapped runs, reproducible builds, and security-hardened deployments
that must not perform any remote context fetches at all, PyLD ships
``FrozenDocumentLoader``: a class-based loader that serves only the URLs in
its ``documents`` allowlist and refuses everything else with
``JsonLdError(code='loading document failed')``.

Instantiating with no arguments serves the curated ``BUNDLED_CONTEXTS`` set
(ActivityStreams, DID v1, Verifiable Credentials v1 and v2, Linked Data
Security v1/v2, Ed25519-2020, and JWS-2020). To extend the bundle with
additional pre-vetted contexts, pass a merged mapping:

.. code-block:: Python

    from pathlib import Path

    from pyld import jsonld, FrozenDocumentLoader, BUNDLED_CONTEXTS

    loader = FrozenDocumentLoader(documents=dict(
        BUNDLED_CONTEXTS,
        # Extra pre-vetted context: URL -> local file, parsed on first request.
        **{'https://example.com/my-ctx': Path('contexts/my-ctx.jsonld')},
    ))
    jsonld.expand(doc, options={'documentLoader': loader})

This honors the W3C *JSON-LD Best Practices* recommendation that clients
SHOULD attempt to use a locally cached version of contexts (see
`§ Cache JSON-LD Contexts <https://w3c.github.io/json-ld-bp/#cache-json-ld-contexts>`_).
Refresh the bundled copies with ``make download-bundled-contexts``.

Handling ignored properties during JSON-LD expansion
----------------------------------------------------

7 changes: 7 additions & 0 deletions docs/index.rst
@@ -22,6 +22,13 @@ API Reference
.. autofunction:: frame
.. autofunction:: normalize

.. module:: pyld
.. autoclass:: DocumentLoader
   :members:
.. autoclass:: FrozenDocumentLoader
   :members:
.. autodata:: BUNDLED_CONTEXTS

Indices and tables
------------------

11 changes: 10 additions & 1 deletion lib/pyld/__init__.py
@@ -2,5 +2,14 @@

from . import jsonld
from .context_resolver import ContextResolver
from .documentloader.base import DocumentLoader, RemoteDocument
from .documentloader.frozen import BUNDLED_CONTEXTS, FrozenDocumentLoader

__all__ = ['jsonld', 'ContextResolver']
__all__ = [
    'BUNDLED_CONTEXTS',
    'ContextResolver',
    'DocumentLoader',
    'FrozenDocumentLoader',
    'RemoteDocument',
    'jsonld',
]
44 changes: 44 additions & 0 deletions lib/pyld/documentloader/base.py
@@ -0,0 +1,44 @@
"""
Abstract base class for class-based JSON-LD document loaders.

.. module:: jsonld.documentloader.base
  :synopsis: DocumentLoader abstract base class
"""

from abc import ABC, abstractmethod
from typing import Any, TypedDict


class RemoteDocument(TypedDict):
    """Shape returned by a JSON-LD document loader.

    Mirrors the *RemoteDocument* structure defined in the W3C JSON-LD 1.1 API
    (https://www.w3.org/TR/json-ld11-api/#remotedocument).
    """

    contentType: str
    contextUrl: str | None
    documentUrl: str
    document: Any


class DocumentLoader(ABC):
    """Abstract base class for class-based JSON-LD document loaders.

    Concrete subclasses implement :meth:`__call__` to fetch a document for a
    given URL and return a :class:`RemoteDocument`.

    Existing function-based loaders (:func:`pyld.jsonld.requests_document_loader`,
    :func:`pyld.jsonld.aiohttp_document_loader`) remain valid: pyld's loader
    contract is "any callable with the right signature". This ABC is for new
    class-based loaders only.
    """

    @abstractmethod
    def __call__(self, url: str, options: dict) -> RemoteDocument:
        """Retrieve the JSON-LD document at ``url``.

        :param url: the URL to retrieve.
        :param options: loader options (e.g. ``headers``).
        :return: a :class:`RemoteDocument`.
        """
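
The docstring's point that the loader contract is "any callable with the right signature" can be illustrated with a small class-based loader that wraps a function-based one. This sketch is not part of the PR: `DocumentLoader` is re-declared locally as a stand-in for the ABC above so the snippet runs without pyld installed, and `RecordingLoader` and `fake_loader` are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class DocumentLoader(ABC):
    """Local stand-in for the ABC in this diff (assumed same signature)."""

    @abstractmethod
    def __call__(self, url: str, options: dict) -> dict[str, Any]:
        """Return a RemoteDocument-shaped dict for ``url``."""


class RecordingLoader(DocumentLoader):
    """Hypothetical loader: delegates to any callable and records each URL."""

    def __init__(self, inner: Callable[[str, dict], dict[str, Any]]) -> None:
        self.inner = inner
        self.seen: list[str] = []

    def __call__(self, url: str, options: dict) -> dict[str, Any]:
        # Remember the request, then hand off to the wrapped loader.
        self.seen.append(url)
        return self.inner(url, options)


def fake_loader(url: str, options: dict) -> dict[str, Any]:
    """Hypothetical function-based loader that returns a canned document."""
    return {
        'contentType': 'application/ld+json',
        'contextUrl': None,
        'documentUrl': url,
        'document': {'@context': {}},
    }


loader = RecordingLoader(fake_loader)
remote = loader('https://example.com/ctx', {})
```

Because the wrapper is itself a callable with the `(url, options)` signature, it could in principle be passed wherever a function-based loader is accepted. The abstract `__call__` also means the base class cannot be instantiated directly.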
83 changes: 83 additions & 0 deletions lib/pyld/documentloader/frozen/__init__.py
@@ -0,0 +1,83 @@
"""
Frozen JSON-LD document loader.

A document loader that serves *only* the URLs in its ``documents`` allowlist
and refuses everything else with :class:`pyld.jsonld.JsonLdError`. Suitable for
secure / air-gapped / privacy-sensitive deployments and for honoring the
guidance in the W3C *JSON-LD Best Practices* note that clients SHOULD attempt
to use a locally cached version of contexts (§ Cache JSON-LD Contexts,
https://w3c.github.io/json-ld-bp/#cache-json-ld-contexts).

This module also defines :data:`BUNDLED_CONTEXTS`, a curated mapping of
high-traffic public W3C / W3ID JSON-LD context URLs to vendored on-disk copies
shipped with the package. See ``scripts/download_contexts.py`` for how the
files in ``bundled/`` are refreshed.

.. module:: jsonld.documentloader.frozen
  :synopsis: FrozenDocumentLoader and BUNDLED_CONTEXTS
"""

import json
from collections.abc import Mapping
from pathlib import Path

from pyld.documentloader.base import DocumentLoader, RemoteDocument
from pyld.jsonld import JsonLdError

_BUNDLED_DIR = Path(__file__).parent / 'bundled'


BUNDLED_CONTEXTS: Mapping[str, Path] = {
    'https://www.w3.org/ns/activitystreams': _BUNDLED_DIR / 'activitystreams.jsonld',
    'https://www.w3.org/ns/did/v1': _BUNDLED_DIR / 'did-v1.jsonld',
    'https://www.w3.org/2018/credentials/v1': _BUNDLED_DIR / 'credentials-v1.jsonld',
    'https://www.w3.org/ns/credentials/v2': _BUNDLED_DIR / 'credentials-v2.jsonld',
    'https://w3id.org/security/v1': _BUNDLED_DIR / 'security-v1.jsonld',
    'https://w3id.org/security/v2': _BUNDLED_DIR / 'security-v2.jsonld',
    'https://w3id.org/security/suites/ed25519-2020/v1': _BUNDLED_DIR
    / 'security-ed25519-2020-v1.jsonld',
    'https://w3id.org/security/suites/jws-2020/v1': _BUNDLED_DIR
    / 'security-jws-2020-v1.jsonld',
}


class FrozenDocumentLoader(DocumentLoader):
    """Document loader that serves only a sealed allowlist of URLs.

    ``documents`` maps each allowed URL to either a parsed JSON-LD ``dict`` or
    a :class:`pathlib.Path` pointing to a JSON file on disk. Path entries are
    read and parsed lazily on first request, then cached in place so subsequent
    calls skip the file read. Any URL not present in the mapping raises
    :class:`pyld.jsonld.JsonLdError` with code ``'loading document failed'``.

    With no arguments, a ``FrozenDocumentLoader`` serves the curated
    :data:`BUNDLED_CONTEXTS` set. To extend rather than replace the bundle::

        FrozenDocumentLoader(documents=dict(BUNDLED_CONTEXTS, **extras))
    """

    def __init__(self, documents: Mapping[str, dict | Path] | None = None) -> None:
        # Take ownership of the mapping so we can cache parsed Paths in place
        # without mutating the caller's dict.
        if documents is None:
            documents = BUNDLED_CONTEXTS
        self.documents = dict(documents)

    def __call__(self, url: str, options: dict) -> RemoteDocument:
        if url not in self.documents:
            raise JsonLdError(
                'Refusing to load document outside the allowed set.',
                'jsonld.LoadDocumentError',
                {'url': url},
                code='loading document failed',
            )
        value: dict | Path = self.documents[url]
        if isinstance(value, Path):
            value = json.loads(value.read_text(encoding='utf-8'))
            self.documents[url] = value
        return {
            'contentType': 'application/ld+json',
            'contextUrl': None,
            'documentUrl': url,
            'document': value,
        }
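
The lazy caching, bundle extension, and refusal behaviour described in the class docstring can be sketched as follows. This is a simplified stand-in rather than the PR's code: `AllowlistLoader` mirrors the `__call__` logic above, `LookupError` stands in for `JsonLdError`, the `bundled` dict stands in for `BUNDLED_CONTEXTS`, and the URLs and file name are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


class AllowlistLoader:
    """Stand-in mirroring FrozenDocumentLoader's logic (assumption, not pyld)."""

    def __init__(self, documents):
        # Copy so cached parses never mutate the caller's mapping.
        self.documents = dict(documents)

    def __call__(self, url, options):
        if url not in self.documents:
            # The real loader raises JsonLdError; LookupError is a stand-in.
            raise LookupError(f'refusing to load {url}')
        value = self.documents[url]
        if isinstance(value, Path):
            # First request for a Path entry: parse once, then cache the dict.
            value = json.loads(value.read_text(encoding='utf-8'))
            self.documents[url] = value
        return {
            'contentType': 'application/ld+json',
            'contextUrl': None,
            'documentUrl': url,
            'document': value,
        }


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical vendored context file on disk.
    ctx_path = Path(tmp) / 'my-ctx.jsonld'
    ctx_path.write_text(
        json.dumps({'@context': {'name': 'http://schema.org/name'}}),
        encoding='utf-8',
    )

    bundled = {'https://example.com/bundled': {'@context': {}}}
    extras = {'https://example.com/my-ctx': ctx_path}
    # Same merge idiom as in the docstring: extras extend the bundle.
    loader = AllowlistLoader(dict(bundled, **extras))

    first = loader('https://example.com/my-ctx', {})
    was_cached = isinstance(loader.documents['https://example.com/my-ctx'], dict)
    caller_untouched = isinstance(extras['https://example.com/my-ctx'], Path)

    try:
        loader('https://example.com/not-allowed', {})
        refused = False
    except LookupError:
        refused = True
```

After the first call, the loader's own mapping holds the parsed dict while the caller's `extras` still holds the `Path`, which is the effect of copying the mapping in `__init__`. If both mappings contained the same URL, the `dict(bundled, **extras)` idiom would let the entry from `extras` win.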