feat: add RunConfig jinja rendering engine by eric-tramel · Pull Request #557 · NVIDIA-NeMo/DataDesigner

eric-tramel · 2026-04-17T14:13:44Z

Summary

This PR adds a RunConfig-level selector for engine-side Jinja rendering so users can choose between Data Designer's hardened renderer and the broader native Jinja2 sandbox. The public interface is JinjaRenderingEngine.SECURE versus JinjaRenderingEngine.NATIVE, with SECURE as the default so existing deployments do not silently lose template hardening. It also adds user-facing startup logs so create and preview show which Jinja mode is active.

Changes

Added

Add JinjaRenderingEngine to RunConfig and export it from data_designer.config
Add a new Security concept page that explains trusted versus untrusted deployment models, SECURE versus NATIVE, and the extra hardening provided by the secure renderer
Add startup log lines in packages/data-designer/src/data_designer/interface/data_designer.py that show the active Jinja mode with 🔒 / 🏠

Changed

Default RunConfig to JinjaRenderingEngine.SECURE, with NATIVE available as an explicit opt-in
Route the shared renderer seam in packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py through either the native Jinja sandbox or the existing hardened renderer
Align prompt rendering and sampling helpers so direct engine call sites inherit the same secure-by-default behavior as RunConfig
Rename the public secure mode from the internal codename to SECURE
Allow upper in the secure filter allowlist while keeping the explicit narrow filter policy in environment.py
Update docs/code_reference/run_config.md and docs/concepts/deployment-options.md to point users to the new security guidance
Refine docs/concepts/security.md with a compatibility matrix, implementation-backed rationale, filter guidance, and release-pinned source links

Fixed

Expose jsonpath in the native sandbox so switching from SECURE to NATIVE does not lose the Data Designer filter
Update engine tests that had been pinned to the temporary native-default behavior so they now reflect the secure default and explicit native opt-in
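
The default-versus-opt-in behavior listed above can be pictured with a minimal sketch. Only the names JinjaRenderingEngine, SECURE, NATIVE, and jinja_rendering_engine come from this PR. The dataclass is a hypothetical stand-in for the real RunConfig, and the string values are assumed from the startup log lines.

```python
from dataclasses import dataclass
from enum import Enum


class JinjaRenderingEngine(str, Enum):
    """Stand-in for the enum added in this PR (string values are assumed)."""

    SECURE = "secure"
    NATIVE = "native"


@dataclass(frozen=True)
class RunConfigSketch:
    """Hypothetical, stripped-down stand-in for RunConfig."""

    # SECURE is the default, so callers that pass nothing keep the hardening.
    jinja_rendering_engine: JinjaRenderingEngine = JinjaRenderingEngine.SECURE


default_config = RunConfigSketch()
native_config = RunConfigSketch(jinja_rendering_engine=JinjaRenderingEngine.NATIVE)
```

Keeping NATIVE reachable only through an explicit argument means a deployment cannot drift into the broader sandbox by omission.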

Usage

`SECURE` (default)

```python
from data_designer import DataDesigner
import data_designer.config as dd

designer = DataDesigner()
designer.set_run_config(
    dd.RunConfig(
        jinja_rendering_engine=dd.JinjaRenderingEngine.SECURE,
    )
)
```

`NATIVE` (explicit opt-in)

```python
from data_designer import DataDesigner
import data_designer.config as dd

designer = DataDesigner()
designer.set_run_config(
    dd.RunConfig(
        jinja_rendering_engine=dd.JinjaRenderingEngine.NATIVE,
    )
)
```

Startup Logs

```
[12:34:56] [INFO]   |-- 🔒 Jinja rendering engine: secure
[12:35:10] [INFO]   |-- 🏠 Jinja rendering engine: native
```
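
A rough sketch of how such a line could be produced follows. The icons and wording mirror the log lines above. The function names, the icon mapping, and the value of the indent string are illustrative assumptions, not the code in data_designer.py.

```python
import logging

# Assumed values: only the icons and the message wording are taken from the
# log lines in this PR; the indent string is a guess at the house style.
LOG_INDENT = "  |-- "
MODE_ICONS = {"secure": "🔒", "native": "🏠"}


def format_jinja_mode_line(mode: str) -> str:
    """Build the message announcing the active Jinja rendering engine."""
    icon = MODE_ICONS[mode]  # raises KeyError for an unknown mode
    return f"{LOG_INDENT}{icon} Jinja rendering engine: {mode}"


def log_jinja_mode(logger: logging.Logger, mode: str) -> None:
    """Emit the mode line at INFO level, once per create/preview call."""
    logger.info(format_jinja_mode_line(mode))
```

Separating the formatting from the logging call keeps the message easy to assert on in tests without capturing log output.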

Attention Areas

Reviewers: Please pay special attention to the following:

packages/data-designer-config/src/data_designer/config/run_config.py - this is the public API and default-behavior change that affects existing deployments
packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py - this remains the central renderer-selection seam and the secure filter policy implementation
packages/data-designer/src/data_designer/interface/data_designer.py - this is the user-facing logging change for create and preview
docs/concepts/security.md - this is the main user-facing explanation of why SECURE exists and when NATIVE is appropriate

Related Issues

Testing

uv run --all-packages ruff check packages/data-designer-config/src/data_designer/config/run_config.py packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py packages/data-designer-engine/src/data_designer/engine/column_generators/utils/prompt_renderer.py packages/data-designer-engine/src/data_designer/engine/sampling_gen/jinja_utils.py packages/data-designer-engine/src/data_designer/engine/sampling_gen/generator.py packages/data-designer-config/tests/config/test_run_config.py packages/data-designer-engine/tests/engine/column_generators/utils/test_prompt_renderer.py packages/data-designer-engine/tests/engine/sampling_gen/test_jinja_utils.py packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py packages/data-designer-engine/tests/engine/column_generators/generators/test_image.py docs/code_reference/run_config.md docs/concepts/security.md
uv run --all-packages pytest packages/data-designer-config/tests/config/test_run_config.py
uv run --all-packages pytest packages/data-designer-engine/tests
uv run --all-packages pytest packages/data-designer/tests/interface/test_data_designer.py -k "logs_secure_jinja_rendering_mode or logs_native_jinja_rendering_mode"
uv run --all-packages pytest packages/data-designer/tests/interface/test_data_designer.py
This file currently has an unrelated default-provider mismatch in the existing test fixture setup (brr-local vs stub-model-provider).
uv run --group docs mkdocs build --strict
Current repo-wide documentation warnings still cause strict mode to abort; the new Security page did not introduce additional strict-mode failures.

Checklist

Follows commit message conventions
Commits are signed off (DCO)
Architecture docs updated (if applicable): N/A

Description updated with AI

- add a RunConfig enum/field that selects native Jinja by default while preserving ginja as an opt-in hardened mode
- route shared prompt and sampler rendering through the selected engine instead of hardcoding ginja behavior
- cover the new selection path with config, engine, and docs updates

Refs #87
Refs #550

Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>

github-actions · 2026-04-17T14:14:46Z

Docs preview: https://5dad36cc.dd-docs-preview.pages.dev

Notebook tutorials are placeholder-only in previews.

- opt the ginja mixin regression tests into the hardened renderer explicitly now that RunConfig defaults engine rendering to native
- update the image empty-prompt assertion to expect the native-mode ValueError surfaced by the generator

Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>

- rename the public RunConfig enum option from ginja to secure so the interface reads as native versus secure
- update docs and tests to use the new public enum member and keep the hardened renderer wired through the same engine seam

Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>

andreatgretel · 2026-04-17T15:51:14Z

nice design - the centralized seam in _create_render_environment is clean. two things that need fixing before merge: (1) extract_column_names_from_expression breaks on jsonpath expressions now (verified locally), and (2) NativeJinjaSandboxEnvironment doesn't register jsonpath, so NATIVE is missing a filter that SECURE has. also worth considering whether the rendered-output guards (empty, length) should live in both modes since they're safety checks, not syntax restrictions.
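
The last point, that rendered-output guards are safety checks rather than syntax restrictions, could be sketched as a helper that both modes call after rendering. Everything here is hypothetical: the function name, the error type, and the length limit are illustrative, and the PR does not say whether this suggestion was adopted.

```python
class RenderedOutputError(ValueError):
    """Hypothetical error type for rendered output that fails a safety check."""


def check_rendered_output(rendered: str, max_length: int = 10_000) -> str:
    """Reject empty or oversized rendered text, regardless of rendering mode."""
    if not rendered.strip():
        raise RenderedOutputError("rendered template is empty")
    if len(rendered) > max_length:
        raise RenderedOutputError(
            f"rendered template has {len(rendered)} characters, limit is {max_length}"
        )
    return rendered
```

Because the check only looks at the final string, it is independent of which sandbox produced it.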

eric-tramel · 2026-04-17T17:32:10Z

Ran a few representative DataDesigner workflows with RunConfig.jinja_rendering_engine toggled between SECURE and NATIVE to validate the mode seam end to end.

What I ran:

- Tutorial-style Jinja workflow based on docs/notebook_source/2-structured-outputs-and-jinja-expressions.py
  - Expression columns
  - {% if %} conditional logic
  - SkipConfig
  - Outcome: SECURE and NATIVE produced identical results for the supported Jinja features.
- Processor workflow based on docs/concepts/processors.md
  - Real create() run
  - SchemaTransformProcessorConfig
  - {{ category | upper }} in the processor template
  - Outcome: SECURE and NATIVE produced the same main dataset and the same processor output.
- Native-only broader Jinja check
  - Expression: {{ tags | join(', ') }} over a list column
  - Outcome: SECURE failed as expected because join is still outside the secure filter allowlist.
  - Outcome: NATIVE succeeded and rendered the joined strings as expected.

Bottom line:

Supported Jinja features behave the same in both modes.
The secure-only restrictions are still enforced in SECURE.
The broader Jinja surface is available in NATIVE.
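
The allowlist behavior behind these results can be illustrated with a toy check. This is a regex approximation written for this note, not the policy in environment.py: a real implementation would inspect the parsed template rather than raw text, and the regex is misled by a `|` inside a string literal. The allowlist shown is partial; this PR confirms only that `upper`, `lower`, and `jsonpath` work in SECURE and that `join` does not.

```python
import re

# Partial, illustrative allowlist (see the caveats in the lead-in).
SECURE_FILTERS_SKETCH = frozenset({"upper", "lower", "jsonpath"})

_EXPRESSION = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)
_FILTER_NAME = re.compile(r"\|\s*([A-Za-z_]\w*)")


def find_filters(template: str) -> set[str]:
    """Collect filter names used inside {{ ... }} expressions (approximate)."""
    names: set[str] = set()
    for expression in _EXPRESSION.findall(template):
        names.update(_FILTER_NAME.findall(expression))
    return names


def disallowed_filters(
    template: str, allowed: frozenset[str] = SECURE_FILTERS_SKETCH
) -> set[str]:
    """Return the filters a template uses that are missing from the allowlist."""
    return find_filters(template) - allowed
```

Under this sketch, the processor template from the second workflow passes, while the `join` expression from the third is flagged.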

These were actual preview() / create() runs with temp artifact directories and a dummy default-named provider so the checks stayed model-free and isolated the Jinja behavior itself.

github-actions · 2026-04-17T17:54:34Z

Code Review: PR #557 — feat: add RunConfig jinja rendering engine

Summary

This PR adds a JinjaRenderingEngine enum (SECURE / NATIVE) to RunConfig, allowing users to choose between Data Designer's hardened Jinja renderer (default) and Jinja2's built-in sandbox. The change spans config, engine, and interface layers with a new NativeJinjaSandboxEnvironment class, a uniform render_template() adapter, and user-facing startup logs. A comprehensive new Security documentation page is included.

Scope: 20 files changed, +575 / -20 lines. Mix of production code (config enum, environment routing, plumbing), tests, and documentation.

CI Status: All checks passing (one engine test pending at review time).

Findings

Severity: Low

1. _get_jinja_rendering_engine uses hasattr duck-typing (environment.py:497-502)

```python
def _get_jinja_rendering_engine(self) -> JinjaRenderingEngine:
    if hasattr(self, "_jinja_rendering_engine"):
        return JinjaRenderingEngine(getattr(self, "_jinja_rendering_engine"))
    if hasattr(self, "_resource_provider"):
        return JinjaRenderingEngine(self._resource_provider.run_config.jinja_rendering_engine)
    return JinjaRenderingEngine.SECURE
```

This works but is fragile — the mixin silently falls back to SECURE if neither attribute exists. A missing attribute could mask a wiring bug where the engine setting was never plumbed through. Consider adding a class-level _jinja_rendering_engine attribute with a sentinel or making it a required argument, so missing wiring fails loudly rather than silently defaulting.
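
One way to picture the sentinel suggestion is sketched below. The class names are hypothetical and the enum is a local stand-in. Note that this is the reviewer's alternative, not the merged behavior: the PR's tests cover the mixin falling back to SECURE when no engine is wired in.

```python
from enum import Enum


class JinjaRenderingEngine(str, Enum):
    """Local stand-in for the enum added in this PR."""

    SECURE = "secure"
    NATIVE = "native"


_UNSET = object()  # sentinel meaning "no engine was ever wired in"


class RenderingMixinSketch:
    """Hypothetical mixin that refuses to guess when the engine was never set."""

    _jinja_rendering_engine: object = _UNSET

    def _get_jinja_rendering_engine(self) -> JinjaRenderingEngine:
        if self._jinja_rendering_engine is _UNSET:
            raise RuntimeError(
                f"{type(self).__name__} did not set _jinja_rendering_engine"
            )
        return JinjaRenderingEngine(self._jinja_rendering_engine)


class WiredRenderer(RenderingMixinSketch):
    def __init__(self, engine: JinjaRenderingEngine = JinjaRenderingEngine.SECURE) -> None:
        self._jinja_rendering_engine = engine


class UnwiredRenderer(RenderingMixinSketch):
    """Forgets to set the engine, so lookups fail loudly."""
```

The trade-off is that a silent SECURE fallback is the safer failure mode for users, while a loud failure is the better signal for developers chasing a wiring bug.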

2. NativeJinjaSandboxEnvironment.validate_template catches its own UserTemplateError (environment.py:423-432)

```python
def validate_template(self, user_template: str) -> None:
    try:
        ...
        if len(unallowed_vars) > 0:
            raise UserTemplateError(...)
    except Exception as exception:
        maybe_handle_missing_filter_exception(exception, ...)
        raise exception
```

The UserTemplateError raised inside the try block is caught by the broad except Exception, routed through maybe_handle_missing_filter_exception (which won't match it), and then re-raised. Functionally correct, but the UserTemplateError could be raised after the try/except block or caught explicitly to avoid the unnecessary filter-check detour.

Also, raise exception should be a bare raise, the idiomatic way to re-raise the active exception with its original traceback. The same comment applies to the render_template method at line 450.
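
The first suggestion in this finding, moving the UserTemplateError out of the try block, might look like the sketch below. The toy parser and the error class are stand-ins written for illustration. The real method parses with Jinja2 and routes failures through maybe_handle_missing_filter_exception, whose arguments are not shown in this review.

```python
import re


class UserTemplateError(Exception):
    """Stand-in for the project's template error type."""


def parse_variables(template: str) -> set[str]:
    """Toy parser: names referenced as {{ name }}; rejects unbalanced braces."""
    if template.count("{{") != template.count("}}"):
        raise SyntaxError("unbalanced expression delimiters")
    return set(re.findall(r"\{\{\s*([A-Za-z_]\w*)", template))


def validate_template(template: str, allowed: set[str]) -> None:
    try:
        # Only the step that can fail unexpectedly lives inside the try block.
        referenced = parse_variables(template)
    except SyntaxError as exc:
        # Translate parser failures; anything else propagates untouched.
        raise UserTemplateError(f"invalid template: {exc}") from exc

    # Raised outside the try block, so no handler intercepts it.
    unallowed = referenced - allowed
    if unallowed:
        raise UserTemplateError(f"unknown variables: {sorted(unallowed)}")
```

Narrowing the try block this way also makes it obvious which failures the handler is meant to translate.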

3. NativeJinjaSandboxEnvironment declares class-level attribute annotations (environment.py:404-405)

```python
class NativeJinjaSandboxEnvironment(ImmutableSandboxedEnvironment):
    allowed_references: list[str]
    _prefer_dict_key_access: bool
```

These are bare class-level annotations for attributes that are assigned per instance in __init__; they declare types and do not create class attributes. Because the class inherits from Jinja2's Environment, it is worth confirming that the annotations do not conflict with the parent class. Tests pass, so it appears safe. The existing UserTemplateSandboxEnvironment does the same, so the pattern is consistent with this codebase.

4. extract_column_names_from_expression always uses NativeJinjaSandboxEnvironment (jinja_utils.py:73)

```python
def extract_column_names_from_expression(expr: str) -> set[str]:
    return meta.find_undeclared_variables(NativeJinjaSandboxEnvironment().parse("{{ " + expr + " }}"))
```

This changed from UserTemplateSandboxEnvironment().get_references(...) to directly using meta.find_undeclared_variables on a NativeJinjaSandboxEnvironment. Since this is pure AST parsing (no rendering), the change is safe. However, it now constructs a NativeJinjaSandboxEnvironment each call just to access parse() — a minor inefficiency but not a concern unless called in hot paths.

Severity: Info / Positive Observations

5. Clean polymorphic adapter pattern

Adding render_template() to UserTemplateSandboxEnvironment as a thin wrapper around safe_render() creates a uniform interface for the factory method _create_render_environment(). This avoids modifying the existing safe_render() method and is a clean approach.
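
The adapter-plus-factory shape described here can be sketched without Jinja2. All class and function names below are hypothetical, the safe_render signature is assumed, and string.Template (with its `$name` syntax) stands in for real Jinja rendering purely to keep the example dependency-free.

```python
from enum import Enum
from string import Template
from typing import Protocol


class JinjaRenderingEngine(str, Enum):
    """Local stand-in for the enum added in this PR."""

    SECURE = "secure"
    NATIVE = "native"


class RenderEnvironment(Protocol):
    def render_template(self, template: str, record: dict) -> str: ...


class SecureEnvironmentSketch:
    """Stand-in for the hardened renderer, which already exposes safe_render()."""

    def safe_render(self, template: str, record: dict) -> str:
        return Template(template).substitute(record)

    def render_template(self, template: str, record: dict) -> str:
        # Thin adapter: same behavior as safe_render, uniform method name.
        return self.safe_render(template, record)


class NativeEnvironmentSketch:
    """Stand-in for the native sandbox, which only needs render_template()."""

    def render_template(self, template: str, record: dict) -> str:
        return Template(template).substitute(record)


def create_render_environment(engine: JinjaRenderingEngine) -> RenderEnvironment:
    """Single selection seam: callers never branch on the engine themselves."""
    if engine is JinjaRenderingEngine.NATIVE:
        return NativeEnvironmentSketch()
    return SecureEnvironmentSketch()
```

Callers then write `create_render_environment(engine).render_template(...)` and stay unaware of which class they received.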

6. upper filter addition to secure allowlist (environment.py:60)

Adding upper to ALLOWED_JINJA_FILTERS is a sensible inclusion — lower was already allowed, and upper has the same risk profile.

7. jsonpath filter exposed in native sandbox (environment.py:416)

Good catch — without this, switching from SECURE to NATIVE would silently lose the jsonpath custom filter.

8. Thorough security documentation (docs/concepts/security.md)

The new Security page is well-structured with a compatibility matrix, CVE references, and clear guidance on when to use each mode. The trust-model framing (trusted/monolithic vs. untrusted/shared) is helpful for users making deployment decisions.

9. Good test coverage

Tests cover both modes at every layer: config defaults, environment rendering, prompt renderer, JinjaDataFrame, expression generators, and interface logging. The tests correctly verify that SECURE rejects filters that NATIVE allows (e.g., join).

10. Startup logging

The _log_jinja_rendering_engine_mode() method in the interface provides clear user-visible feedback about which rendering mode is active, with distinct icons for each mode.

Nits

PR body notes two unchecked test items: the full test_data_designer.py suite (pre-existing fixture mismatch) and mkdocs build --strict (pre-existing warnings). These are pre-existing issues, not introduced by this PR.

Verdict

Approve — This is a well-structured feature addition that follows the codebase's layered architecture (config -> engine -> interface). The default is secure, the opt-in is explicit, documentation is thorough, test coverage is good, and CI is green. The findings above are all low-severity suggestions for minor code quality improvements, none of which block merge.

greptile-apps · 2026-04-17T17:58:42Z

Greptile Summary

This PR introduces JinjaRenderingEngine.SECURE / JinjaRenderingEngine.NATIVE as a RunConfig toggle that controls which Jinja rendering environment is used at execution time, with SECURE as the default to preserve existing hardening behaviour. It wires the selection through the full call chain (DatasetGenerator, RecordBasedPromptRenderer, JinjaDataFrame) via a new _create_render_environment factory on the WithJinja2UserTemplateRendering mixin, adds a new NativeJinjaSandboxEnvironment class for the native path, and surfaces startup log lines so the active mode is visible to operators.

Confidence Score: 5/5

Safe to merge; SECURE is the default so existing deployments are unaffected, and the NATIVE path is an explicit opt-in with no regressions in the changed call sites.

All findings are P2 or absent. The engine-selection logic is correct, the fallback chain in _get_jinja_rendering_engine is sound, NativeJinjaSandboxEnvironment properly delegates to ImmutableSandboxedEnvironment, and the call chain propagation through DatasetGenerator, RecordBasedPromptRenderer, and JinjaDataFrame is complete and consistent. Test coverage spans unit, integration, and interface layers.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py` | Core rendering seam: adds NativeJinjaSandboxEnvironment, _create_render_environment factory, and _get_jinja_rendering_engine fallback chain; render_template delegation is clean and consistent. |
| `packages/data-designer-config/src/data_designer/config/run_config.py` | Adds JinjaRenderingEngine StrEnum and jinja_rendering_engine field to RunConfig with SECURE default; public API change is backward-compatible. |
| `packages/data-designer-engine/src/data_designer/engine/column_generators/utils/prompt_renderer.py` | RecordBasedPromptRenderer now accepts and stores jinja_rendering_engine, which is picked up by the mixin's _get_jinja_rendering_engine via getattr; default is SECURE. |
| `packages/data-designer-engine/src/data_designer/engine/sampling_gen/generator.py` | DatasetGenerator (aliased as SamplingDatasetGenerator) gains jinja_rendering_engine parameter and forwards it to each JinjaDataFrame instantiation. |
| `packages/data-designer-engine/src/data_designer/engine/sampling_gen/jinja_utils.py` | JinjaDataFrame stores jinja_rendering_engine; extract_column_names_from_expression switches from UserTemplateSandboxEnvironment to NativeJinjaSandboxEnvironment for AST-only parsing (no behavioral difference). |
| `packages/data-designer/src/data_designer/interface/data_designer.py` | Adds _log_jinja_rendering_engine_mode called at the start of create() and preview(); uses LOG_INDENT for consistent formatting. |
| `packages/data-designer-config/src/data_designer/config/__init__.py` | JinjaRenderingEngine correctly added to both the TYPE_CHECKING import block and the lazy imports dict. |
| `packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py` | New tests verify jsonpath in NativeJinjaSandboxEnvironment, upper filter in secure env, and the mixin's secure-by-default behaviour without explicit engine wiring. |
| `packages/data-designer-engine/tests/engine/sampling_gen/test_jinja_utils.py` | Tests confirm JinjaDataFrame can switch rendering engines and defaults to secure; adds jsonpath expression to extract_column_names parametrize set. |
| `docs/concepts/security.md` | New security concept page with compatibility matrix, CVE references, and clear guidance on SECURE vs NATIVE; content is accurate against implementation. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant DD as DataDesigner
    participant RC as RunConfig
    participant CG as ColumnGenerator<br/>(LLM/Sampler)
    participant Mix as WithJinja2UserTemplateRendering
    participant Env as Environment Factory

    User->>DD: create(config, run_config)
    DD->>RC: jinja_rendering_engine (SECURE or NATIVE)
    DD->>DD: _log_jinja_rendering_engine_mode()
    DD->>CG: generate(data)

    alt LLM column
        CG->>Mix: RecordBasedPromptRenderer(jinja_rendering_engine)
        Mix->>Mix: _get_jinja_rendering_engine()
    else Sampler column
        CG->>Mix: JinjaDataFrame(expr, jinja_rendering_engine)
        Mix->>Mix: _get_jinja_rendering_engine()
    end

    Mix->>Env: _create_render_environment(dataset_variables)

    alt engine == SECURE
        Env-->>Mix: UserTemplateSandboxEnvironment
    else engine == NATIVE
        Env-->>Mix: NativeJinjaSandboxEnvironment
    end

    Mix->>Env: validate_template(template)
    Mix->>Env: render_template(template, record)
    Env-->>Mix: rendered string
    Mix-->>CG: result
    CG-->>DD: dataset
    DD-->>User: DatasetCreationResults
```

Reviews (2): Last reviewed commit: "refactor: tighten jinja environment erro..."

andreatgretel

🚢

eric-tramel added 3 commits April 17, 2026 10:34

fix: default jinja rendering to secure

7ac7436

Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>

andreatgretel reviewed Apr 17, 2026

Comment thread packages/data-designer-engine/src/data_designer/engine/sampling_gen/jinja_utils.py Outdated

andreatgretel reviewed Apr 17, 2026

Comment thread packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py

andreatgretel reviewed Apr 17, 2026

Comment thread packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py

eric-tramel added 6 commits April 17, 2026 11:52

docs: add security concept guide

64ed498

Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>

fix: expose jsonpath in native jinja

7dae3dd

Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>

docs: refine jinja security guide

72477db

fix: allow upper in secure jinja

fa11435

feat: log jinja rendering mode at startup

ed7a89a

chore: use house icon for native jinja logs

f6ad4fe

eric-tramel added 3 commits April 17, 2026 13:39

fix: parse jsonpath in expression dependencies

a630d1b

test: stabilize jinja mode log assertions

c4bdcd6

Merge branch 'main' into codex/run-config-jinja-renderer

55c6b1a

eric-tramel marked this pull request as ready for review April 17, 2026 17:51

eric-tramel requested a review from a team as a code owner April 17, 2026 17:51

eric-tramel temporarily deployed to agentic-ci April 17, 2026 17:52 — with GitHub Actions Inactive

refactor: tighten jinja environment error handling

0e159d4

eric-tramel requested a review from andreatgretel April 17, 2026 18:02

andreatgretel approved these changes Apr 17, 2026

eric-tramel merged commit 8be4ff7 into main Apr 17, 2026
50 checks passed

andreatgretel mentioned this pull request Apr 21, 2026

ci: add graphify structural impact analysis to PR review and structure audit #567

Open

9 tasks

eric-tramel merged 14 commits into main from codex/run-config-jinja-renderer
