feat: add RunConfig jinja rendering engine#557
Conversation
- add a RunConfig enum/field that selects native Jinja by default while preserving ginja as an opt-in hardened mode - route shared prompt and sampler rendering through the selected engine instead of hardcoding ginja behavior - cover the new selection path with config, engine, and docs updates Refs #87 Refs #550 Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
|
Docs preview: https://5dad36cc.dd-docs-preview.pages.dev
|
- opt the ginja mixin regression tests into the hardened renderer explicitly now that RunConfig defaults engine rendering to native - update the image empty-prompt assertion to expect the native-mode ValueError surfaced by the generator Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
- rename the public RunConfig enum option from ginja to secure so the interface reads as native versus secure - update docs and tests to use the new public enum member and keep the hardened renderer wired through the same engine seam Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
|
nice design - the centralized seam in |
Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>
|
Ran a few representative What I ran:
Bottom line:
These were actual |
Code Review: PR #557 — feat: add RunConfig jinja rendering engineSummaryThis PR adds a Scope: 20 files changed, +575 / -20 lines. Mix of production code (config enum, environment routing, plumbing), tests, and documentation. CI Status: All checks passing (one engine test pending at review time). FindingsSeverity: Low1. def _get_jinja_rendering_engine(self) -> JinjaRenderingEngine:
if hasattr(self, "_jinja_rendering_engine"):
return JinjaRenderingEngine(getattr(self, "_jinja_rendering_engine"))
if hasattr(self, "_resource_provider"):
return JinjaRenderingEngine(self._resource_provider.run_config.jinja_rendering_engine)
return JinjaRenderingEngine.SECUREThis works but is fragile — the mixin silently falls back to 2. def validate_template(self, user_template: str) -> None:
try:
...
if len(unallowed_vars) > 0:
raise UserTemplateError(...)
except Exception as exception:
maybe_handle_missing_filter_exception(exception, ...)
raise exceptionThe Also, 3. class NativeJinjaSandboxEnvironment(ImmutableSandboxedEnvironment):
allowed_references: list[str]
_prefer_dict_key_access: boolThese class-level annotations shadow instance attributes set in 4. def extract_column_names_from_expression(expr: str) -> set[str]:
return meta.find_undeclared_variables(NativeJinjaSandboxEnvironment().parse("{{ " + expr + " }}"))This changed from Severity: Info / Positive Observations5. Clean polymorphic adapter pattern Adding 6. Adding 7. Good catch — without this, switching from 8. Thorough security documentation (docs/concepts/security.md) The new Security page is well-structured with a compatibility matrix, CVE references, and clear guidance on when to use each mode. The trust-model framing (trusted/monolithic vs. untrusted/shared) is helpful for users making deployment decisions. 9. Good test coverage Tests cover both modes at every layer: config defaults, environment rendering, prompt renderer, JinjaDataFrame, expression generators, and interface logging. The tests correctly verify that SECURE rejects filters that NATIVE allows (e.g., 10. Startup logging The Nits
VerdictApprove — This is a well-structured feature addition that follows the codebase's layered architecture (config -> engine -> interface). The default is secure, the opt-in is explicit, documentation is thorough, test coverage is good, and CI is green. The findings above are all low-severity suggestions for minor code quality improvements, none of which block merge. |
Greptile SummaryThis PR introduces
|
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py | Core rendering seam: adds NativeJinjaSandboxEnvironment, _create_render_environment factory, and _get_jinja_rendering_engine fallback chain; render_template delegation is clean and consistent. |
| packages/data-designer-config/src/data_designer/config/run_config.py | Adds JinjaRenderingEngine StrEnum and jinja_rendering_engine field to RunConfig with SECURE default; public API change is backward-compatible. |
| packages/data-designer-engine/src/data_designer/engine/column_generators/utils/prompt_renderer.py | RecordBasedPromptRenderer now accepts and stores jinja_rendering_engine, which is picked up by the mixin's _get_jinja_rendering_engine via getattr; default is SECURE. |
| packages/data-designer-engine/src/data_designer/engine/sampling_gen/generator.py | DatasetGenerator (aliased as SamplingDatasetGenerator) gains jinja_rendering_engine parameter and forwards it to each JinjaDataFrame instantiation. |
| packages/data-designer-engine/src/data_designer/engine/sampling_gen/jinja_utils.py | JinjaDataFrame stores jinja_rendering_engine; extract_column_names_from_expression switches from UserTemplateSandboxEnvironment to NativeJinjaSandboxEnvironment for AST-only parsing (no behavioral difference). |
| packages/data-designer/src/data_designer/interface/data_designer.py | Adds _log_jinja_rendering_engine_mode called at the start of create() and preview(); uses LOG_INDENT for consistent formatting. |
| packages/data-designer-config/src/data_designer/config/init.py | JinjaRenderingEngine correctly added to both the TYPE_CHECKING import block and the lazy imports dict. |
| packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py | New tests verify jsonpath in NativeJinjaSandboxEnvironment, upper filter in secure env, and the mixin's secure-by-default behaviour without explicit engine wiring. |
| packages/data-designer-engine/tests/engine/sampling_gen/test_jinja_utils.py | Tests confirm JinjaDataFrame can switch rendering engines and defaults to secure; adds jsonpath expression to extract_column_names parametrize set. |
| docs/concepts/security.md | New security concept page with compatibility matrix, CVE references, and clear guidance on SECURE vs NATIVE; content is accurate against implementation. |
Sequence Diagram
sequenceDiagram
participant User
participant DD as DataDesigner
participant RC as RunConfig
participant CG as ColumnGenerator<br/>(LLM/Sampler)
participant Mix as WithJinja2UserTemplateRendering
participant Env as Environment Factory
User->>DD: create(config, run_config)
DD->>RC: jinja_rendering_engine (SECURE or NATIVE)
DD->>DD: _log_jinja_rendering_engine_mode()
DD->>CG: generate(data)
alt LLM column
CG->>Mix: RecordBasedPromptRenderer(jinja_rendering_engine)
Mix->>Mix: _get_jinja_rendering_engine()
else Sampler column
CG->>Mix: JinjaDataFrame(expr, jinja_rendering_engine)
Mix->>Mix: _get_jinja_rendering_engine()
end
Mix->>Env: _create_render_environment(dataset_variables)
alt engine == SECURE
Env-->>Mix: UserTemplateSandboxEnvironment
else engine == NATIVE
Env-->>Mix: NativeJinjaSandboxEnvironment
end
Mix->>Env: validate_template(template)
Mix->>Env: render_template(template, record)
Env-->>Mix: rendered string
Mix-->>CG: result
CG-->>DD: dataset
DD-->>User: DatasetCreationResults
Reviews (2): Last reviewed commit: "refactor: tighten jinja environment erro..." | Re-trigger Greptile
Summary
This PR adds a
RunConfig-level selector for engine-side Jinja rendering so users can choose between Data Designer's hardened renderer and the broader native Jinja2 sandbox. The public interface isJinjaRenderingEngine.SECUREversusJinjaRenderingEngine.NATIVE, withSECUREas the default so existing deployments do not silently lose template hardening. It also adds user-facing startup logs socreateandpreviewshow which Jinja mode is active.Changes
Added
JinjaRenderingEnginetoRunConfigand export it fromdata_designer.configSecurityconcept page that explains trusted versus untrusted deployment models,SECUREversusNATIVE, and the extra hardening provided by the secure rendererpackages/data-designer/src/data_designer/interface/data_designer.pythat show the active Jinja mode with🔒/🏠Changed
RunConfigtoJinjaRenderingEngine.SECURE, withNATIVEavailable as an explicit opt-inpackages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.pythrough either the native Jinja sandbox or the existing hardened rendererRunConfigSECUREupperin the secure filter allowlist while keeping the explicit narrow filter policy inenvironment.pydocs/code_reference/run_config.mdanddocs/concepts/deployment-options.mdto point users to the new security guidancedocs/concepts/security.mdwith a compatibility matrix, implementation-backed rationale, filter guidance, and release-pinned source linksFixed
jsonpathin the native sandbox so switching fromSECUREtoNATIVEdoes not lose the Data Designer filterUsage
SECURE(default)NATIVE(explicit opt-in)Startup Logs
Attention Areas
packages/data-designer-config/src/data_designer/config/run_config.py- this is the public API and default-behavior change that affects existing deploymentspackages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py- this remains the central renderer-selection seam and the secure filter policy implementationpackages/data-designer/src/data_designer/interface/data_designer.py- this is the user-facing logging change forcreateandpreviewdocs/concepts/security.md- this is the main user-facing explanation of whySECUREexists and whenNATIVEis appropriateRelated Issues
Testing
uv run --all-packages ruff check packages/data-designer-config/src/data_designer/config/run_config.py packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py packages/data-designer-engine/src/data_designer/engine/column_generators/utils/prompt_renderer.py packages/data-designer-engine/src/data_designer/engine/sampling_gen/jinja_utils.py packages/data-designer-engine/src/data_designer/engine/sampling_gen/generator.py packages/data-designer-config/tests/config/test_run_config.py packages/data-designer-engine/tests/engine/column_generators/utils/test_prompt_renderer.py packages/data-designer-engine/tests/engine/sampling_gen/test_jinja_utils.py packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py packages/data-designer-engine/tests/engine/column_generators/generators/test_image.py docs/code_reference/run_config.md docs/concepts/security.mduv run --all-packages pytest packages/data-designer-config/tests/config/test_run_config.pyuv run --all-packages pytest packages/data-designer-engine/testsuv run --all-packages pytest packages/data-designer/tests/interface/test_data_designer.py -k "logs_secure_jinja_rendering_mode or logs_native_jinja_rendering_mode"uv run --all-packages pytest packages/data-designer/tests/interface/test_data_designer.pyThis file currently has an unrelated default-provider mismatch in the existing test fixture setup (
brr-localvsstub-model-provider).uv run --group docs mkdocs build --strictCurrent repo-wide documentation warnings still cause strict mode to abort; the new
Securitypage did not introduce additional strict-mode failures.Checklist
Description updated with AI