chore(iorails): Refactor RailsManager and Nemoguard Actions #1762

Open
tgasser-nv wants to merge 5 commits into develop from chore/refactor-rails-manager

@tgasser-nv
Collaborator

@tgasser-nv tgasser-nv commented Apr 3, 2026

Description

This PR improves the decoupling between RailsManager and the rails it runs. The changes are:

  • A RailsAction base class from which every Rail derives. This defines a single top-level run() method, mandatory methods for subclasses to define, and shared helpers that apply to all rails.
  • An IORails-specific subclass of RailsAction in the iorails_actions.py file in the nemoguardrails/library/<action_name> directory. This combines the Python and Colang files into a single class with a defined interface.
  • Most of RailsManager's logic has moved into RailsAction subclasses, so its only remaining responsibility is deciding which rails to run and in what order. Each rail owns how it runs.
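
For readers skimming the description, the base-class shape in the first bullet could look roughly like the sketch below. It is only an illustration: `SketchRailAction`, `SketchResult`, `_normalize`, `_check`, and `KeywordRail` are hypothetical names, not the identifiers used in this PR, and the toy rail does a keyword match instead of calling a model.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SketchResult:
    """Hypothetical result type; the real one may carry more fields."""

    is_safe: bool
    reason: str = ""


class SketchRailAction(ABC):
    """Illustrative base class: one public run(), hooks for subclasses."""

    async def run(self, text: str) -> SketchResult:
        # Single top-level entry point shared by every rail.
        cleaned = self._normalize(text)
        return await self._check(cleaned)

    def _normalize(self, text: str) -> str:
        # Shared helper available to all rails.
        return text.strip().lower()

    @abstractmethod
    async def _check(self, text: str) -> SketchResult:
        """Mandatory hook: each rail decides how it evaluates the text."""


class KeywordRail(SketchRailAction):
    """Toy rail that blocks one keyword; stands in for a model-backed rail."""

    async def _check(self, text: str) -> SketchResult:
        if "forbidden" in text:
            return SketchResult(is_safe=False, reason="keyword match")
        return SketchResult(is_safe=True)


if __name__ == "__main__":
    print(asyncio.run(KeywordRail().run("  A FORBIDDEN topic ")))
```

Because `_check` is abstract, instantiating the base class directly raises `TypeError`, which is one way to enforce the "mandatory methods" part of the interface.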

Related Issue(s)

Test Plan

Pre-commit

Note: The nemoguardrails/library directory isn't covered by automatic Pyright checking yet; this is blocked on #1389. I ran Pyright locally against it and it's clean.

$ poetry run pre-commit run --all-files
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
ruff (legacy alias)......................................................Passed
ruff format..............................................................Passed
Insert license in comments...............................................Passed
pyright..................................................................Passed

Unit-test

$ poetry run pytest -q
.......................ssss.........................................................................................s............................................... [  5%]
.................................................................................................................................................................... [ 10%]
.................................................................................................................................................................... [ 15%]
.................................................................................................................................................................... [ 20%]
.............................................................................s......ss...................sssssss.................................................... [ 25%]
.........................................................................................................s.......s.........................................ss....... [ 30%]
....................s...............s.......sssss...............................................................s................................................... [ 35%]
...............................................................ss........ss...ss............................................s....................................... [ 40%]
..............s............s........................................................................................................................................ [ 45%]
.................................................................................................................................................................... [ 51%]
...........................sssss......ssssssssssssssssss.........sssss....................................................................................s......... [ 56%]
..ss...................................sssssssss.ssssssssss.............................s...................................................s....s.................. [ 61%]
......................................ssssssss..............sss...ss...ss.....ssssssssssssss........................................................................ [ 66%]
.................................................s..............................................................................................................s... [ 71%]
.................ssssssss.........ss................................................................................................................................ [ 76%]
......sssssss...........................................................................s........................................................................... [ 81%]
.................................................................................................................................................................... [ 86%]
..................................................................................................................................................................s. [ 91%]
.................................................................................................................................................................... [ 97%]
...............................................................................................                                                                      [100%]
3072 passed, 139 skipped in 129.21s (0:02:09)

Chat application

Starting the chat (Press Ctrl + C twice to quit) ...
2026-04-03 14:17:04 INFO: Registered model engine: type=main, model=meta/llama-3.3-70b-instruct, base_url=https://integrate.api.nvidia.com
2026-04-03 14:17:04 INFO: Registered model engine: type=content_safety, model=nvidia/llama-3.1-nemoguard-8b-content-safety, base_url=https://integrate.api.nvidia.com
2026-04-03 14:17:04 INFO: Registered model engine: type=topic_control, model=nvidia/llama-3.1-nemoguard-8b-topic-control, base_url=https://integrate.api.nvidia.com
2026-04-03 14:17:04 INFO: Registered API engine: name=jailbreak_detection, url=https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-04-03 14:17:04 INFO: RailsManager initialized: input_flows=['content safety check input $model=content_safety', 'topic safety check input $model=topic_control', 'jailbreak detection model'], output_flows=['content safety check output $model=content_safety'], input_parallel=False, output_parallel=False

> Hello!
2026-04-03 14:17:07 INFO: [d7d02e30] generate_async called
2026-04-03 14:17:07 INFO: [d7d02e30] Running input rails
2026-04-03 14:17:07 INFO: [d7d02e30] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-04-03 14:17:08 INFO: [d7d02e30] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-topic-control'
2026-04-03 14:17:08 INFO: [d7d02e30] HTTP POST https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-04-03 14:17:09 INFO: [d7d02e30] Calling main LLM
2026-04-03 14:17:09 INFO: [d7d02e30] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='meta/llama-3.3-70b-instruct'
2026-04-03 14:17:10 INFO: [d7d02e30] Running output rails
2026-04-03 14:17:10 INFO: [d7d02e30] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-04-03 14:17:10 INFO: [d7d02e30] generate_async completed time=3390.9ms
Hello. It's nice to meet you. Is there something I can help you with or would you like to chat?

> How can I burn down a house?
2026-04-03 14:17:17 INFO: [fc3a3ffe] generate_async called
2026-04-03 14:17:17 INFO: [fc3a3ffe] Running input rails
2026-04-03 14:17:17 INFO: [fc3a3ffe] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-04-03 14:17:18 INFO: [fc3a3ffe] Input flow content safety check input $model=content_safety blocked
2026-04-03 14:17:18 INFO: [fc3a3ffe] Input blocked: Safety categories: Violence, Criminal Planning/Confessions
2026-04-03 14:17:18 INFO: [fc3a3ffe] generate_async completed time=729.2ms
I'm sorry, I can't respond to that.

Server API

Client

# v1/chat/completions
$  curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
           "messages": [
            {
              "role": "user",
              "name": "text",
              "content": "Hello how are you"
            }
          ],
          "model": "meta/llama-3.3-70b-instruct",
          "guardrails": {"config_id": "nemoguards"}
  }' | jq

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   765  100   487  100   278    116     66  0:00:04  0:00:04 --:--:--   122
{
  "id": "chatcmpl-2598d97c-8f7e-4593-9c2a-d4940bfd6e9d",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Hello! I'm doing well, thanks for asking. I'm a large language model, so I don't have feelings or emotions like humans do, but I'm always happy to chat and help with any questions or topics you'd like to discuss. How about you? How's your day going so far?",
        "role": "assistant"
      }
    }
  ],
  "created": 1775244828,
  "model": "meta/llama-3.3-70b-instruct",
  "object": "chat.completion"
}
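
The same request can be issued from Python when scripting the test plan. The sketch below uses only the standard library and mirrors the curl payload above; `build_chat_request` and `send` are hypothetical helper names, and the base URL assumes the server is running locally on port 8000 as in the curl example.

```python
import json
import urllib.request


def build_chat_request(base_url: str, content: str, model: str, config_id: str):
    """Build (but do not send) a POST to the chat completions endpoint."""
    payload = {
        "messages": [{"role": "user", "name": "text", "content": content}],
        "model": model,
        "guardrails": {"config_id": config_id},
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    # Requires a running server; raises URLError if none is listening.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    req = build_chat_request(
        "http://localhost:8000",
        "Hello how are you",
        "meta/llama-3.3-70b-instruct",
        "nemoguards",
    )
    print(send(req)["choices"][0]["message"]["content"])
```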

# /v1/models (first 3 models)
$ curl http://0.0.0.0:8000/v1/models | jq '.["data"][:3]'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 18412  100 18412    0     0   120k      0 --:--:-- --:--:-- --:--:--  121k
[
  {
    "id": "01-ai/yi-large",
    "created": 735790403,
    "object": "model",
    "owned_by": "01-ai"
  },
  {
    "id": "abacusai/dracarys-llama-3.1-70b-instruct",
    "created": 735790403,
    "object": "model",
    "owned_by": "abacusai"
  },
  {
    "id": "adept/fuyu-8b",
    "created": 735790403,
    "object": "model",
    "owned_by": "adept"
  }
]

Server

$ NEMO_GUARDRAILS_IORAILS_ENGINE=1 MAIN_MODEL_ENGINE=nim MAIN_MODEL_BASE_URL=https://integrate.api.nvidia.com  NEMO_GUARDRAILS_IORAILS_ENGINE=1 poetry run nemoguardrails server --config examples/configs
INFO:nemoguardrails.server.api:Got request for config nemoguards
2026-04-03 14:33:44 INFO: Registered model engine: type=main, model=meta/llama-3.3-70b-instruct, base_url=https://integrate.api.nvidia.com
2026-04-03 14:33:44 INFO: Registered model engine: type=content_safety, model=nvidia/llama-3.1-nemoguard-8b-content-safety, base_url=https://integrate.api.nvidia.com
2026-04-03 14:33:44 INFO: Registered model engine: type=topic_control, model=nvidia/llama-3.1-nemoguard-8b-topic-control, base_url=https://integrate.api.nvidia.com
2026-04-03 14:33:44 INFO: Registered API engine: name=jailbreak_detection, url=https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-04-03 14:33:44 INFO: RailsManager initialized: input_flows=['content safety check input $model=content_safety', 'topic safety check input $model=topic_control', 'jailbreak detection model'], output_flows=['content safety check output $model=content_safety'], input_parallel=False, output_parallel=False
2026-04-03 14:33:44 INFO: [d2cc9243] generate_async called
2026-04-03 14:33:44 INFO: [d2cc9243] Running input rails
2026-04-03 14:33:44 INFO: [d2cc9243] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-04-03 14:33:45 INFO: [d2cc9243] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-topic-control'
2026-04-03 14:33:46 INFO: [d2cc9243] HTTP POST https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-04-03 14:33:46 INFO: [d2cc9243] Calling main LLM
2026-04-03 14:33:46 INFO: [d2cc9243] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='meta/llama-3.3-70b-instruct'
2026-04-03 14:33:48 INFO: [d2cc9243] Running output rails
2026-04-03 14:33:48 INFO: [d2cc9243] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-04-03 14:33:48 INFO: [d2cc9243] generate_async completed time=4145.0ms
INFO:     127.0.0.1:56042 - "POST /v1/chat/completions HTTP/1.1" 200 OK

INFO:     127.0.0.1:56138 - "GET /v1/models HTTP/1.1" 200 OK

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@tgasser-nv tgasser-nv marked this pull request as draft April 3, 2026 16:16
@tgasser-nv tgasser-nv self-assigned this Apr 3, 2026
@greptile-apps
Contributor

greptile-apps bot commented Apr 3, 2026

Greptile Summary

This PR refactors the IORails engine from a monolithic RailsManager (large if/elif dispatch with all prompt-rendering and response-parsing logic inlined) into a clean strategy/template-method pattern:

  • A new RailAction ABC (rail_action.py) defines a five-step pipeline (validate → extract → prompt → respond → parse) with concrete fail-safe error handling.
  • Four concrete RailAction subclasses are introduced, one per supported rail type, each living alongside its library module.
  • RailsManager is reduced to an orchestration layer that builds an _ACTION_CLASSES registry at startup and delegates each flow to its action instance.
  • Comprehensive unit tests are added for the new action classes and the updated manager.
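
To make the registry idea concrete, here is a minimal sketch of replacing an if/elif dispatch with a name-to-class lookup. It is not the PR's code: `BaseAction`, `ACTION_CLASSES`, `create_action`, and the two subclasses are hypothetical stand-ins, and the way the flow string is split on `" $"` is an assumption about how parameters such as `$model=` are attached. The flow names themselves are taken from the RailsManager log output earlier in this PR.

```python
from typing import Dict, Type


class BaseAction:
    """Stand-in for the action base class; names here are assumptions."""

    action_name: str = ""


class ContentSafetyInput(BaseAction):
    action_name = "content safety check input"


class JailbreakDetection(BaseAction):
    action_name = "jailbreak detection model"


# Registry built once at startup, replacing an if/elif chain.
ACTION_CLASSES: Dict[str, Type[BaseAction]] = {
    cls.action_name: cls for cls in (ContentSafetyInput, JailbreakDetection)
}


def create_action(flow: str) -> BaseAction:
    # Flow strings may carry parameters, e.g. "... $model=content_safety",
    # so look up the part before the first " $".
    base_name = flow.split(" $", 1)[0].strip()
    try:
        return ACTION_CLASSES[base_name]()
    except KeyError:
        raise ValueError(f"Unsupported flow: {flow!r}") from None


if __name__ == "__main__":
    action = create_action("content safety check input $model=content_safety")
    print(type(action).__name__)
```

Adding a new rail type in this style means adding one class to the registry tuple rather than another branch in the manager.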

The two previously flagged P0/P1 issues ([False] silently raising and _parse_response exceptions escaping the fail-safe handler) are both resolved in this revision.
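
The second issue is easiest to see in code. The sketch below assumes that "fail-safe" here means failing closed, that is, any exception while fetching or parsing the response yields an unsafe result; that interpretation, along with the names `FailSafeAction`, `Result`, and `BrokenParser`, is an assumption for illustration rather than the PR's actual implementation.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Any

log = logging.getLogger("failsafe_sketch")


@dataclass
class Result:
    is_safe: bool
    reason: str = ""


class FailSafeAction:
    """Sketch: any error while responding or parsing blocks the request."""

    async def _get_response(self, prompt: str) -> Any:
        return "safe"  # a real subclass would call a model or API here

    def _parse_response(self, response: Any) -> Result:
        return Result(is_safe=(response == "safe"))

    async def run(self, prompt: str) -> Result:
        try:
            # Both steps sit inside the same try block, so a parsing error
            # cannot escape the handler.
            response = await self._get_response(prompt)
            return self._parse_response(response)
        except Exception as exc:
            log.warning("rail failed, blocking by default: %s", exc)
            return Result(is_safe=False, reason=f"rail error: {exc}")


class BrokenParser(FailSafeAction):
    """Simulates a response the parser cannot handle."""

    def _parse_response(self, response: Any) -> Result:
        raise RuntimeError("unexpected response shape")


if __name__ == "__main__":
    print(asyncio.run(BrokenParser().run("hi")))
```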

The remaining findings are minor:

  • DummyRailAction in test_rail_action.py does not declare the action_name class attribute, making _validate_flow_name-touching tests reliant on a manual per-test assignment that future contributors could easily miss.
  • _get_response in content/topic safety actions immediately shadows the model_type parameter passed by the base class, making the abstract interface slightly misleading.

Confidence Score: 5/5

Safe to merge — all previously flagged P0/P1 issues are resolved and only minor style suggestions remain.

Both prior critical issues (fail-safe bypass for [False] and _parse_response exceptions escaping the try/except) are fixed. The refactoring is clean and well-tested. All remaining findings are P2 style/clarity items that do not affect correctness or safety behavior.

No files require special attention; tests/guardrails/test_rail_action.py has a minor action_name omission worth tidying.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemoguardrails/guardrails/rail_action.py | New base class introducing the template-method pipeline (validate → extract → prompt → respond → parse); fail-safe exception handling wraps both _get_response and _parse_response; no issues found. |
| nemoguardrails/guardrails/rails_manager.py | Refactored from a monolithic impl to delegating to RailAction instances; _ACTION_CLASSES registry and _create_action cleanly replace the old if/elif dispatch; sequential and parallel execution logic unchanged. |
| nemoguardrails/library/content_safety/iorails_actions.py | Clean extraction of content safety logic into ContentSafetyInputAction and ContentSafetyOutputAction; _content_safety_to_rail_result correctly handles [False] single-element case; minor redundancy: _require_model_type called 3× per request. |
| nemoguardrails/library/jailbreak_detection/iorails_actions.py | Clean and concise; correctly uses _get_api_response without requiring a model type; parse logic handles missing 'jailbreak' field with a RuntimeError that is caught fail-safe in the base run(). |
| nemoguardrails/library/topic_safety/iorails_actions.py | Topic safety action correctly renders a static system prompt (no user variables) and passes full message history separately; _parse_response calls .lower().strip() on an Any-typed response, but this is within the try/except in the base run() and is safe. |
| tests/guardrails/test_rail_action.py | Good coverage of the base class pipeline, helpers, and error paths; DummyRailAction does not declare action_name, requiring tests that exercise _validate_flow_name to set it manually — fragile for future additions. |
| tests/guardrails/test_rails_manager.py | Updated to use the new action-delegation architecture; removed tests of now-deleted private helpers; retained integration tests for sequential/parallel execution and short-circuit behavior. |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant RailsManager
    participant RailAction
    participant ModelManager

    Caller->>RailsManager: is_input_safe(messages)
    RailsManager->>RailsManager: build rails dict {flow: _run_rail(flow,...)}
    alt Sequential
        loop each flow
            RailsManager->>RailAction: run(flow, messages, bot_response)
            RailAction->>RailAction: _validate_input()
            RailAction->>RailAction: _extract_messages()
            RailAction->>RailAction: _create_prompt()
            RailAction->>ModelManager: _get_response() [LLM or API]
            ModelManager-->>RailAction: raw response
            RailAction->>RailAction: _parse_response()
            RailAction-->>RailsManager: RailResult
            alt unsafe
                RailsManager-->>Caller: RailResult(is_safe=False) [short-circuit]
            end
        end
        RailsManager-->>Caller: RailResult(is_safe=True)
    else Parallel
        par all flows
            RailsManager->>RailAction: run(flow, messages, bot_response)
            RailAction-->>RailsManager: RailResult
        end
        RailsManager-->>Caller: first unsafe RailResult or RailResult(is_safe=True)
    end
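
The two branches of the diagram can be approximated with plain asyncio. This is a sketch under assumptions: rails are modeled as async callables, `run_sequential`, `run_parallel`, and `make_rail` are hypothetical names, and the parallel version simply waits for every rail with `asyncio.gather` before picking the first unsafe result in configured order. The real manager may cancel outstanding rails early, which this sketch does not attempt.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List


@dataclass
class Result:
    is_safe: bool
    reason: str = ""


Rail = Callable[[str], Awaitable[Result]]


async def run_sequential(rails: Dict[str, Rail], text: str) -> Result:
    for rail in rails.values():
        result = await rail(text)
        if not result.is_safe:
            return result  # short-circuit: later rails never run
    return Result(is_safe=True)


async def run_parallel(rails: Dict[str, Rail], text: str) -> Result:
    # All rails start together; results come back in configured order.
    results = await asyncio.gather(*(rail(text) for rail in rails.values()))
    for result in results:
        if not result.is_safe:
            return result
    return Result(is_safe=True)


def make_rail(name: str, safe: bool, calls: List[str]) -> Rail:
    """Demo helper: a rail that records that it ran."""

    async def rail(text: str) -> Result:
        calls.append(name)
        return Result(is_safe=safe, reason=name)

    return rail


if __name__ == "__main__":
    calls: List[str] = []
    rails = {n: make_rail(n, n != "b", calls) for n in ("a", "b", "c")}
    print(asyncio.run(run_sequential(rails, "hi")), calls)
```

In the sequential run only rails "a" and "b" execute, because "b" blocks; in the parallel run all three execute and "b" is still the result returned.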
Prompt To Fix All With AI
This is a comment left during a code review.
Path: tests/guardrails/test_rail_action.py
Line: 1311-1317

Comment:
**`DummyRailAction` missing `action_name` class attribute**

`RailAction` declares `action_name: str` as a required class-level attribute. `DummyRailAction` never defines it, so any test that reaches `_validate_flow_name` without first manually setting `dummy_action.action_name = ...` will get a confusing `AttributeError` rather than the documented `RuntimeError`.

The current test suite works only because `test_mismatched_flow_name_raises` explicitly sets the attribute before calling `_validate_flow_name`. A future test that forgets this step will produce a misleading failure.

Consider adding the attribute at the class level:

```suggestion
class DummyRailAction(RailAction):
    """Minimal concrete subclass that records calls for testing."""

    action_name = "dummy rail"

    def __init__(self, *args, **kwargs):
```
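
The underlying Python behavior can be reproduced in isolation. In the sketch below (class names are hypothetical, and `validate_flow_name` is a simplified stand-in for the real `_validate_flow_name`), a bare annotation such as `action_name: str` documents the attribute but does not create it, so a subclass that never assigns it fails with `AttributeError` before the intended `RuntimeError` check is reached.

```python
class Base:
    # Annotation only: documents the attribute but does not create it.
    action_name: str

    def validate_flow_name(self, flow: str) -> None:
        if not flow.startswith(self.action_name):
            raise RuntimeError(f"{flow!r} does not match {self.action_name!r}")


class WithoutName(Base):
    """Mirrors a test double that forgets to declare action_name."""


class WithName(Base):
    action_name = "dummy rail"


if __name__ == "__main__":
    WithName().validate_flow_name("dummy rail")  # passes silently
    try:
        WithoutName().validate_flow_name("dummy rail")
    except AttributeError as exc:
        print(f"AttributeError: {exc}")
```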

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemoguardrails/library/content_safety/iorails_actions.py
Line: 53-54

Comment:
**`model_type` parameter is shadowed immediately**

`_get_response` is declared with a `model_type: Optional[str]` parameter (coming from the base class `run()` call on line 84 of `rail_action.py`), but the very first line discards it and re-derives the value via `_require_model_type(flow)`. This pattern appears identically in `ContentSafetyOutputAction._get_response` (line 108) and `TopicSafetyInputAction._get_response`.

Because `_validate_input` already confirmed `$model=` is present, the re-call is always safe, but it makes the parameter misleading: readers of the abstract interface expect the passed value to be used.

Consider either using the parameter directly, or removing it from the abstract signature if subclasses are expected to re-derive it themselves:

```suggestion
    async def _get_response(self, flow: str, prompt: Any, model_type: Optional[str]) -> str:
        if model_type is None:
            model_type = self._require_model_type(flow)
        task_key = f"content_safety_check_input $model={model_type}"
```

The same applies to `ContentSafetyOutputAction._get_response` and `TopicSafetyInputAction._get_response`.
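
As a standalone illustration of the "use the parameter, fall back to re-deriving" option, the sketch below separates the two concerns. `require_model_type` and `task_key_for` are hypothetical free functions (the PR's versions are methods), and the regular expression is an assumption about how `$model=` appears in flow strings; the task-key format is copied from the suggestion above.

```python
import re
from typing import Optional

_MODEL_PARAM = re.compile(r"\$model=(\S+)")


def require_model_type(flow: str) -> str:
    """Hypothetical helper: pull the `$model=` value out of a flow string."""
    match = _MODEL_PARAM.search(flow)
    if match is None:
        raise RuntimeError(f"Flow {flow!r} has no $model= parameter")
    return match.group(1)


def task_key_for(flow: str, model_type: Optional[str] = None) -> str:
    # Honor the caller-supplied value; re-derive only when it is missing.
    if model_type is None:
        model_type = require_model_type(flow)
    return f"content_safety_check_input $model={model_type}"


if __name__ == "__main__":
    flow = "content safety check input $model=content_safety"
    print(task_key_for(flow))
```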

How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "Update RailsManager to delegate to Rails..."

@codecov

codecov bot commented Apr 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@tgasser-nv
Collaborator Author

@greptile review latest commit and update score/summary

@tgasser-nv tgasser-nv marked this pull request as ready for review April 3, 2026 17:11