
fix(llmrails): backfill embedding model params into search provider config (fixes stale KB cache)#1753

Open
Pouyanpi wants to merge 3 commits intodevelopfrom
fix/kb-cache-invalidation-on-embedding-model-change

Conversation

@Pouyanpi
Collaborator

@Pouyanpi Pouyanpi commented Mar 31, 2026

## Description

Resolves #1751

  • Fixes the KB Annoy index cache not being invalidated when the embedding model is changed via the `models: [{type: embeddings, ...}]` config section
  • When a `models[type=embeddings]` entry is found, backfill `embedding_model` and `embedding_engine` into `embedding_search_provider.parameters` for both the `core` and `knowledge_base` configs, so that the cache hash includes the actual model name
  • Previously, switching embedding models (e.g., from FastEmbed to OpenAI) caused an `IndexError: Vector has wrong length` at runtime with no clear error message
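A minimal sketch of the backfill logic described above (the class and field names below are simplified stand-ins for the real objects in `nemoguardrails.rails.llm.config`, not the actual implementation):

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Simplified stand-ins for the real config objects; parameter key names
# follow the PR description.
@dataclass
class EmbeddingSearchProvider:
    name: str = "default"
    parameters: dict = field(default_factory=dict)

@dataclass
class EmbeddingsModel:
    engine: str
    model: Optional[str] = None

def backfill_embedding_params(
    model: EmbeddingsModel, providers: List[EmbeddingSearchProvider]
) -> None:
    """Copy the embeddings model name/engine into each default provider's
    parameters so the KB cache hash varies with the configured model."""
    for esp in providers:
        if esp.name != "default":
            continue  # custom providers are left untouched
        if "embedding_model" not in esp.parameters and model.model is not None:
            esp.parameters["embedding_model"] = model.model
        if "embedding_engine" not in esp.parameters:
            esp.parameters["embedding_engine"] = model.engine

core = EmbeddingSearchProvider()
kb = EmbeddingSearchProvider()
backfill_embedding_params(
    EmbeddingsModel(engine="openai", model="text-embedding-3-small"), [core, kb]
)
print(core.parameters)
# → {'embedding_model': 'text-embedding-3-small', 'embedding_engine': 'openai'}
```

Explicitly provided parameters are never overwritten, since each key is only written when it is absent.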

## Test plan

  • Run with the default FastEmbed model to create the KB cache (e.g., `poetry run nemoguardrails chat --config=./examples/bots/abc`)
  • Switch to OpenAI embeddings by adding an entry to the `models` section:

```yaml
  - type: embeddings
    engine: openai
    model: text-embedding-3-small
```

  • Verify that a new cache file is created (different hash) and no vector-length error occurs
  • Run the existing embedding tests: `pytest tests/ -k embedding`

Alternatively, just run the following script (ensure the `.cache` dir is deleted prior to running it):

```python
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.rails.llm.config import Model

# Run 1: the default FastEmbed model builds the KB cache.
config1 = RailsConfig.from_path("./examples/bots/abc")
rails1 = LLMRails(config1)
r1 = rails1.generate(messages=[{"role": "user", "content": "Hello"}])
print(f"Run 1 (default FastEmbed): {r1['content']}")

# Run 2: switch to OpenAI embeddings; a new cache file should be built.
config2 = RailsConfig.from_path("./examples/bots/abc")
config2.models.append(Model(type="embeddings", engine="openai", model="text-embedding-3-small"))
rails2 = LLMRails(config2)
r2 = rails2.generate(messages=[{"role": "user", "content": "Hello"}])
print(f"Run 2 (OpenAI embeddings): {r2['content']}")
```

Expected Output:

After the Fix:

```
Run 1 (default FastEmbed): Hello! How can I assist you today?
Run 2 (OpenAI embeddings): Hello! How can I assist you today?
```

Prior to the Fix:

```
Run 1 (default FastEmbed): Hello! How can I help you today?
WARNING:nemoguardrails.actions.action_dispatcher:Error while execution 'retrieve_relevant_chunks' with parameters '{'context': {'last_user_message': 'Hello', 'last_bot_message': None, 'user_message': 'Hello', 'input_flows': ['self check input'], 'i': 1, 'triggered_input_rail': None, 'allowed': True, 'event': {'type': 'StartInternalSystemAction', 'uid': '341a946f-75e5-49b0-a44a-a811f47851f0', 'event_created_at': '2026-03-31T13:16:09.424440+00:00', 'source_uid': 'NeMoGuardrails', 'action_name': 'retrieve_relevant_chunks', 'action_params': {}, 'action_result_key': None, 'action_uid': 'ed2433e0-7c6c-43a4-88bd-ede638ee8e0c', 'is_system_action': True}}, 'kb': <nemoguardrails.kb.kb.KnowledgeBase object at 0x10f30ad50>}': Vector has wrong length (expected 384, got 1536)
ERROR:nemoguardrails.actions.action_dispatcher:Vector has wrong length (expected 384, got 1536)
Traceback (most recent call last):
  File "develop/nemoguardrails/actions/action_dispatcher.py", line 217, in execute_action
    result = await result
             ^^^^^^^^^^^^
  File "develop/nemoguardrails/actions/retrieve_relevant_chunks.py", line 65, in retrieve_relevant_chunks
    chunks = [chunk["body"] for chunk in await kb.search_relevant_chunks(user_message)]
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "develop/nemoguardrails/kb/kb.py", line 179, in search_relevant_chunks
    results = await self.index.search(text, max_results=max_results)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "nemoguardrails/embeddings/basic.py", line 282, in search
    results = self._index.get_nns_by_vector(
        _embedding,
        max_results,
        include_distances=True,
    )
IndexError: Vector has wrong length (expected 384, got 1536)
Run 2 (OpenAI embeddings): I'm sorry, an internal error has occurred.
```
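The 384-vs-1536 mismatch arises because the stale index was built for one vector length and queried with another; a toy illustration of the failing length check (not the real Annoy API, just the idea):

```python
# A stale index built for 384-dim vectors (the FastEmbed run) cannot be
# queried with 1536-dim vectors (text-embedding-3-small). ToyIndex is a
# hypothetical stand-in for annoy.AnnoyIndex.
class ToyIndex:
    def __init__(self, dim: int):
        self.dim = dim

    def get_nns_by_vector(self, vector, n):
        # Reject query vectors whose length differs from the index dimension.
        if len(vector) != self.dim:
            raise IndexError(
                f"Vector has wrong length (expected {self.dim}, got {len(vector)})"
            )
        return list(range(n))

index = ToyIndex(dim=384)   # cached index from the FastEmbed run
query = [0.0] * 1536        # embedding produced by the OpenAI model
try:
    index.get_nns_by_vector(query, 3)
except IndexError as e:
    print(e)  # → Vector has wrong length (expected 384, got 1536)
```

Before the fix, nothing forced a rebuild of the cached index, so this error surfaced only at query time.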

Summary by CodeRabbit

Release Notes

  • New Features

    • Embedding search provider configuration now automatically injects embedding model and engine parameters during initialization for default providers when values are not explicitly provided, while preserving any user-specified settings.
  • Tests

    • Added comprehensive test coverage validating embedding search provider parameter configuration, including backfilling behavior, explicit value preservation, and custom provider handling.

@greptile-apps
Contributor

greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR fixes a cache-invalidation bug in the KB annoy index where switching the embedding model via the models[type=embeddings] config section had no effect on the computed cache hash, causing a silent IndexError: Vector has wrong length at runtime.

The fix backfills embedding_model and embedding_engine into embedding_search_provider.parameters for both core and knowledge_base configs immediately after the embedding model loop in LLMRails.__init__. This ensures the hash computed in kb.py (which reads from embedding_search_provider.parameters) correctly varies with the configured model.
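The hash dependence can be shown with a toy sketch (illustrative only; the real computation lives in `kb.py`, and only the parameter key names are taken from this summary):

```python
import hashlib

def kb_cache_key(esp_parameters: dict, content: str) -> str:
    # Toy version of the idea: the cache key is prefixed with the embedding
    # engine/model read from embedding_search_provider.parameters, so a
    # different model yields a different key.
    hash_prefix = esp_parameters.get("embedding_engine", "") + esp_parameters.get(
        "embedding_model", ""
    )
    return hashlib.md5((hash_prefix + content).encode("utf-8")).hexdigest()

# Before the fix the parameters stayed empty, so every model hashed identically:
before = kb_cache_key({}, "kb docs")
# After the backfill, the model name changes the key and the index is rebuilt:
after = kb_cache_key(
    {"embedding_engine": "openai", "embedding_model": "text-embedding-3-small"},
    "kb docs",
)
print(before != after)  # → True
```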

Key implementation details:

  • Only backfills when esp.name == "default", correctly leaving custom providers untouched
  • Guards on model.model is not None (since model is Optional[str]) before writing — addressing the concern raised in the prior thread
  • The model.engine is not None guard is technically redundant because engine: str is a required, non-optional field and can never be None, but it is harmless defensive code
  • Five new unit tests cover all relevant scenarios: fresh backfill, no-overwrite, partial fill, custom provider skip, and no-embeddings-model baseline

Confidence Score: 5/5

Safe to merge — the fix is minimal, well-guarded, and fully covered by new tests; no P0/P1 issues found.

The prior thread concern about None backfill has been addressed with explicit guards. The engine guard is redundant but harmless. All five test scenarios cover expected behavior. No logic or data-integrity regressions identified.

No files require special attention.

Important Files Changed

Filename Overview
nemoguardrails/rails/llm/llmrails.py Adds backfill logic for embedding_model/embedding_engine into embedding_search_provider.parameters for both core and knowledge_base configs when a models[type=embeddings] entry is found; correctly guards against None model.model and only targets the default provider.
tests/test_llmrails.py Adds five targeted test cases covering: backfill of both providers, no overwrite of explicit params, partial backfill, no backfill for custom providers, and no backfill when embeddings model is absent; coverage is comprehensive.

Sequence Diagram

sequenceDiagram
    participant User as User Config (YAML)
    participant LLMRails as LLMRails.__init__
    participant ESP_Core as core.embedding_search_provider
    participant ESP_KB as knowledge_base.embedding_search_provider
    participant KB as kb.py build()

    User->>LLMRails: models[type=embeddings] entry
    LLMRails->>LLMRails: set default_embedding_model/engine
    LLMRails->>ESP_Core: if name=="default" and key missing → backfill embedding_model/engine
    LLMRails->>ESP_KB: if name=="default" and key missing → backfill embedding_model/engine
    KB->>ESP_KB: parameters.get("embedding_engine","") + parameters.get("embedding_model","")
    KB->>KB: hash_prefix + content → cache hash
    KB->>KB: load or build annoy index with correct model

Reviews (3): Last reviewed commit: "assert KB provider gets backfill when co..."

@coderabbitai

coderabbitai bot commented Mar 31, 2026

📝 Walkthrough

Walkthrough

The code now populates embedding search provider parameters with embedding model and engine values during LLMRails initialization. When a default embedding search provider is configured, the system backfills missing embedding_model and embedding_engine parameters from the configured embeddings model, ensuring the cache hash reflects the actual embedding configuration.

Changes

Cohort / File(s) Summary
Embedding provider initialization
nemoguardrails/rails/llm/llmrails.py
Updated LLMRails.__init__ to backfill embedding_model and embedding_engine into config.core.embedding_search_provider.parameters and config.knowledge_base.embedding_search_provider.parameters when the provider name is "default" and these keys are absent, using values from the configured embeddings model.
Embedding provider parameter tests
tests/test_llmrails.py
Added five comprehensive test cases validating parameter backfill behavior: (1) backfilling missing parameters for both core and knowledge_base; (2) preserving explicitly provided parameters without overwriting; (3) partial backfill for only missing keys; (4) skipping backfill for non-default custom providers; (5) no backfill when no embeddings model is configured.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The code changes implement the core fix for #1751 by backfilling embedding model parameters into embedding_search_provider.parameters, which makes the cache hash include the actual model name.
Out of Scope Changes check ✅ Passed All changes are directly scoped to addressing #1751: embedding parameter backfilling logic and comprehensive test coverage for that functionality.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Test Results For Major Changes ✅ Passed Bug fix PR includes comprehensive unit tests (+119 lines) and documented test plan validating KB cache invalidation behavior across multiple scenarios.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: backfilling embedding model parameters into search provider configuration to fix stale KB cache issues.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
tests/test_llmrails.py (1)

1439-1456: Consider also asserting knowledge_base backfill in this test.

This test verifies that custom providers are skipped, but only checks core.embedding_search_provider. The knowledge_base.embedding_search_provider defaults to name: "default" and should receive the backfill. Adding assertions for the knowledge_base would strengthen this test by verifying the selectivity of the backfill logic—ensuring it correctly applies to default providers while skipping custom ones in the same initialization.

💡 Optional enhancement to verify selective backfill
 def test_embedding_model_no_backfill_for_custom_provider():
     config = RailsConfig.parse_object(
         {
             **EMBEDDING_MODEL_CONFIG_BASE,
             "core": {
                 "embedding_search_provider": {
                     "name": "custom",
                     "parameters": {"some_param": "value"},
                 }
             },
         }
     )

     rails = LLMRails(config=config, llm=FakeLLM(responses=["  express greeting"]))

     assert "embedding_model" not in rails.config.core.embedding_search_provider.parameters
     assert "embedding_engine" not in rails.config.core.embedding_search_provider.parameters
     assert rails.config.core.embedding_search_provider.parameters["some_param"] == "value"
+    # knowledge_base remains default and should still be backfilled
+    assert rails.config.knowledge_base.embedding_search_provider.parameters["embedding_model"] == "intfloat/e5-large-v2"
+    assert rails.config.knowledge_base.embedding_search_provider.parameters["embedding_engine"] == "SentenceTransformers"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_llmrails.py` around lines 1439 - 1456, Extend
test_embedding_model_no_backfill_for_custom_provider to also assert that the
knowledge_base.embedding_search_provider received the backfill: after creating
rails = LLMRails(...), add assertions that
rails.config.knowledge_base.embedding_search_provider.parameters contains the
backfilled keys ("embedding_model" and/or "embedding_engine") and that their
values match the expected values from EMBEDDING_MODEL_CONFIG_BASE (so the custom
provider remains untouched while the default knowledge_base provider is
populated).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dfaf86f1-bea6-45cf-a0d1-0b1c2dcef416

📥 Commits

Reviewing files that changed from the base of the PR and between 94fc57e and db7ff10.

📒 Files selected for processing (2)
  • nemoguardrails/rails/llm/llmrails.py
  • tests/test_llmrails.py

@codecov

codecov bot commented Mar 31, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@Pouyanpi Pouyanpi self-assigned this Mar 31, 2026
@Pouyanpi Pouyanpi added the bug Something isn't working label Mar 31, 2026
@Pouyanpi Pouyanpi added this to the v0.22.0 milestone Mar 31, 2026
@Pouyanpi Pouyanpi changed the title fix(llmrails): invalidate KB cache when embedding model changes via models config fix(llmrails): backfill embedding model params into search provider config to fix stale KB cache Mar 31, 2026
@Pouyanpi Pouyanpi changed the title fix(llmrails): backfill embedding model params into search provider config to fix stale KB cache fix(llmrails): backfill embedding model params into search provider config (fixes stale KB cache) Mar 31, 2026
Collaborator

@tgasser-nv tgasser-nv left a comment


Looks good, approving! Not blocking for this PR, but items to consider as follow-ups:

  1. It feels a bit wrong for an object to mutate the config object passed in during initialization. I'd prefer to handle this at the Pydantic validation stage.
  2. Some embedding models (e.g., Gemini) have a dimensionality value in embedding_parameters which we'd need to add to the item hash. Without this, if a user changes the dimensionality we won't invalidate the cache, causing errors.
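The dimensionality follow-up could be sketched as below (a hedged illustration, not the actual kb.py code; the `embedding_parameters` / `dimensionality` key names are assumptions):

```python
import hashlib
import json

def cache_key_with_dims(params: dict, content: str) -> str:
    # Sketch of the follow-up idea: fold any embedding_parameters (e.g. a
    # Gemini dimensionality setting) into the hash prefix so that changing
    # them also invalidates the cache. Key names here are illustrative.
    prefix = (
        params.get("embedding_engine", "")
        + params.get("embedding_model", "")
        + json.dumps(params.get("embedding_parameters", {}), sort_keys=True)
    )
    return hashlib.md5((prefix + content).encode("utf-8")).hexdigest()

k768 = cache_key_with_dims(
    {"embedding_engine": "gemini", "embedding_parameters": {"dimensionality": 768}}, "kb"
)
k1536 = cache_key_with_dims(
    {"embedding_engine": "gemini", "embedding_parameters": {"dimensionality": 1536}}, "kb"
)
print(k768 != k1536)  # → True
```

Serializing the parameters with `sort_keys=True` keeps the key stable regardless of dict ordering.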

```python
]:
    if esp.name != "default":
        continue
    if "embedding_model" not in esp.parameters and model.model is not None:
```
Collaborator

nit: The Model's model and engine fields can't be None. The model field has its own validator, and engine isn't optional, so it can't be None. I think the `and model.{model,engine} is not None` checks can be removed


Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: KB cache not invalidated when embedding model changes

2 participants