fix(cuga-lite): handle exception in find_tools when shortlister LLM fails by haroldship · Pull Request #67 · cuga-project/cuga-agent

haroldship · 2026-03-19T09:47:59Z

Bug Fix Pull Request

Related Issue

Fixes #66

Description

Handles OutputParserException and other exceptions in find_tools when the shortlister LLM returns empty or malformed JSON, preventing the agent from crashing mid-conversation.

Type of Changes

Bug fix (non-breaking change which fixes an issue)
Breaking change (fix that would cause existing functionality to not work as expected)

Root Cause

When the shortlister LLM returns empty or invalid JSON (e.g., due to rate limits, model errors, or unexpected output), chain.ainvoke raises an OutputParserException that was unhandled, causing find_tools to fail and the agent to crash.

Solution

Wrap the chain.ainvoke call in a try/except block that catches exceptions and logs a warning
On failure, fall back to returning all available tools unfiltered via a new _format_all_tools_as_fallback helper method
This ensures the agent can still proceed with all tools available rather than crashing
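
The approach in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's actual code: `shortlist_tools`, `format_all_tools`, and `flaky_chain` are hypothetical names, and a plain `ValueError` stands in for LangChain's `OutputParserException`.

```python
import asyncio
import logging

logger = logging.getLogger("find_tools_sketch")


def format_all_tools(tool_names):
    """Hypothetical fallback: list every tool as a markdown bullet."""
    return "\n".join(f"- {name}" for name in tool_names)


async def flaky_chain(query):
    """Hypothetical stand-in for chain.ainvoke that always fails to parse."""
    raise ValueError("Invalid json output")


async def shortlist_tools(query, tool_names, chain=flaky_chain):
    """Return the chain's shortlist, or all tools if the chain raises."""
    try:
        return await chain(query)
    except Exception as exc:
        # Warn and fall back so the caller still gets a usable answer.
        logger.warning("Tool shortlisting failed: %s", type(exc).__name__)
        return format_all_tools(tool_names)


result = asyncio.run(
    shortlist_tools("find email tools", ["send_email", "list_inbox"])
)
print(result)
```

Catching the broad `Exception` here mirrors the PR description ("OutputParserException and other exceptions"); a narrower `except` clause would let unexpected failures crash the agent again.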

Testing

I have tested this fix locally
I have added tests that prove my fix works
All new and existing tests passed
I have verified the bug no longer occurs

Checklist

My code follows the code style of this project
I have performed a self-review of my own code
I have made corresponding changes to the documentation if needed

Summary by CodeRabbit

Bug Fixes
- Tool shortlisting now fails gracefully: if automatic selection errors, users receive a readable fallback markdown list of available tools instead of an error.
- The fallback lists up to 20 tools, includes names and optional descriptions, and truncates long documentation for conciseness; system warnings (including query length and error type) are logged for diagnostics.

…returns empty JSON

When the shortlister model returns invalid/empty JSON (e.g. intermittent WatsonX failures), find_tools now catches the exception and falls back to returning all available tools unfiltered, allowing the agent to continue instead of surfacing a cryptic OutputParserException.

coderabbitai · 2026-03-19T09:48:14Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8244804b-6f51-491b-afc0-b9b61becf847

📥 Commits

Reviewing files that changed from the base of the PR and between c391ad6 and be91744.

📒 Files selected for processing (1)

src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

📝 Walkthrough

Walkthrough

Added guarded error handling around the LLM shortlisting call in PromptUtils.find_tools. On exception the function logs a warning (including query length and exception) and returns a formatted markdown fallback list of tools. A new static helper _format_all_tools_as_fallback creates that fallback output.

Changes

Cohort / File(s)	Summary
Prompt utils (error handling & fallback) `src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py`	Wrapped the LLM shortlisting invocation (`chain.ainvoke(...)`) in `try/except`; on error log a warning with query length and exception (stack trace enabled) and return a markdown fallback listing. Added `PromptUtils._format_all_tools_as_fallback(all_tools: List[StructuredTool]) -> str` to enumerate up to 20 tools, include descriptions when present, and attempt per-tool doc extraction via `PromptUtils.get_tool_docs` with truncation and error handling.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I nudged the chain when prompts went astray,
If LLM trips, I’ll show the tray;
Names and notes in tidy mark,
A rabbit’s fallback lights the dark—
Tools in a row until the day.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the main change: handling exceptions in find_tools when the shortlister LLM fails, which directly addresses the root cause of the bug.
Linked Issues check	✅ Passed	The code changes fully address issue `#66` by wrapping the chain.ainvoke call in try/except, logging warnings on failure, and providing a fallback tool listing to prevent OutputParserException from crashing the agent.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to handling the exception in find_tools and providing fallback functionality; no unrelated modifications were introduced.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py`:
- Around line 266-267: The warning currently logs the raw user query (variable
"query") on failure; change the log to avoid sensitive content by removing the
full query and instead include safe metadata (e.g., len(query) and/or a hash),
the exception type/message (from "e") and the traceback; update the
logger.warning call near the Tool shortlisting failure to log something like
"Tool shortlisting failed for query (length=%d): %s" using len(query) and str(e)
and include exc_info=True so the traceback is recorded rather than the raw
prompt.
- Around line 381-402: The fallback that builds a full markdown of all tools can
produce extremely large output (context overflow); in the method that uses
all_tools and PromptUtils.get_tool_docs (the block that builds markdown_lines
and returns "\n".join(markdown_lines)), limit the size by truncating either the
number of tools and/or per-tool documentation: implement a MAX_TOOLS (e.g., 20)
and/or MAX_CHARS_PER_TOOL, iterate only up to MAX_TOOLS, and for each
params_doc/response_doc trim to MAX_CHARS_PER_TOOL with an appended
"...(truncated)"; also add a final note in the markdown indicating how many
tools were omitted and that the docs were truncated so callers know results are
partial. Ensure these constants are configurable and refer to the same symbols
(all_tools, PromptUtils.get_tool_docs, markdown_lines) so callers downstream get
bounded output.
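
One way to bound the fallback output along the lines suggested above is sketched here. It is an assumption-laden illustration: `format_tools_bounded` and `truncate` are hypothetical helpers, tools are modelled as plain `(name, doc)` pairs rather than `StructuredTool` objects, and the 1200-character limit is borrowed from the follow-up commit message in this thread.

```python
MAX_TOOLS = 20             # cap suggested in the review comment
MAX_CHARS_PER_TOOL = 1200  # per-tool doc limit (assumed default for this sketch)
TRUNCATION_MARK = "...(truncated)"


def truncate(text, limit=MAX_CHARS_PER_TOOL):
    """Trim text to `limit` characters, marking the cut when one happens."""
    if len(text) <= limit:
        return text
    return text[:limit] + TRUNCATION_MARK


def format_tools_bounded(tools, max_tools=MAX_TOOLS, max_chars=MAX_CHARS_PER_TOOL):
    """Build a bounded markdown listing from (name, doc) pairs."""
    lines = ["# Available tools (fallback)"]
    for name, doc in tools[:max_tools]:
        lines.append(f"## {name}")
        if doc:
            lines.append(truncate(doc, max_chars))
    omitted = len(tools) - max_tools
    if omitted > 0:
        # Tell the caller that the listing is partial.
        lines.append(f"_{omitted} more tools omitted; docs may be truncated._")
    return "\n".join(lines)


tools = [(f"tool_{i}", "x" * 2000) for i in range(25)]
markdown = format_tools_bounded(tools)
print(markdown.splitlines()[-1])
```

With 25 tools and 2000-character docs, the listing keeps the first 20 tools, trims each doc, and ends with a note about the 5 omitted tools.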

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f12e2bd3-f7d7-4659-aa05-136afbe614eb

📥 Commits

Reviewing files that changed from the base of the PR and between 7cc28d9 and 6a4a08d.

📒 Files selected for processing (1)

src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

- Remove raw query from warning log to avoid leaking sensitive user data; log query_len and error_type with full traceback instead
- Cap fallback tool listing to 20 tools with 1200-char doc limit to prevent context-overflow in downstream LLM calls
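
The logging change can be illustrated with the standard library. This sketch is not the project's code (the diff fragment later in this thread shows `logger.bind(`, so the project's logger API likely differs); `ListHandler` and `log_shortlist_failure` are hypothetical names used only to show that the record carries the query length, the error type, and a traceback, but not the query text.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted records in memory so the sketch can inspect them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


logger = logging.getLogger("shortlister_sketch")
logger.setLevel(logging.WARNING)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


def log_shortlist_failure(query, exc):
    """Log safe metadata about the failure instead of the raw query."""
    logger.warning(
        "Tool shortlisting failed (query_len=%d, error_type=%s)",
        len(query),
        type(exc).__name__,
        exc_info=exc,  # attaches this exception's traceback to the record
    )


try:
    raise ValueError("Invalid json output")
except ValueError as err:
    log_shortlist_failure("my secret account number is 1234", err)

logged = handler.lines[0]
print(logged.splitlines()[0])
```

Note that the exception message itself can still echo model output, so whether to log `str(exc)` as well is a separate judgment call.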

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py`:
- Line 403: Wrap the call to PromptUtils.get_tool_docs(tool) in a defensive
try/except so the fallback path cannot raise; if PromptUtils.get_tool_docs(tool)
throws (e.g., due to non-serializable schema), catch the exception, optionally
log it, and set params_doc and response_doc to safe defaults (empty strings or
minimal docs) before continuing with the fallback. Ensure you update the block
that assigns params_doc, response_doc so the names are always defined even on
error, referencing PromptUtils.get_tool_docs and the local variables params_doc
and response_doc.
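
A defensive wrapper of the kind described above might look like this. The sketch assumes that `get_tool_docs` returns a `(params_doc, response_doc)` pair, which is inferred from the review comment rather than confirmed; `get_tool_docs` here is a hypothetical stand-in operating on plain dicts, and `safe_tool_docs` is an illustrative name.

```python
def get_tool_docs(tool):
    """Hypothetical stand-in for PromptUtils.get_tool_docs; may raise."""
    if tool.get("broken"):
        raise TypeError("schema is not serializable")
    return tool["params"], tool["response"]


def safe_tool_docs(tool):
    """Return (params_doc, response_doc), falling back to empty strings."""
    params_doc, response_doc = "", ""  # defined up front, so never unbound
    try:
        params_doc, response_doc = get_tool_docs(tool)
    except Exception:
        # Swallow the error: the fallback path itself must not raise.
        pass
    return params_doc, response_doc


good = safe_tool_docs({"params": "query: str", "response": "list of hits"})
bad = safe_tool_docs({"broken": True})
print(good, bad)
```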

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fae98c08-59bb-47ff-9170-36b771614041

📥 Commits

Reviewing files that changed from the base of the PR and between 6a4a08d and c391ad6.

📒 Files selected for processing (1)

src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

Wrap PromptUtils.get_tool_docs() in a try/except within _format_all_tools_as_fallback so a non-serializable tool schema cannot crash the fallback path that is meant to keep the agent running.

sami-marreed · 2026-03-19T19:31:28Z

src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

+                }
+            )
+        except Exception as e:
+            logger.bind(


What is the reason this could fail here?

The reason it failed for OP was that the shortlister returned None and the JSON decoder threw an OutputParserException. This is passed through to here, where this code will catch it and any other exceptions. This is a sample stack trace:

Error during execution: OutputParserException('Invalid json output: \nFor troubleshooting, visit: https://docs.langchain.com/oss/python/langchain/errors/OUTPUT_PARSING_FAILURE ')
Traceback (most recent call last):
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/output_parsers/json.py", line 84, in parse_result
    return parse_json_markdown(text)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 164, in parse_json_markdown
    return _parse_json(json_str, parser=parser)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 194, in _parse_json
    return parser(json_str)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 137, in parse_partial_json
    return json.loads(s, strict=strict)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/__init__.py", line 365, in loads
    return cls(**kw).decode(s)
           ~~~~~~~~~~~~~~~~^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/decoder.py", line 345, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/decoder.py", line 363, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/code_executor.py", line 121, in eval_with_tools_async
    result = await executor.execute(
             ^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
    )
    ^
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/local/local_executor.py", line 80, in execute
    result_locals = await asyncio.wait_for(async_main(), timeout=timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/asyncio/tasks.py", line 507, in wait_for
    return await fut
           ^^^^^^^^^
  File "<string>", line 5, in _async_main
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py", line 121, in wrapper_with_pydantic
    result = await func(*args, **kwargs) if inspect.iscoroutinefunction(func) else func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py", line 375, in find_tools_func
    return await PromptUtils.find_tools(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        query=query, all_tools=filtered_tools, all_apps=filtered_apps, llm=llm
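
The innermost error in the trace above can be reproduced with the standard library alone. This small sketch only demonstrates how `json.loads` behaves on an empty string; it does not involve LangChain, which wraps this error in its own `OutputParserException`.

```python
import json

try:
    json.loads("")  # an empty model response, as in the trace above
except json.JSONDecodeError as exc:
    message = str(exc)
    print(message)  # Expecting value: line 1 column 1 (char 0)
```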

haroldship requested a review from sami-marreed March 19, 2026 09:50


fix: guard fallback doc extraction against exceptions

be91744

haroldship wants to merge 3 commits into main from fix/jsondecoder-exception


2 participants
