
fix(cuga-lite): handle exception in find_tools when shortlister LLM fails#67

Open
haroldship wants to merge 3 commits into main from
fix/jsondecoder-exception

Conversation

@haroldship
Collaborator

@haroldship haroldship commented Mar 19, 2026

Bug Fix Pull Request

Related Issue

Fixes #66

Description

Handles OutputParserException and other exceptions in find_tools when the shortlister LLM returns empty or malformed JSON, preventing the agent from crashing mid-conversation.

Type of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix that would cause existing functionality to not work as expected)

Root Cause

When the shortlister LLM returns empty or invalid JSON (e.g., due to rate limits, model errors, or unexpected output), chain.ainvoke raises an OutputParserException that was unhandled, causing find_tools to fail and the agent to crash.
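
As a minimal illustration of the root cause, the sketch below reproduces only the innermost failure using the standard library: parsing an empty model response as JSON. The helper name `parse_shortlist` is hypothetical and is not part of the project or of LangChain; in the real stack, LangChain wraps this `JSONDecodeError` in an `OutputParserException`.

```python
import json


def parse_shortlist(raw: str) -> dict:
    """Parse the shortlister's raw text as JSON (hypothetical helper)."""
    return json.loads(raw)


try:
    # An empty string stands in for an empty model response.
    parse_shortlist("")
except json.JSONDecodeError as exc:
    error_message = str(exc)
    print(error_message)  # Expecting value: line 1 column 1 (char 0)
```

The message matches the last line of the inner traceback reported for the issue.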

Solution

  • Wrap the chain.ainvoke call in a try/except block that catches exceptions and logs a warning
  • On failure, fall back to returning all available tools unfiltered via a new _format_all_tools_as_fallback helper method
  • This ensures the agent can still proceed with all tools available rather than crashing
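
The approach in the bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `ParseFailure`, `FailingChain`, `format_all_tools_as_fallback`, and the simplified `find_tools` signature are hypothetical stand-ins so the example runs without LangChain, and standard-library `logging` is used in place of the project's logger.

```python
import asyncio
import logging

logger = logging.getLogger("shortlister_sketch")


class ParseFailure(Exception):
    """Hypothetical stand-in for LangChain's OutputParserException."""


class FailingChain:
    """Hypothetical shortlister chain that always fails to parse its output."""

    async def ainvoke(self, inputs: dict) -> dict:
        raise ParseFailure("Invalid json output: ")


def format_all_tools_as_fallback(tool_names: list[str]) -> str:
    """Render every tool as a markdown list (simplified fallback)."""
    lines = ["# Available tools (unfiltered fallback)"]
    lines += [f"- {name}" for name in tool_names]
    return "\n".join(lines)


async def find_tools(query: str, tool_names: list[str], chain) -> str:
    try:
        result = await chain.ainvoke({"query": query})
    except Exception:
        # Log metadata only, then degrade to the unfiltered list.
        logger.warning(
            "Tool shortlisting failed (query length=%d)", len(query), exc_info=True
        )
        return format_all_tools_as_fallback(tool_names)
    return "\n".join(f"- {name}" for name in result["tools"])


fallback = asyncio.run(
    find_tools("list my accounts", ["get_accounts", "send_email"], FailingChain())
)
print(fallback)
```

Catching the broad `Exception` mirrors the description above ("catches exceptions and logs a warning"); a narrower `except` clause would be a reasonable alternative if only parser failures should trigger the fallback.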

Testing

  • I have tested this fix locally
  • I have added tests that prove my fix works
  • All new and existing tests passed
  • I have verified the bug no longer occurs

Checklist

  • My code follows the code style of this project
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if needed

Summary by CodeRabbit

  • Bug Fixes
    • Tool shortlisting now fails gracefully: if automatic selection errors, users receive a readable fallback markdown list of available tools instead of an error.
    • The fallback lists up to 20 tools, includes names and optional descriptions, and truncates long documentation for conciseness; warnings (including query length and error type) are logged for diagnostics.

…returns empty JSON

  When the shortlister model returns invalid/empty JSON (e.g. intermittent
  WatsonX failures), find_tools now catches the exception and falls back
  to returning all available tools unfiltered, allowing the agent to
  continue instead of surfacing a cryptic OutputParserException.
@coderabbitai

coderabbitai bot commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8244804b-6f51-491b-afc0-b9b61becf847

📥 Commits

Reviewing files that changed from the base of the PR and between c391ad6 and be91744.

📒 Files selected for processing (1)
  • src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

📝 Walkthrough

Added guarded error handling around the LLM shortlisting call in PromptUtils.find_tools. On exception the function logs a warning (including query length and exception) and returns a formatted markdown fallback list of tools. A new static helper _format_all_tools_as_fallback creates that fallback output.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Prompt utils (error handling & fallback): src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py | Wrapped the LLM shortlisting invocation (chain.ainvoke(...)) in try/except; on error log a warning with query length and exception (stack trace enabled) and return a markdown fallback listing. Added PromptUtils._format_all_tools_as_fallback(all_tools: List[StructuredTool]) -> str to enumerate up to 20 tools, include descriptions when present, and attempt per-tool doc extraction via PromptUtils.get_tool_docs with truncation and error handling. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I nudged the chain when prompts went astray,
If LLM trips, I’ll show the tray;
Names and notes in tidy mark,
A rabbit’s fallback lights the dark—
Tools in a row until the day.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: handling exceptions in find_tools when the shortlister LLM fails, which directly addresses the root cause of the bug. |
| Linked Issues check | ✅ Passed | The code changes fully address issue #66 by wrapping the chain.ainvoke call in try/except, logging warnings on failure, and providing a fallback tool listing to prevent OutputParserException from crashing the agent. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to handling the exception in find_tools and providing fallback functionality; no unrelated modifications were introduced. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@haroldship haroldship requested a review from sami-marreed March 19, 2026 09:50

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py`:
- Around line 266-267: The warning currently logs the raw user query (variable
"query") on failure; change the log to avoid sensitive content by removing the
full query and instead include safe metadata (e.g., len(query) and/or a hash),
the exception type/message (from "e") and the traceback; update the
logger.warning call near the Tool shortlisting failure to log something like
"Tool shortlisting failed for query (length=%d): %s" using len(query) and str(e)
and include exc_info=True so the traceback is recorded rather than the raw
prompt.
- Around line 381-402: The fallback that builds a full markdown of all tools can
produce extremely large output (context overflow); in the method that uses
all_tools and PromptUtils.get_tool_docs (the block that builds markdown_lines
and returns "\n".join(markdown_lines)), limit the size by truncating either the
number of tools and/or per-tool documentation: implement a MAX_TOOLS (e.g., 20)
and/or MAX_CHARS_PER_TOOL, iterate only up to MAX_TOOLS, and for each
params_doc/response_doc trim to MAX_CHARS_PER_TOOL with an appended
"...(truncated)"; also add a final note in the markdown indicating how many
tools were omitted and that the docs were truncated so callers know results are
partial. Ensure these constants are configurable and refer to the same symbols
(all_tools, PromptUtils.get_tool_docs, markdown_lines) so callers downstream get
bounded output.
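
The first finding above (log safe metadata rather than the raw query) might look roughly like the sketch below. It assumes standard-library `logging` instead of the project's logger, and the function name `log_shortlist_failure` and the 12-character digest prefix are hypothetical choices made for illustration.

```python
import hashlib
import logging

logger = logging.getLogger("shortlister_sketch")


def log_shortlist_failure(query: str, error: Exception) -> None:
    """Log a shortlisting failure without recording the query text itself."""
    # A short digest lets repeated failures be correlated without exposing content.
    query_digest = hashlib.sha256(query.encode("utf-8")).hexdigest()[:12]
    logger.warning(
        "Tool shortlisting failed (query_len=%d, query_sha256=%s, error_type=%s)",
        len(query),
        query_digest,
        type(error).__name__,
        exc_info=error,  # attaches the traceback of the caught exception
    )


try:
    raise ValueError("Invalid json output")
except ValueError as exc:
    log_shortlist_failure("what is my account balance?", exc)
```

Note that the exception message itself can still echo user input in some cases, so logging only the exception type (as here) is the more conservative option.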

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f12e2bd3-f7d7-4659-aa05-136afbe614eb

📥 Commits

Reviewing files that changed from the base of the PR and between 7cc28d9 and 6a4a08d.

📒 Files selected for processing (1)
  • src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

  - Remove raw query from warning log to avoid leaking sensitive user data;
    log query_len and error_type with full traceback instead
  - Cap fallback tool listing to 20 tools with 1200-char doc limit to
    prevent context-overflow in downstream LLM calls
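
A bounded fallback along the lines of this commit could be sketched as below. The limits of 20 tools and 1200 characters come from the commit message; everything else (the `ToolInfo` dataclass, the function names, and the markdown layout) is a hypothetical simplification and not the PR's actual `_format_all_tools_as_fallback`.

```python
from dataclasses import dataclass

MAX_TOOLS = 20        # cap on listed tools, per the commit message
MAX_DOC_CHARS = 1200  # cap on per-tool documentation, per the commit message


@dataclass
class ToolInfo:
    """Hypothetical stand-in for a tool object."""
    name: str
    description: str = ""
    docs: str = ""


def truncate(text: str, limit: int = MAX_DOC_CHARS) -> str:
    """Cut text to `limit` characters and mark it when something was removed."""
    if len(text) <= limit:
        return text
    return text[:limit] + "...(truncated)"


def format_tools_bounded(tools: list[ToolInfo]) -> str:
    lines = []
    for tool in tools[:MAX_TOOLS]:
        lines.append(f"## {tool.name}")
        if tool.description:
            lines.append(tool.description)
        if tool.docs:
            lines.append(truncate(tool.docs))
    omitted = len(tools) - MAX_TOOLS
    if omitted > 0:
        # Tell the caller that the listing is partial.
        lines.append(f"_{omitted} more tools omitted; documentation may be truncated._")
    return "\n".join(lines)


tools = [ToolInfo(name=f"tool_{i}", docs="x" * 2000) for i in range(25)]
output = format_tools_bounded(tools)
print(output.splitlines()[-1])
```

With 25 tools as input, only the first 20 are rendered and the closing note reports the remaining 5.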

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py`:
- Line 403: Wrap the call to PromptUtils.get_tool_docs(tool) in a defensive
try/except so the fallback path cannot raise; if PromptUtils.get_tool_docs(tool)
throws (e.g., due to non-serializable schema), catch the exception, optionally
log it, and set params_doc and response_doc to safe defaults (empty strings or
minimal docs) before continuing with the fallback. Ensure you update the block
that assigns params_doc, response_doc so the names are always defined even on
error, referencing PromptUtils.get_tool_docs and the local variables params_doc
and response_doc.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fae98c08-59bb-47ff-9170-36b771614041

📥 Commits

Reviewing files that changed from the base of the PR and between 6a4a08d and c391ad6.

📒 Files selected for processing (1)
  • src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py

  Wrap PromptUtils.get_tool_docs() in a try/except within
  _format_all_tools_as_fallback so a non-serializable tool schema
  cannot crash the fallback path that is meant to keep the agent running.
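
The defensive wrapper described in this commit can be illustrated with the sketch below. Both functions are hypothetical: `get_tool_docs` here is a toy extractor over plain dicts, not `PromptUtils.get_tool_docs`, and `safe_tool_docs` only shows the pattern of always defining `params_doc` and `response_doc`.

```python
def get_tool_docs(tool: dict) -> tuple[str, str]:
    """Toy doc extractor (hypothetical); raises when the schema is unusable."""
    schema = tool["schema"]
    if not isinstance(schema, dict):
        raise TypeError(f"unsupported schema type: {type(schema).__name__}")
    return str(schema.get("params", "")), str(schema.get("response", ""))


def safe_tool_docs(tool: dict) -> tuple[str, str]:
    """Return (params_doc, response_doc), falling back to empty strings on error."""
    try:
        return get_tool_docs(tool)
    except Exception:
        # The fallback path must never raise, so degrade to empty docs.
        return "", ""


params_doc, response_doc = safe_tool_docs({"schema": object()})
print(repr(params_doc), repr(response_doc))
```

A tool with a well-formed schema passes through unchanged, while a broken one simply contributes no documentation to the fallback listing.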
}
)
except Exception as e:
logger.bind(
Contributor

What is the reason this could fail here?

Collaborator Author

The reason it failed for OP was that the shortlister returned None and the JSON decoder threw an OutputParserException. That exception propagates up to here, where this code catches it along with any other exceptions. This is a sample stack trace:

Error during execution: OutputParserException('Invalid json output: \nFor troubleshooting, visit: https://docs.langchain.com/oss/python/langchain/errors/OUTPUT_PARSING_FAILURE ')
Traceback (most recent call last):
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/output_parsers/json.py", line 84, in parse_result
    return parse_json_markdown(text)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 164, in parse_json_markdown
    return _parse_json(json_str, parser=parser)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 194, in _parse_json
    return parser(json_str)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 137, in parse_partial_json
    return json.loads(s, strict=strict)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/__init__.py", line 365, in loads
    return cls(**kw).decode(s)
           ~~~~~~~~~~~~~~~~^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/decoder.py", line 345, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/decoder.py", line 363, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/code_executor.py", line 121, in eval_with_tools_async
    result = await executor.execute(
             ^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
    )
    ^
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/local/local_executor.py", line 80, in execute
    result_locals = await asyncio.wait_for(async_main(), timeout=timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/asyncio/tasks.py", line 507, in wait_for
    return await fut
           ^^^^^^^^^
  File "<string>", line 5, in _async_main
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py", line 121, in wrapper_with_pydantic
    result = await func(*args, **kwargs) if inspect.iscoroutinefunction(func) else func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py", line 375, in find_tools_func
    return await PromptUtils.find_tools(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        query=query, all_tools=filtered_tools, all_apps=filtered_apps, llm=llm

Development

Successfully merging this pull request may close these issues.

[Bug]: json OutputParserException

2 participants