fix(cuga-lite): handle exception in find_tools when shortlister LLM fails (#67)
haroldship wants to merge 3 commits into `main`.
Conversation
…returns empty JSON

When the shortlister model returns invalid/empty JSON (e.g. intermittent WatsonX failures), `find_tools` now catches the exception and falls back to returning all available tools unfiltered, allowing the agent to continue instead of surfacing a cryptic `OutputParserException`.
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough: Added guarded error handling around the LLM shortlisting call in `find_tools`.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py`:
- Around lines 266-267: The warning currently logs the raw user query (variable `query`) on failure. Avoid logging sensitive content: remove the full query and instead include safe metadata (e.g. `len(query)` and/or a hash), the exception type/message (from `e`), and the traceback. Update the `logger.warning` call near the tool-shortlisting failure to log something like `"Tool shortlisting failed for query (length=%d): %s"` using `len(query)` and `str(e)`, and pass `exc_info=True` so the traceback is recorded rather than the raw prompt.
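The suggested logging change can be sketched as follows. This is a minimal standalone version: `query` and `e` are the local variables named in the review, while the logger setup and hash prefix length are illustrative choices, not the PR's actual code.

```python
import hashlib
import logging

logger = logging.getLogger(__name__)

def log_shortlisting_failure(query: str, e: Exception) -> None:
    """Log a shortlisting failure without leaking the raw user query."""
    # Only safe metadata: the query's length and a short hash, never its text.
    query_hash = hashlib.sha256(query.encode("utf-8")).hexdigest()[:12]
    logger.warning(
        "Tool shortlisting failed for query (length=%d, sha256=%s): %s",
        len(query),
        query_hash,
        e,
        exc_info=True,  # record the traceback instead of the raw prompt
    )
```

Passing the exception via `exc_info=True` (or `exc_info=e`) lets handlers format the full traceback while the message itself stays free of user data.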
- Around lines 381-402: The fallback that builds a full markdown listing of all tools can produce extremely large output (context overflow). In the method that uses `all_tools` and `PromptUtils.get_tool_docs` (the block that builds `markdown_lines` and returns `"\n".join(markdown_lines)`), bound the size by truncating the number of tools and/or the per-tool documentation: introduce a `MAX_TOOLS` (e.g. 20) and/or `MAX_CHARS_PER_TOOL`, iterate only up to `MAX_TOOLS`, and trim each `params_doc`/`response_doc` to `MAX_CHARS_PER_TOOL` with an appended `"...(truncated)"`. Also add a final note in the markdown stating how many tools were omitted and that docs were truncated, so callers know results are partial. Make these constants configurable so downstream callers always receive bounded output.
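A bounded fallback along the lines the review suggests might look like this. The tool shape (plain dicts with a `doc` field) and the markdown layout are simplified stand-ins, not the PR's actual `_format_all_tools_as_fallback` or `PromptUtils.get_tool_docs`.

```python
MAX_TOOLS = 20             # suggested cap on number of tools listed
MAX_CHARS_PER_TOOL = 1200  # suggested cap on per-tool documentation

def format_all_tools_bounded(all_tools: list[dict]) -> str:
    """Build a markdown tool listing whose size is bounded in both dimensions."""
    markdown_lines: list[str] = []
    for tool in all_tools[:MAX_TOOLS]:
        doc = tool.get("doc", "")
        if len(doc) > MAX_CHARS_PER_TOOL:
            # Trim oversized docs and mark the cut explicitly.
            doc = doc[:MAX_CHARS_PER_TOOL] + "...(truncated)"
        markdown_lines.append(f"### {tool['name']}\n{doc}")
    omitted = max(0, len(all_tools) - MAX_TOOLS)
    if omitted:
        # Tell downstream callers the listing is partial.
        markdown_lines.append(
            f"_Note: {omitted} tools omitted and docs truncated; this listing is partial._"
        )
    return "\n".join(markdown_lines)
```

The two caps bound the worst case at roughly `MAX_TOOLS * MAX_CHARS_PER_TOOL` characters regardless of how many tools are registered.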
ℹ️ Review info
⚙️ Run configuration: defaults | Review profile: CHILL | Plan: Pro
Run ID: f12e2bd3-f7d7-4659-aa05-136afbe614eb
📒 Files selected for processing (1): src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py
- Remove raw query from warning log to avoid leaking sensitive user data;
log query_len and error_type with full traceback instead
- Cap fallback tool listing to 20 tools with 1200-char doc limit to
prevent context-overflow in downstream LLM calls
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py`:
- Line 403: Wrap the call to `PromptUtils.get_tool_docs(tool)` in a defensive try/except so the fallback path cannot raise. If `PromptUtils.get_tool_docs(tool)` throws (e.g. due to a non-serializable schema), catch the exception, optionally log it, and set `params_doc` and `response_doc` to safe defaults (empty strings or minimal docs) before continuing with the fallback. Ensure the block that assigns `params_doc` and `response_doc` always defines both names, even on error.
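A minimal sketch of the suggested defensive wrapper. Here `get_tool_docs` is a stand-in for `PromptUtils.get_tool_docs` (which may raise on a non-serializable schema); the dict-based tool shape is assumed for illustration.

```python
import logging

logger = logging.getLogger(__name__)

def get_tool_docs(tool: dict) -> tuple[str, str]:
    # Stand-in for PromptUtils.get_tool_docs; raises KeyError on a bad tool.
    return tool["params_doc"], tool["response_doc"]

def safe_tool_docs(tool: dict) -> tuple[str, str]:
    """Fetch tool docs, falling back to empty docs so the caller never raises."""
    try:
        params_doc, response_doc = get_tool_docs(tool)
    except Exception as e:
        # The fallback path must keep the agent running: log and default.
        logger.warning("get_tool_docs failed for %s: %s", tool.get("name", "?"), e)
        params_doc, response_doc = "", ""
    return params_doc, response_doc
```

Assigning both names inside the `except` branch guarantees they are defined on every path, which is the invariant the review asks for.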
ℹ️ Review info
⚙️ Run configuration: defaults | Review profile: CHILL | Plan: Pro
Run ID: fae98c08-59bb-47ff-9170-36b771614041
📒 Files selected for processing (1): src/cuga/backend/cuga_graph/nodes/cuga_lite/prompt_utils.py
Wrap PromptUtils.get_tool_docs() in a try/except within _format_all_tools_as_fallback so a non-serializable tool schema cannot crash the fallback path that is meant to keep the agent running.
Quoted diff context:

```python
            }
        )
    except Exception as e:
        logger.bind(
```
What is the reason this could fail here?
The reason it failed for OP was that the shortlister returned None and the JSON decoder threw an `OutputParserException`. That exception is passed through to here, where this code will catch it and any other exceptions. This is a sample stack trace:
```text
Error during execution: OutputParserException('Invalid json output: \nFor troubleshooting, visit: https://docs.langchain.com/oss/python/langchain/errors/OUTPUT_PARSING_FAILURE ')
Traceback (most recent call last):
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/output_parsers/json.py", line 84, in parse_result
    return parse_json_markdown(text)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 164, in parse_json_markdown
    return _parse_json(json_str, parser=parser)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 194, in _parse_json
    return parser(json_str)
  File "/root/proj/cuga-internal-evaluation/.venv/lib/python3.13/site-packages/langchain_core/utils/json.py", line 137, in parse_partial_json
    return json.loads(s, strict=strict)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/__init__.py", line 365, in loads
    return cls(**kw).decode(s)
           ~~~~~~~~~~~~~~~~^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/decoder.py", line 345, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/json/decoder.py", line 363, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/code_executor.py", line 121, in eval_with_tools_async
    result = await executor.execute(
             ^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
    )
    ^
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/local/local_executor.py", line 80, in execute
    result_locals = await asyncio.wait_for(async_main(), timeout=timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.13.12-linux-x86_64-gnu/lib/python3.13/asyncio/tasks.py", line 507, in wait_for
    return await fut
           ^^^^^^^^^
  File "<string>", line 5, in _async_main
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py", line 121, in wrapper_with_pydantic
    result = await func(*args, **kwargs) if inspect.iscoroutinefunction(func) else func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/proj/cuga-internal-evaluation/vendor/cuga-agent/src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py", line 375, in find_tools_func
    return await PromptUtils.find_tools(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        query=query, all_tools=filtered_tools, all_apps=filtered_apps, llm=llm
```
Bug Fix Pull Request
Related Issue
Fixes #66
Description
Handles `OutputParserException` and other exceptions in `find_tools` when the shortlister LLM returns empty or malformed JSON, preventing the agent from crashing mid-conversation.
Root Cause
When the shortlister LLM returns empty or invalid JSON (e.g., due to rate limits, model errors, or unexpected output), `chain.ainvoke` raises an `OutputParserException` that was unhandled, causing `find_tools` to fail and the agent to crash.
Solution
- Wrapped the `chain.ainvoke` call in a try/except block that catches exceptions and logs a warning
- Added a `_format_all_tools_as_fallback` helper method
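The fix described above can be sketched roughly as follows. `chain`, `query`, and the helper body here are illustrative stand-ins shaped after the PR description, not the actual `PromptUtils.find_tools` implementation.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

def _format_all_tools_as_fallback(all_tools: list[str]) -> str:
    # Minimal stand-in: list every available tool, unfiltered.
    return "\n".join(f"- {name}" for name in all_tools)

async def find_tools(query: str, all_tools: list[str], chain) -> str:
    """Shortlist tools via the LLM chain; fall back to all tools on failure."""
    try:
        return await chain.ainvoke({"query": query})
    except Exception as e:  # e.g. OutputParserException on empty/invalid JSON
        logger.warning(
            "Tool shortlisting failed (query length=%d): %s",
            len(query), e, exc_info=True,
        )
        return _format_all_tools_as_fallback(all_tools)
```

The broad `except Exception` mirrors the PR's intent: any shortlister failure degrades to the unfiltered tool list instead of crashing the agent mid-conversation.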