Bug Description
After PR #4511 (adding support for streaming modes), the LangGraph integration started leaking tool-call outputs into the LLM input stream. Tool outputs are being forwarded as chat chunks and sent back to the model.
This was caused by my incorrect handling in the _to_chat_chunk function: the current implementation assumes tool outputs contain user-visible content and streams them, which was not the intended behavior.
Expected Behavior
The LangGraph chunk handler should properly distinguish between user-visible model output and dedicated tool-call inputs/outputs. Tool outputs should not be streamed back to the LLM as chat content.
I will submit a follow-up PR shortly to fix this behavior. Apologies for the confusion caused by the earlier change.
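The intended fix amounts to filtering on message type in the chunk handler. A minimal sketch of that filtering is below; the `to_chat_chunk` helper and the two message classes are simplified stand-ins for the LangChain message types, not the actual integration code:

```python
from dataclasses import dataclass
from typing import Optional


# Simplified stand-ins for langchain_core's AIMessageChunk / ToolMessage.
@dataclass
class AIMessageChunk:
    content: str


@dataclass
class ToolMessage:
    content: str
    tool_call_id: str


def to_chat_chunk(msg) -> Optional[str]:
    """Return user-visible text, or None for messages that must not be
    streamed as chat content."""
    if isinstance(msg, ToolMessage):
        # Tool output: consumed by the graph, never re-streamed to the model.
        return None
    if isinstance(msg, AIMessageChunk) and msg.content:
        # Model output: safe to stream to the user.
        return msg.content
    return None
```

The key point is that tool results short-circuit to `None` before any content check, so they can never be forwarded as chat chunks regardless of whether they happen to carry text.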
Reproduction Steps
1. Start a session with the LangGraph integration.
2. Trigger any tool call.
3. Observe the tool output being streamed back to the model as a chat chunk.
...
```python
import random
from datetime import datetime

# NOTE: import paths below assume recent livekit-agents / langchain releases;
# adjust to your installed versions.
from langchain.agents import create_agent
from langchain_core.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI
from livekit import agents
from livekit.agents import Agent, AgentServer, AgentSession
from livekit.plugins import openai, silero
from livekit.plugins.langchain import LLMAdapter


@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    weather = random.choice(["sunny", "cloudy", "rainy", "snowy"])
    temp = random.randint(-5, 35)
    return f"{city}: {weather}, {temp}C"


@tool
def get_time() -> str:
    """Get the current time."""
    return datetime.now().strftime("%H:%M")


# -- LangGraph agent --
model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
tools = [get_weather, get_time]
graph = create_agent(
    model=model,
    tools=tools,
    system_prompt="You are a helpful voice assistant. Keep responses to 1-2 sentences.",
)

# -- LiveKit voice agent --
server = AgentServer()


@server.rtc_session()
async def voice_session(ctx: agents.JobContext):
    session = AgentSession(
        stt=openai.STT(model="gpt-4o-transcribe"),
        tts=openai.TTS(voice="marin"),
        llm=LLMAdapter(graph=graph, stream_mode="messages"),
        vad=silero.VAD.load(),
    )
    await session.start(
        room=ctx.room,
        agent=Agent(instructions=""),
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
```
Operating System
macOS Tahoe
Models Used
gpt-4o-transcribe/gemini-2.5-flash/gpt-4o-mini-tts
Package Versions
Session/Room/Call IDs
No response
Proposed Solution
Additional Context
No response
Screenshots and Recordings
No response