ADK integration: LLM spans stuck at 'started' when context caching is enabled (SSE streaming) #5524

@lupuletic

Description

When using Opik's ADK integration (OpikTracer + track_adk_agent_recursive) with Google ADK's ContextCacheConfig enabled, LLM spans are never finalized — they remain at _OPIK_SPAN_STATUS: started with no output or usage data.

The issue only occurs in SSE streaming mode (Cloud Run / production). Locally with InMemorySessionService and no streaming, everything works correctly.

Environment

  • opik==1.10.25
  • google-adk==1.26.0
  • Python 3.12
  • Cloud Run (SSE streaming mode)
  • Vertex AI (GOOGLE_GENAI_USE_VERTEXAI=True)

Reproduction

Setup

from google.adk.agents import Agent
from google.adk.agents.context_cache_config import ContextCacheConfig
from google.adk.apps import App
from google.adk.models import Gemini  # import needed for Gemini(...) below; exact path may vary by ADK version
from opik.integrations.adk import OpikTracer, track_adk_agent_recursive

root_agent = Agent(
    name="my_agent",
    model=Gemini(model="gemini-3-flash-preview"),
    # ... tools, callbacks, etc.
)

tracer = OpikTracer(name="my-agent", project_name="my-project")
track_adk_agent_recursive(root_agent, tracer)

app = App(
    root_agent=root_agent,
    name="app",
    # This breaks Opik LLM spans:
    context_cache_config=ContextCacheConfig(
        min_tokens=2048,
        ttl_seconds=1800,
    ),
)

Steps

  1. Deploy to Cloud Run with SSE streaming enabled
  2. Send a message that triggers tool use (so there are 2 LLM calls)
  3. Check Opik traces — LLM spans show _OPIK_SPAN_STATUS: started, no output, no usage

Expected

LLM spans should show _OPIK_SPAN_STATUS: ready_for_finalization with output and usage data.

Actual

All LLM spans stuck at started. Cloud Run logs show repeated:

OPIK: No current span found in context for model output update

This warning comes from opik_tracer.py:280-284 where context_storage.top_span_data() returns None.

Root Cause Analysis

Opik uses contextvars.ContextVar for span tracking (OpikContextStorage). The before_model_callback pushes span data via context_storage.add_span_data(), and after_model_callback retrieves it via context_storage.top_span_data().
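The general failure mode can be reproduced with the stdlib alone. The sketch below (illustration only, not Opik's actual `OpikContextStorage`, which keeps a stack rather than a single slot) shows that a `ContextVar` set inside an `asyncio` task is invisible to the caller, because each task runs in a copy of the context:

```python
import asyncio
import contextvars

# A ContextVar holding "the current span", mimicking how Opik's
# OpikContextStorage tracks spans (illustration only).
current_span = contextvars.ContextVar("current_span", default=None)

async def before_model_callback():
    # Pushes span data, as Opik's before_model_callback does.
    current_span.set("llm-span-1")

async def after_model_callback():
    # Reads the span data back, as after_model_callback does.
    return current_span.get()

async def main():
    # Same execution context: the span set earlier is visible later.
    await before_model_callback()
    same_ctx = await after_model_callback()

    current_span.set(None)

    # A task runs in a *copy* of the caller's context, so its set()
    # never propagates back -- the caller sees None.
    await asyncio.create_task(before_model_callback())
    other_ctx = await after_model_callback()
    return same_ctx, other_ctx

same_ctx, other_ctx = asyncio.run(main())
```

When the two callbacks end up on opposite sides of any such context boundary, `top_span_data()` returns `None` and the span is never finalized.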

When ContextCacheConfig is enabled, ADK's GeminiContextCacheManager creates OTel spans inside the LLM generation flow:

# google_llm.py:175
with tracer.start_as_current_span('handle_context_caching') as span:
    cache_manager = GeminiContextCacheManager(self.api_client)
    cache_metadata = await cache_manager.handle_context_caching(llm_request)

And inside the cache manager:

# gemini_context_cache_manager.py:361
with tracer.start_as_current_span("create_cache") as span:
    cached_content = await self.genai_client.aio.caches.create(...)

These OTel start_as_current_span context managers, combined with SSE streaming (async generators plus PROGRESSIVE_SSE_STREAMING), put the two Opik callbacks in different execution contexts. As a result, the ContextVar state set by before_model_callback is invisible when after_model_callback runs inside the streaming generator's execution context.
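Why this combination is fragile can also be shown with the stdlib. The sketch below is a stand-in for OTel's attach/detach mechanics (the real OTel machinery differs, but the underlying `contextvars` constraint is the same): if a span context manager is entered in one context, its `Token` cannot be reset from another, and CPython rejects the exit outright:

```python
import asyncio
import contextvars
from contextlib import contextmanager

current_span = contextvars.ContextVar("current_span", default=None)

@contextmanager
def as_current_span(name):
    # A stdlib stand-in for OTel's start_as_current_span:
    # attach on enter, detach (ContextVar.reset) on exit.
    token = current_span.set(name)
    try:
        yield
    finally:
        current_span.reset(token)

async def enter_in_task():
    # Enter the span inside a task; the token belongs to the
    # task's copied context, not the caller's.
    cm = as_current_span("handle_context_caching")
    cm.__enter__()
    return cm

async def main():
    cm = await asyncio.create_task(enter_in_task())
    try:
        cm.__exit__(None, None, None)
        return "exited cleanly"
    except ValueError:
        # CPython refuses to reset a Token created in another Context.
        return "reset rejected"

outcome = asyncio.run(main())
```

Enter/exit pairs that straddle a context boundary either fail like this or silently operate on the wrong context, which is exactly the kind of interleaving that async generator suspension points introduce.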

Evidence

We tested systematically on staging:

| Deploy | Context caching | LLM span status |
| --- | --- | --- |
| Branch without context caching | OFF | ready_for_finalization |
| Main with context caching | ON | started |
| Branch with context caching removed | OFF | ready_for_finalization |

Same opik version (1.10.25) and ADK version (1.26.0) across all deploys.

Locally (no SSE streaming), context caching does NOT break Opik — both LLM spans finalize correctly.

Workaround

Disable context caching:

app = App(
    root_agent=root_agent,
    name="app",
    # context_cache_config=ContextCacheConfig(...)  # disabled
)

Suggested Fix

The contextvars approach for span tracking is fragile with async generators and OTel span context managers. Possible fixes:

  1. Use span ID tracking instead of contextvars stack: Store spans in a dict keyed by a correlation ID that's passed through the callback arguments, rather than relying on ContextVar stack ordering
  2. Copy context explicitly: When creating the streaming generator, explicitly copy the current contextvars context so span data is preserved
  3. Fallback mechanism in after_model_callback: When top_span_data() returns None, try to find the span by model name or other metadata before giving up
