ADK integration: LLM spans stuck at 'started' when context caching is enabled (SSE streaming) #5524

@lupuletic

Description

When using Opik's ADK integration (OpikTracer + track_adk_agent_recursive) with Google ADK's ContextCacheConfig enabled, LLM spans are never finalized — they remain at _OPIK_SPAN_STATUS: started with no output or usage data.

The issue only occurs in SSE streaming mode (Cloud Run / production). Locally with InMemorySessionService and no streaming, everything works correctly.

Environment

  • opik==1.10.25
  • google-adk==1.26.0
  • Python 3.12
  • Cloud Run (SSE streaming mode)
  • Vertex AI (GOOGLE_GENAI_USE_VERTEXAI=True)

Reproduction

Setup

from google.adk.agents import Agent
from google.adk.agents.context_cache_config import ContextCacheConfig
from google.adk.apps import App
from google.adk.models import Gemini  # import needed for Gemini(...) below; exact path may vary by ADK version
from opik.integrations.adk import OpikTracer, track_adk_agent_recursive

root_agent = Agent(
    name="my_agent",
    model=Gemini(model="gemini-3-flash-preview"),
    # ... tools, callbacks, etc.
)

tracer = OpikTracer(name="my-agent", project_name="my-project")
track_adk_agent_recursive(root_agent, tracer)

app = App(
    root_agent=root_agent,
    name="app",
    # This breaks Opik LLM spans:
    context_cache_config=ContextCacheConfig(
        min_tokens=2048,
        ttl_seconds=1800,
    ),
)

Steps

  1. Deploy to Cloud Run with SSE streaming enabled
  2. Send a message that triggers tool use (so there are 2 LLM calls)
  3. Check Opik traces — LLM spans show _OPIK_SPAN_STATUS: started, no output, no usage

Expected

LLM spans should show _OPIK_SPAN_STATUS: ready_for_finalization with output and usage data.

Actual

All LLM spans stuck at started. Cloud Run logs show repeated:

OPIK: No current span found in context for model output update

This warning comes from opik_tracer.py:280-284 where context_storage.top_span_data() returns None.

Root Cause Analysis

Opik uses contextvars.ContextVar for span tracking (OpikContextStorage). The before_model_callback pushes span data via context_storage.add_span_data(), and after_model_callback retrieves it via context_storage.top_span_data().
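The general failure mode can be reproduced with the stdlib alone. The sketch below (illustration only, not Opik's actual `OpikContextStorage`, which keeps a stack rather than a single slot) shows that a `ContextVar` set inside an `asyncio` task is invisible to the caller, because each task runs in a copy of the context:

```python
import asyncio
import contextvars

# A ContextVar holding "the current span", mimicking how Opik's
# OpikContextStorage tracks spans (illustration only).
current_span = contextvars.ContextVar("current_span", default=None)

async def before_model_callback():
    # Pushes span data, as Opik's before_model_callback does.
    current_span.set("llm-span-1")

async def after_model_callback():
    # Reads the span data back, as after_model_callback does.
    return current_span.get()

async def main():
    # Same execution context: the span set earlier is visible later.
    await before_model_callback()
    same_ctx = await after_model_callback()

    current_span.set(None)

    # A task runs in a *copy* of the caller's context, so its set()
    # never propagates back -- the caller sees None.
    await asyncio.create_task(before_model_callback())
    other_ctx = await after_model_callback()
    return same_ctx, other_ctx

same_ctx, other_ctx = asyncio.run(main())
```

When the two callbacks end up on opposite sides of any such context boundary, `top_span_data()` returns `None` and the span is never finalized.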

When ContextCacheConfig is enabled, ADK's GeminiContextCacheManager creates OTel spans inside the LLM generation flow:

# google_llm.py:175
with tracer.start_as_current_span('handle_context_caching') as span:
    cache_manager = GeminiContextCacheManager(self.api_client)
    cache_metadata = await cache_manager.handle_context_caching(llm_request)

And inside the cache manager:

# gemini_context_cache_manager.py:361
with tracer.start_as_current_span("create_cache") as span:
    cached_content = await self.genai_client.aio.caches.create(...)

These OTel start_as_current_span context managers, combined with SSE streaming (async generators plus PROGRESSIVE_SSE_STREAMING), put the two Opik callbacks in different execution contexts. As a result, the ContextVar state set by before_model_callback is invisible when after_model_callback runs inside the streaming generator's execution context.
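Why this combination is fragile can also be shown with the stdlib. The sketch below is a stand-in for OTel's attach/detach mechanics (the real OTel machinery differs, but the underlying `contextvars` constraint is the same): if a span context manager is entered in one context, its `Token` cannot be reset from another, and CPython rejects the exit outright:

```python
import asyncio
import contextvars
from contextlib import contextmanager

current_span = contextvars.ContextVar("current_span", default=None)

@contextmanager
def as_current_span(name):
    # A stdlib stand-in for OTel's start_as_current_span:
    # attach on enter, detach (ContextVar.reset) on exit.
    token = current_span.set(name)
    try:
        yield
    finally:
        current_span.reset(token)

async def enter_in_task():
    # Enter the span inside a task; the token belongs to the
    # task's copied context, not the caller's.
    cm = as_current_span("handle_context_caching")
    cm.__enter__()
    return cm

async def main():
    cm = await asyncio.create_task(enter_in_task())
    try:
        cm.__exit__(None, None, None)
        return "exited cleanly"
    except ValueError:
        # CPython refuses to reset a Token created in another Context.
        return "reset rejected"

outcome = asyncio.run(main())
```

Enter/exit pairs that straddle a context boundary either fail like this or silently operate on the wrong context, which is exactly the kind of interleaving that async generator suspension points introduce.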

Evidence

We tested systematically on staging:

| Deploy | Context caching | LLM span status |
| --- | --- | --- |
| Branch without context caching | OFF | ready_for_finalization |
| Main with context caching | ON | started |
| Branch with context caching removed | OFF | ready_for_finalization |

Same opik version (1.10.25) and ADK version (1.26.0) across all deploys.

Locally (no SSE streaming), context caching does NOT break Opik — both LLM spans finalize correctly.

Workaround

Disable context caching:

app = App(
    root_agent=root_agent,
    name="app",
    # context_cache_config=ContextCacheConfig(...)  # disabled
)

Suggested Fix

The contextvars approach for span tracking is fragile with async generators and OTel span context managers. Possible fixes:

  1. Use span ID tracking instead of contextvars stack: Store spans in a dict keyed by a correlation ID that's passed through the callback arguments, rather than relying on ContextVar stack ordering
  2. Copy context explicitly: When creating the streaming generator, explicitly copy the current contextvars context so span data is preserved
  3. Fallback mechanism in after_model_callback: When top_span_data() returns None, try to find the span by model name or other metadata before giving up
