Bug: merge_dicts concatenates ls_provider across streaming chunks, inflating response_metadata #36993

@GiovaneIwamoto

Description

Submission checklist

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Related Issues / PRs

Reproduction Steps / Example Code (Python)

from langchain_core.utils._merge import merge_dicts

# Simulate two streaming chunks from ChatBedrockConverse (langchain-aws>=1.4.5).
# Each chunk carries ls_provider="amazon_bedrock" in response_metadata.
left = {"ls_provider": "amazon_bedrock", "model_provider": "bedrock_converse"}
right = {"ls_provider": "amazon_bedrock", "model_provider": "bedrock_converse"}
result = merge_dicts(left, right)

print("model_provider:", result["model_provider"])  # "bedrock_converse" (correct)
print("ls_provider:", result["ls_provider"])         # "amazon_bedrockamazon_bedrock" (BUG)

# After 20 streaming chunks (typical response):
metadata = {"ls_provider": "amazon_bedrock"}
for _ in range(19):
    metadata = merge_dicts(metadata, {"ls_provider": "amazon_bedrock"})
print(f"After 20 chunks: {len(metadata['ls_provider'])} chars")  # 280 instead of 14

Error Message and Stack Trace (if applicable)

No explicit error is raised. The reproduction above shows that ls_provider grows
to N * len("amazon_bedrock") characters after N streaming chunks are merged. In our
production environment (LangGraph agent, 58+ graph steps, langchain-aws==1.4.5), we
observed inflated ls_provider fields on every AIMessage in the conversation history,
which contributes to request payload bloat.
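
To check whether a stored metadata value has already been inflated this way, a small diagnostic along the following lines may help. This is only a sketch: `smallest_repeating_unit` is a hypothetical helper, not part of LangChain, and it assumes the original value is not itself periodic (a provider name such as "aa" would be reported as "a" repeated twice).

```python
def smallest_repeating_unit(value: str) -> tuple[str, int]:
    """Return (unit, count) such that unit * count == value, with the shortest unit."""
    length = len(value)
    for size in range(1, length + 1):
        # Only unit lengths that divide the total length can tile the string.
        if length % size == 0 and value[:size] * (length // size) == value:
            return value[:size], length // size
    # Empty input: nothing to collapse.
    return value, 1


inflated = "amazon_bedrock" * 20
unit, count = smallest_repeating_unit(inflated)
print(unit, count, len(inflated))
```

A count greater than 1 for ls_provider suggests the message was assembled from that many merged chunks.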

Description

langchain-aws v1.4.5 (PR langchain-ai/langchain-aws#981) adds response_metadata["ls_provider"] = "amazon_bedrock" to each streaming chunk in ChatBedrockConverse._stream(). This is needed for SummarizationMiddleware to identify the provider from message metadata.

However, merge_dicts() concatenates string values when merging chunks. The skip-concatenation guard on line 59 of _merge.py protects model_provider (also set on every chunk) but does not protect ls_provider:

right_k in {"id", "output_version", "model_provider"}  # ls_provider missing

After merging ~20 chunks, ls_provider becomes "amazon_bedrock" repeated 20 times (280 chars). Every AIMessage in conversation history carries this inflated metadata.

Proposed fix: add "ls_provider" to the existing guard, matching the pattern already established for "model_provider". I have a fix ready with tests if assigned.
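
The effect of the proposed change can be illustrated with a simplified stand-in for the string-merging branch. This is a sketch under stated assumptions, not the actual langchain_core code: `merge_flat`, `merge_chunks`, and the two key sets are hypothetical names, only flat string-valued dicts are handled, and the stand-in assumes a guarded key is left unchanged when both sides hold the same value (the real guard may differ in detail).

```python
# Hypothetical stand-in for the string branch of merge_dicts; not the real implementation.
CURRENT_GUARD = frozenset({"id", "output_version", "model_provider"})
PROPOSED_GUARD = CURRENT_GUARD | {"ls_provider"}


def merge_flat(left: dict, right: dict, guard: frozenset) -> dict:
    """Merge two flat string dicts, concatenating values unless the key is guarded."""
    merged = dict(left)
    for key, value in right.items():
        if key not in merged:
            merged[key] = value
        elif key in guard and merged[key] == value:
            # Assumed guard behavior: identical values for a guarded key are kept once.
            continue
        else:
            merged[key] += value
    return merged


def merge_chunks(n_chunks: int, guard: frozenset) -> dict:
    """Merge n_chunks identical metadata dicts, as happens across streaming chunks."""
    chunk = {"ls_provider": "amazon_bedrock", "model_provider": "bedrock_converse"}
    metadata = dict(chunk)
    for _ in range(n_chunks - 1):
        metadata = merge_flat(metadata, chunk, guard)
    return metadata


before = merge_chunks(20, CURRENT_GUARD)
after = merge_chunks(20, PROPOSED_GUARD)
print(len(before["ls_provider"]))  # 280
print(len(after["ls_provider"]))   # 14
```

With the current guard set, only model_provider stays stable; adding "ls_provider" to the set keeps both fields at their original length regardless of how many chunks are merged.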

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 25.4.0: Thu Mar 19 19:33:09 PDT 2026; root:xnu-12377.101.15~1/RELEASE_ARM64_T8112
Python Version: 3.14.3 (main, Feb 3 2026, 15:32:20) [Clang 17.0.0 (clang-1700.6.3.2)]

Package Information

langchain_core: 1.3.2
langsmith: 0.7.36
langchain_protocol: 0.0.11

Optional packages not installed

deepagents
deepagents-cli

Other Dependencies

httpx: 0.28.1
jsonpatch: 1.33
orjson: 3.11.8
packaging: 26.1
pydantic: 2.13.3
pytest: 9.0.3
pyyaml: 6.0.3
requests: 2.33.1
requests-toolbelt: 1.0.0
tenacity: 9.1.4
typing-extensions: 4.15.0
uuid-utils: 0.14.1
xxhash: 3.6.0
zstandard: 0.25.0

Metadata

Assignees

No one assigned

    Labels

    bug: Related to a bug, vulnerability, unexpected error with an existing feature
    core: `langchain-core` package issues & PRs
    external

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests