test: add e2e tests for wire mode #707

stdrc · 2026-01-26T04:59:58Z

Signed-off-by: Richard Chien [email protected]

Related Issue

Resolve #(issue_number)

Description

Checklist

I have read the CONTRIBUTING document.
I have linked the related issue, if any.
I have added tests that prove my fix is effective or that my feature works.
I have run make gen-changelog to update the changelog.
I have run make gen-docs to update the user documentation.

Signed-off-by: Richard Chien <[email protected]>

devin-ai-integration

Devin Review found 2 potential issues.

View issues and 6 additional flags in Devin Review.

tests_e2e/wire_helpers.py

devin-ai-integration · 2026-01-26T05:02:47Z

tests_e2e/wire_helpers.py

+
+TRACE_ENV = "KIMI_TEST_TRACE"
+DEFAULT_TIMEOUT = 5.0
+_PATH_REPLACEMENTS: dict[str, str] = {}


🟡 Global _PATH_REPLACEMENTS dict accumulates state across tests without cleanup

The module-level _PATH_REPLACEMENTS dictionary accumulates path replacement mappings across all tests without ever being cleared. This can cause test interference when running multiple tests in the same process.

Click to expand

Issue Details

At line 17, a global mutable dictionary is defined:

_PATH_REPLACEMENTS: dict[str, str] = {}

Every call to make_home_dir() or make_work_dir() (lines 30-41) registers new path replacements via register_path_replacements() which calls _add_replacement() (lines 62-67):

def _add_replacement(path: Path | None, token: str) -> None: if path is None: return _PATH_REPLACEMENTS[str(path)] = token # Never cleared! resolved = path.resolve() _PATH_REPLACEMENTS[str(resolved)] = token

Impact

Since pytest runs tests in the same process, this dictionary grows unboundedly. More critically, in _replace_paths() (line 336), replacements are sorted by length and applied in order - if a path from a previous test happens to be a substring of a path in the current test, incorrect replacements could occur, leading to flaky test failures or incorrect snapshot comparisons.

Recommendation: Either clear _PATH_REPLACEMENTS at the start of each test (e.g., via a pytest fixture or by clearing it in make_home_dir/make_work_dir), or use a context-local approach instead of a global dictionary.

Was this helpful? React with 👍 or 👎 to provide feedback.

Copilot

Pull request overview

Adds a new tests_e2e suite to exercise kimi --wire (JSON-RPC + wire event stream) end-to-end, covering protocol handshake, prompts, approvals/tools, sessions, skills, and MCP behavior.

Changes:

Introduces tests_e2e helpers and multiple wire-mode E2E test modules using _scripted_echo plus snapshots.
Wires the new E2E suite into make test-kimi-cli and expands lint/type-check coverage to include tests_e2e.
Adds a tests_e2e/AGENTS.md guide documenting the wire E2E test matrix and scope.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
tests_e2e/wire_helpers.py	Adds subprocess-based harness for driving `kimi --wire`, plus normalization/summarization utilities.
tests_e2e/test_wire_protocol.py	Adds tests for initialize handshake and external tool registration/calls.
tests_e2e/test_wire_prompt.py	Adds tests for prompt flows, content parts input, max steps, and concurrency error behavior.
tests_e2e/test_wire_errors.py	Adds tests for JSON-RPC and LLM-related error handling in wire mode.
tests_e2e/test_wire_sessions.py	Adds tests around session persistence, `--continue`, `/clear`, and `/compact`.
tests_e2e/test_wire_approvals_tools.py	Adds extensive tests for approval flows and display blocks.
tests_e2e/test_wire_skills_mcp.py	Adds tests for skills/flows and MCP tool invocation via a local stdio server.
tests_e2e/test_wire_real_llm.py	Adds placeholder tests (skipped) for behaviors requiring a real provider.
tests_e2e/AGENTS.md	Documents E2E goals, rules, and a wire test matrix.
tests_e2e/init.py	Declares the new test package.
pyproject.toml	Adds Ruff per-file ignores for tests/tests_e2e and includes tests_e2e in type-checking inputs.
Makefile	Runs `pytest tests_e2e` as part of `make test-kimi-cli`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-26T05:05:13Z

tests_e2e/wire_helpers.py

+    )
+    selector = selectors.DefaultSelector()
+    assert process.stdout is not None
+    selector.register(process.stdout, selectors.EVENT_READ)
+    return WireProcess(process=process, selector=selector)


selectors.DefaultSelector() + selector.register(process.stdout, ...) is not portable on Windows (CI includes windows-2022). The default selector on Windows only supports sockets, so registering a pipe/file handle for process.stdout can raise an exception and break all e2e tests. Consider replacing this with a cross-platform approach (e.g., a background thread that reads stdout lines into a queue.Queue and get(timeout=...), or asyncio stream reader).

Copilot · 2026-01-26T05:05:13Z

tests_e2e/test_wire_protocol.py

+        assert normalize_response(resp) == snapshot(
+            {
+                "result": {
+                    "protocol_version": "1.1",
+                    "server": {"name": "Kimi CLI", "version": "0.84"},


The initialize snapshot hard-codes the server version as 0.84, but the repo's current package version is 0.87 (see pyproject.toml). This will cause the snapshot to fail whenever the package version changes. Prefer asserting server.name and that server.version is a non-empty string, or normalize/strip the version field before snapshotting.

Suggested change

assert normalize_response(resp) == snapshot(

{

"result": {

"protocol_version": "1.1",

"server": {"name": "Kimi CLI", "version": "0.84"},

# Validate server info without depending on the exact version string.

server = _as_dict(result.get("server"))

assert server.get("name") == "Kimi CLI"

assert isinstance(server.get("version"), str) and server["version"]

# Normalize the response for snapshotting by stripping the concrete version.

normalized = normalize_response(resp)

normalized_result = _as_dict(normalized.get("result"))

normalized_server = _as_dict(normalized_result.get("server"))

if "version" in normalized_server:

normalized_server["version"] = "<version>"

assert normalized == snapshot(

{

"result": {

"protocol_version": "1.1",

"server": {"name": "Kimi CLI", "version": "<version>"},

Copilot · 2026-01-26T05:05:13Z

tests_e2e/test_wire_protocol.py

+                "result": {
+                    "protocol_version": "1.1",
+                    "server": {"name": "Kimi CLI", "version": "0.84"},
+                    "slash_commands": [


Same issue as above: this snapshot hard-codes server.version (0.84), which will drift from the actual package version (pyproject.toml currently has 0.87) and make the test brittle. Consider removing server.version from the snapshot or dynamically asserting it.

Copilot · 2026-01-26T05:05:14Z

tests_e2e/test_wire_sessions.py

+def test_clear_context_rotates(tmp_path) -> None:
+    config_path = write_scripted_config(tmp_path, ["text: hello"])
+    work_dir = make_work_dir(tmp_path)
+    home_dir = make_home_dir(tmp_path)
+
+    wire = start_wire(


The test name suggests context file rotation, but the assertions expect the context file to be truncated and no rotated context.jsonl.* files (rotated == []). Rename the test to reflect the actual behavior (e.g., test_clear_context_truncates) or adjust the assertions to match the intended semantics.

Copilot · 2026-01-26T05:05:14Z

tests_e2e/wire_helpers.py

+    def close(self) -> None:
+        if self.process.stdin is not None:
+            with contextlib.suppress(Exception):
+                self.process.stdin.close()
+        if self.process.stdout is not None:
+            with contextlib.suppress(Exception):
+                self.process.stdout.close()
+        try:


WireProcess.close() closes stdin/stdout and the subprocess, but never closes/unregisters the selector (self.selector). This can leak file descriptors across many e2e tests. Close the selector (and/or unregister stdout) as part of cleanup.

Signed-off-by: Richard Chien <[email protected]>

e2e tests for wire mode

70d4fde

Signed-off-by: Richard Chien <[email protected]>

Copilot AI review requested due to automatic review settings January 26, 2026 04:59

Copilot started reviewing on behalf of stdrc January 26, 2026 05:00 View session

devin-ai-integration bot reviewed Jan 26, 2026

View reviewed changes

Copilot AI reviewed Jan 26, 2026

View reviewed changes

stdrc added 2 commits January 26, 2026 13:13

try fix

a12abad

Signed-off-by: Richard Chien <[email protected]>

fix tests

d1ef369

Signed-off-by: Richard Chien <[email protected]>

stdrc merged commit bcacefc into main Jan 26, 2026
28 checks passed

stdrc deleted the rc/wire-e2e branch January 26, 2026 06:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: add e2e tests for wire mode #707

test: add e2e tests for wire mode #707

stdrc commented Jan 26, 2026 •

edited by devin-ai-integration bot

Loading

Uh oh!

devin-ai-integration bot left a comment

Uh oh!

Uh oh!

devin-ai-integration bot Jan 26, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 26, 2026

Uh oh!

Copilot AI Jan 26, 2026

Uh oh!

Copilot AI Jan 26, 2026

Uh oh!

Copilot AI Jan 26, 2026

Uh oh!

Copilot AI Jan 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-        assert normalize_response(resp) == snapshot(
-            {
-                "result": {
-                    "protocol_version": "1.1",
-                    "server": {"name": "Kimi CLI", "version": "0.84"},
+        # Validate server info without depending on the exact version string.
+        server = _as_dict(result.get("server"))
+        assert server.get("name") == "Kimi CLI"
+        assert isinstance(server.get("version"), str) and server["version"]
+        # Normalize the response for snapshotting by stripping the concrete version.
+        normalized = normalize_response(resp)
+        normalized_result = _as_dict(normalized.get("result"))
+        normalized_server = _as_dict(normalized_result.get("server"))
+        if "version" in normalized_server:
+            normalized_server["version"] = "<version>"
+        assert normalized == snapshot(
+            {
+                "result": {
+                    "protocol_version": "1.1",
+                    "server": {"name": "Kimi CLI", "version": "<version>"},

test: add e2e tests for wire mode #707

test: add e2e tests for wire mode #707

Conversation

stdrc commented Jan 26, 2026 • edited by devin-ai-integration bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issue

Description

Checklist

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

devin-ai-integration bot Jan 26, 2026

Choose a reason for hiding this comment

Issue Details

Impact

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

stdrc commented Jan 26, 2026 •

edited by devin-ai-integration bot

Loading