Skip to content

Conversation

@stdrc
Copy link
Collaborator

@stdrc stdrc commented Jan 26, 2026

Signed-off-by: Richard Chien [email protected]

Related Issue

Resolve #(issue_number)

Description

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked the related issue, if any.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run make gen-changelog to update the changelog.
  • I have run make gen-docs to update the user documentation.

Open with Devin

Signed-off-by: Richard Chien <[email protected]>
Copilot AI review requested due to automatic review settings January 26, 2026 04:59
Copy link

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View issues and 6 additional flags in Devin Review.

Open in Devin Review


TRACE_ENV = "KIMI_TEST_TRACE"
DEFAULT_TIMEOUT = 5.0
_PATH_REPLACEMENTS: dict[str, str] = {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Global _PATH_REPLACEMENTS dict accumulates state across tests without cleanup

The module-level _PATH_REPLACEMENTS dictionary accumulates path replacement mappings across all tests without ever being cleared. This can cause test interference when running multiple tests in the same process.

Click to expand

Issue Details

At line 17, a global mutable dictionary is defined:

_PATH_REPLACEMENTS: dict[str, str] = {}

Every call to make_home_dir() or make_work_dir() (lines 30-41) registers new path replacements via register_path_replacements() which calls _add_replacement() (lines 62-67):

def _add_replacement(path: Path | None, token: str) -> None:
    if path is None:
        return
    _PATH_REPLACEMENTS[str(path)] = token  # Never cleared!
    resolved = path.resolve()
    _PATH_REPLACEMENTS[str(resolved)] = token

Impact

Since pytest runs tests in the same process, this dictionary grows unboundedly. More critically, in _replace_paths() (line 336), replacements are sorted by length and applied in order - if a path from a previous test happens to be a substring of a path in the current test, incorrect replacements could occur, leading to flaky test failures or incorrect snapshot comparisons.

Recommendation: Either clear _PATH_REPLACEMENTS at the start of each test (e.g., via a pytest fixture or by clearing it in make_home_dir/make_work_dir), or use a context-local approach instead of a global dictionary.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new tests_e2e suite to exercise kimi --wire (JSON-RPC + wire event stream) end-to-end, covering protocol handshake, prompts, approvals/tools, sessions, skills, and MCP behavior.

Changes:

  • Introduces tests_e2e helpers and multiple wire-mode E2E test modules using _scripted_echo plus snapshots.
  • Wires the new E2E suite into make test-kimi-cli and expands lint/type-check coverage to include tests_e2e.
  • Adds a tests_e2e/AGENTS.md guide documenting the wire E2E test matrix and scope.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests_e2e/wire_helpers.py Adds subprocess-based harness for driving kimi --wire, plus normalization/summarization utilities.
tests_e2e/test_wire_protocol.py Adds tests for initialize handshake and external tool registration/calls.
tests_e2e/test_wire_prompt.py Adds tests for prompt flows, content parts input, max steps, and concurrency error behavior.
tests_e2e/test_wire_errors.py Adds tests for JSON-RPC and LLM-related error handling in wire mode.
tests_e2e/test_wire_sessions.py Adds tests around session persistence, --continue, /clear, and /compact.
tests_e2e/test_wire_approvals_tools.py Adds extensive tests for approval flows and display blocks.
tests_e2e/test_wire_skills_mcp.py Adds tests for skills/flows and MCP tool invocation via a local stdio server.
tests_e2e/test_wire_real_llm.py Adds placeholder tests (skipped) for behaviors requiring a real provider.
tests_e2e/AGENTS.md Documents E2E goals, rules, and a wire test matrix.
tests_e2e/init.py Declares the new test package.
pyproject.toml Adds Ruff per-file ignores for tests/tests_e2e and includes tests_e2e in type-checking inputs.
Makefile Runs pytest tests_e2e as part of make test-kimi-cli.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 228 to 232
)
selector = selectors.DefaultSelector()
assert process.stdout is not None
selector.register(process.stdout, selectors.EVENT_READ)
return WireProcess(process=process, selector=selector)
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

selectors.DefaultSelector() + selector.register(process.stdout, ...) is not portable on Windows (CI includes windows-2022). The default selector on Windows only supports sockets, so registering a pipe/file handle for process.stdout can raise an exception and break all e2e tests. Consider replacing this with a cross-platform approach (e.g., a background thread that reads stdout lines into a queue.Queue and get(timeout=...), or asyncio stream reader).

Copilot uses AI. Check for mistakes.
Comment on lines 41 to 45
assert normalize_response(resp) == snapshot(
{
"result": {
"protocol_version": "1.1",
"server": {"name": "Kimi CLI", "version": "0.84"},
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initialize snapshot hard-codes the server version as 0.84, but the repo's current package version is 0.87 (see pyproject.toml). This will cause the snapshot to fail whenever the package version changes. Prefer asserting server.name and that server.version is a non-empty string, or normalize/strip the version field before snapshotting.

Suggested change
assert normalize_response(resp) == snapshot(
{
"result": {
"protocol_version": "1.1",
"server": {"name": "Kimi CLI", "version": "0.84"},
# Validate server info without depending on the exact version string.
server = _as_dict(result.get("server"))
assert server.get("name") == "Kimi CLI"
assert isinstance(server.get("version"), str) and server["version"]
# Normalize the response for snapshotting by stripping the concrete version.
normalized = normalize_response(resp)
normalized_result = _as_dict(normalized.get("result"))
normalized_server = _as_dict(normalized_result.get("server"))
if "version" in normalized_server:
normalized_server["version"] = "<version>"
assert normalized == snapshot(
{
"result": {
"protocol_version": "1.1",
"server": {"name": "Kimi CLI", "version": "<version>"},

Copilot uses AI. Check for mistakes.
Comment on lines 105 to 108
"result": {
"protocol_version": "1.1",
"server": {"name": "Kimi CLI", "version": "0.84"},
"slash_commands": [
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: this snapshot hard-codes server.version (0.84), which will drift from the actual package version (pyproject.toml currently has 0.87) and make the test brittle. Consider removing server.version from the snapshot or dynamically asserting it.

Copilot uses AI. Check for mistakes.
Comment on lines +143 to +148
def test_clear_context_rotates(tmp_path) -> None:
config_path = write_scripted_config(tmp_path, ["text: hello"])
work_dir = make_work_dir(tmp_path)
home_dir = make_home_dir(tmp_path)

wire = start_wire(
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name suggests context file rotation, but the assertions expect the context file to be truncated and no rotated context.jsonl.* files (rotated == []). Rename the test to reflect the actual behavior (e.g., test_clear_context_truncates) or adjust the assertions to match the intended semantics.

Copilot uses AI. Check for mistakes.
Comment on lines 173 to 180
def close(self) -> None:
if self.process.stdin is not None:
with contextlib.suppress(Exception):
self.process.stdin.close()
if self.process.stdout is not None:
with contextlib.suppress(Exception):
self.process.stdout.close()
try:
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WireProcess.close() closes stdin/stdout and the subprocess, but never closes/unregisters the selector (self.selector). This can leak file descriptors across many e2e tests. Close the selector (and/or unregister stdout) as part of cleanup.

Copilot uses AI. Check for mistakes.
stdrc added 2 commits January 26, 2026 13:13
Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
@stdrc stdrc merged commit bcacefc into main Jan 26, 2026
28 checks passed
@stdrc stdrc deleted the rc/wire-e2e branch January 26, 2026 06:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants