feat(desktop): native desktop automation MCP tool (closes #189) by RajdeepKushwaha5 · Pull Request #633 · accomplish-ai/accomplish

RajdeepKushwaha5 · 2026-02-22T18:51:38Z

Summary

Adds desktop-control, a new MCP tool that gives the agent seven native desktop
automation capabilities beyond the browser, working across macOS, Windows, and
Linux with zero compiled native Node.js dependencies.

Tools

Tool	Description	Permission
`desktop_screenshot`	Capture primary display as PNG	None (read-only)
`desktop_mouse_click`	Click (left/right/double) at screen coordinates	✅
`desktop_mouse_move`	Move cursor to screen coordinates	✅
`desktop_type_text`	Type Unicode text at caret	✅
`desktop_key_press`	Press key combos (`ctrl+c`, `Return`, `F5`, …)	✅
`desktop_list_windows`	List visible windows as JSON	None (read-only)
`desktop_focus_window`	Bring application to foreground	✅

Security design

Permission fails closed: every catch {} in the permission check returns
false. The server never defaults to "allow" when the permission API is
unreachable (contrast: competing PR feat: add native desktop automation MCP tool #606 catch { return true }).
execFile() everywhere: no exec() or shell template strings with user
input embedded. All user-controlled values are passed as discrete typed
arguments or out-of-band via temp files.
Injection-proof typing: desktop_type_text uses pbcopy stdin on macOS,
PowerShell clipboard on Windows, and xdotool type -- text on Linux — the
text being typed is never interpolated into any command string.
AppleScript escaping: window names passed to osascript are sanitised
through escapeAppleScriptString().

Platform implementation

macOS (built-in tools only, except mouse_move):

screencapture -x → reads file → base64 PNG (actual PNG bytes, not raw pixels)
osascript + System Events for click, keystroke, key code, window management
mouse_move: requires cliclick (brew install cliclick), documented in SKILL.md

Windows (built-in PowerShell only):

Parameterised .ps1 scripts — output path / arguments passed as typed
PowerShell parameters, never string-interpolated
System.Windows.Forms for screenshot + clipboard; P/Invoke for mouse simulation

Linux (xdotool + scrot):

scrot with fallback to gnome-screenshot
xdotool for mouse, keyboard, and window management

What's different from existing PRs

Issue	PR #431	PR #572	PR #606	This PR
Permission fails closed	—	—	❌ (`return true` on error)	✅
No shell injection	❌	❌	❌	✅ `execFile` only
Correct screenshot format	❌	❌ raw pixels	✅	✅
macOS listWindows	—	❌ newline bug	✅	✅ JXA JSON
Windows bounds/coords	—	❌ PID as x	—	✅
express in right deps	—	❌ devDeps	✅	n/a (no express)

Integration

Registered in config-generator.ts alongside the other MCP tools, with
PERMISSION_API_PORT forwarded in environment
Added to SKILLS_FULLY_BUNDLED and bundles in bundle-skills.cjs
"Desktop Automation" capability added to agent system prompt
Tests: requiredMcpDistEntries updated; new assertions verify the server is
registered with correct permission port in both default and custom scenarios

Testing

-pnpm typecheck ✅
-pnpm lint:eslint ✅ (exit 0)
-vitest run [config-generator.test.ts]- ✅ 57 tests, 55 pass, 2 pre-existing Windows path-separator failures unrelated to this PR

Summary by CodeRabbit

New Features
- Desktop automation tools (screenshots, mouse/keyboard, window list/focus), a local desktop-control MCP server now included in the desktop bundle, and integrated desktop permission endpoints with UI flow.
Documentation
- Added user-facing desktop-control documentation covering platforms, permissions, command formats, and workflows.
Bug Fixes
- Improved CLI invocation and executable path handling on Windows; suppresses console windows and avoids quoting issues.
Tests
- Added tests for Windows CLI/path handling and for the new server/config behavior.

orcaman · 2026-02-22T18:51:53Z

✅ Snyk checks have passed. No issues have been found so far.

Status	Scanner	Critical	High	Medium	Low	Total (0)
✅	Open Source Security	0	0	0	0	0 issues
✅	Licenses	0	0	0	0	0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

coderabbitai · 2026-02-22T18:51:58Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a new local MCP tool "desktop-control" for cross-platform desktop automation, includes it in bundling and generated MCP servers, replaces shell exec calls with execFileSync for CLI/version resolution, adjusts Windows PTY spawn logic and tests, and extends the desktop permission API with a /desktop-permission endpoint.

Changes

Cohort / File(s)	Summary
Desktop-Control MCP `packages/agent-core/mcp-tools/desktop-control/SKILL.md`, `packages/agent-core/mcp-tools/desktop-control/package.json`, `packages/agent-core/mcp-tools/desktop-control/tsconfig.json`, `packages/agent-core/mcp-tools/desktop-control/src/index.ts`	New MCP tool "desktop-control": cross-platform handlers (screenshot, mouse, keyboard, window list/focus), per-tool schemas, permission checks, and Stdio MCP server implementation.
Bundling & Config `apps/desktop/scripts/bundle-skills.cjs`, `packages/agent-core/src/opencode/config-generator.ts`, `packages/agent-core/tests/unit/opencode/config-generator.test.ts`	Include `desktop-control` in fully bundled skills and generated MCP servers; add PERMISSION_API_PORT env and 60s timeout; update tests to expect new server and port behavior.
Permission API (Electron) `apps/desktop/src/main/permission-api.ts`, `apps/desktop/src/main/opencode/electron-options.ts`	Add `/desktop-permission` POST endpoint, synthesize UI-compatible permission requests for desktop ops, validate desktop-specific payloads; change CLI invocation to use execFileSync with windowsHide.
CLI/version resolution `packages/agent-core/src/opencode/cli-resolver.ts`, `apps/desktop/src/main/opencode/electron-options.ts`, `packages/agent-core/tests/unit/opencode/cli-resolver.test.ts`	Replace shell-based execSync with execFileSync and explicit `["--version"]` arg, add windowsHide/timeout; add tests ensuring non-existent paths with spaces return null safely.
Windows spawn / OpenCode adapter & tests `packages/agent-core/src/internal/classes/OpenCodeAdapter.ts`, `packages/agent-core/tests/unit/opencode/adapter.test.ts`	Clarify Windows branch to directly spawn `.exe` with `{file, args}`, add Windows-focused regression tests covering spaces, Unicode, and shell-sensitive characters (duplicate test blocks present).

Sequence Diagram

sequenceDiagram
    participant Agent as Agent/Client
    participant MCP as Desktop-Control MCP
    participant PermAPI as Permission API
    participant OS as OS Platform

    Agent->>MCP: CallToolRequest (desktop action)
    MCP->>PermAPI: Request desktop permission (operation, metadata)
    alt permission granted
        PermAPI-->>MCP: Granted
        MCP->>OS: Execute platform-specific command (AppleScript / PowerShell / xdotool)
        OS-->>MCP: Result (image/base64, status, window list)
        MCP-->>Agent: Tool result
    else denied or API error
        PermAPI-->>MCP: Denied/Error
        MCP-->>Agent: Error response
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat: Unify OpenCode runtime setup across dev and packaged builds #588: Overlaps OpenCode runtime resolution and bundling changes affecting electron-options, cli-resolver, and bundle-skills.
fix(agent-core): use direct Windows spawn for .exe #548: Related Windows PTY spawn/quoting logic changes in OpenCodeAdapter and resolver.

Suggested reviewers

mcmatan
galElmalah
DanielScharfsteinGit

Poem

🐇 I tiptoe across the glowing screen,
I ask for leave to click and glean,
I paste and move and capture light,
I wait for nods, then do it right,
A cheeky rabbit, automating bright.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 35.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: adding a native desktop automation MCP tool with cross-platform support, which aligns with the comprehensive changes introducing desktop-control capabilities.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR introduces a new desktop-control MCP tool that provides native desktop automation capabilities across macOS, Windows, and Linux, while also fixing Issue #596 related to Windows path handling for paths containing spaces.

Changes:

Adds desktop-control MCP tool with 7 desktop automation capabilities (screenshot, mouse control, keyboard input, window management)
Fixes Issue #596 by replacing execSync with execFileSync in cli-resolver.ts and electron-options.ts
Updates OpenCodeAdapter to route Windows commands through cmd.exe /s /c with proper quoting
Adds comprehensive unit tests for Windows path handling with spaces

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
packages/agent-core/mcp-tools/desktop-control/src/index.ts	New MCP server implementing 7 desktop automation tools with security-first design
packages/agent-core/mcp-tools/desktop-control/package.json	Package configuration for desktop-control tool
packages/agent-core/mcp-tools/desktop-control/tsconfig.json	TypeScript configuration matching other MCP tools
packages/agent-core/mcp-tools/desktop-control/SKILL.md	Comprehensive documentation for AI agent usage
packages/agent-core/src/opencode/config-generator.ts	Registers desktop-control MCP server and adds capability to system prompt
packages/agent-core/src/opencode/cli-resolver.ts	Replaces execSync with execFileSync to fix Issue #596
packages/agent-core/src/internal/classes/OpenCodeAdapter.ts	Routes Windows .exe through cmd.exe with proper quoting for Issue #596
apps/desktop/src/main/opencode/electron-options.ts	Applies same execFileSync fix for consistency
apps/desktop/scripts/bundle-skills.cjs	Adds desktop-control to bundling configuration
packages/agent-core/tests/unit/opencode/config-generator.test.ts	Adds tests verifying desktop-control registration
packages/agent-core/tests/unit/opencode/cli-resolver.test.ts	Adds regression test for Issue #596
packages/agent-core/tests/unit/opencode/adapter.test.ts	Comprehensive tests for Windows path escaping

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-22T18:58:53Z

packages/agent-core/mcp-tools/desktop-control/src/index.ts

+const transport = new StdioServerTransport();
+await server.connect(transport);


The server startup code should follow the established pattern used by all other MCP tools in this codebase. Wrap the transport creation and server.connect() call in a main() async function with .catch() error handling. This ensures consistent error handling across all MCP tools and provides better error messages when the server fails to start. All other MCP tools (file-permission, ask-user-question, complete-task, start-task, report-checkpoint, report-thought, dev-browser-mcp) follow this pattern.

Suggested change

const transport = new StdioServerTransport();

await server.connect(transport);

async function main(): Promise<void> {

const transport = new StdioServerTransport();

await server.connect(transport);

}

main().catch((error) => {

console.error('Failed to start desktop-control MCP server:', error);

process.exit(1);

});

Copilot · 2026-02-22T18:58:53Z

packages/agent-core/mcp-tools/desktop-control/src/index.ts

+  [DllImport("user32.dll")] public static extern bool SetForegroundWindow(IntPtr h);
+}
+'@
+$proc = Get-Process | Where-Object {$_.MainWindowTitle -like "*$Title*"} | Select-Object -First 1


The Windows focusWindow implementation uses PowerShell's -like operator with wildcards ($Title), which will treat special characters in the window name as wildcards (e.g., *, ?, [, ]). Consider using -eq for exact matching or escaping the wildcard characters in $Title to avoid unexpected matching behavior. For example, a window name containing brackets like "Test[1]" might not match correctly.

Suggested change

$proc = Get-Process | Where-Object {$_.MainWindowTitle -like "*$Title*"} | Select-Object -First 1

$escapedTitle = [System.Management.Automation.WildcardPattern]::Escape($Title)

$proc = Get-Process | Where-Object {$_.MainWindowTitle -like "*$escapedTitle*"} | Select-Object -First 1

Copilot · 2026-02-22T18:58:53Z

packages/agent-core/mcp-tools/desktop-control/SKILL.md

+## Tools
+
+### `desktop_screenshot`
+Capture the entire primary display. Returns a base64-encoded PNG image. No permission required.


The desktop_screenshot tool captures the full contents of the primary display and is explicitly documented as requiring no permission, even though this can expose highly sensitive on‑screen information (passwords, 2FA codes, private documents, etc.). In a prompt‑injection or compromised orchestrator scenario, an attacker could invoke this tool repeatedly to exfiltrate whatever is visible on the user’s screen without any explicit, per‑action user approval. Consider aligning screenshots with the permission model (e.g., using the same permission API or a dedicated "allow screen capture" prompt) so users must explicitly approve each capture before the image is returned.

Suggested change

Capture the entire primary display. Returns a base64-encoded PNG image. No permission required.

Capture the entire primary display. Returns a base64-encoded PNG image. Permission required; each capture must be explicitly approved by the user (via the same permission model used for other desktop tools).

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

packages/agent-core/mcp-tools/desktop-control/package.json (1)

12-16: Consider moving @modelcontextprotocol/sdk to dependencies.

The @modelcontextprotocol/sdk package is likely used at runtime (imported in src/index.ts to create the MCP server). Placing it in devDependencies means it won't be installed in production builds or when this package is used as a dependency.

Since this tool is bundled (per bundle-skills.cjs), the bundler may resolve it regardless. However, for clarity and correctness, runtime dependencies should be in dependencies.
Proposed fix
-  "devDependencies": {
-    "@modelcontextprotocol/sdk": "^1.0.0",
+  "dependencies": {
+    "@modelcontextprotocol/sdk": "^1.0.0"
+  },
+  "devDependencies": {
     "tsx": "^4.21.0",
     "typescript": "^5.0.0"
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/package.json` around lines 12 -
16, The package "@modelcontextprotocol/sdk" is declared in devDependencies but
is required at runtime (imported in src/index.ts to create the MCP server) so
move it into "dependencies" in package.json; update the package.json
devDependencies/dev block to remove "@modelcontextprotocol/sdk" and add it under
"dependencies" (keeping the same version "^1.0.0") so production installs and
consumers get the runtime dependency, and ensure any bundler config
(bundle-skills.cjs) still works with the dependency present.

packages/agent-core/mcp-tools/desktop-control/src/index.ts (2)

355-360: Replace the nested ternary with explicit branching for key mapping.

♻️ Suggested refactor

-  const k =
-    mainKey in WIN_KEYS
-      ? WIN_KEYS[mainKey]
-      : mainKey.length === 1
-        ? mainKey
-        : `{${mainKey.toUpperCase()}}`;
+  let k: string;
+  if (mainKey in WIN_KEYS) {
+    k = WIN_KEYS[mainKey];
+  } else if (mainKey.length === 1) {
+    k = mainKey;
+  } else {
+    k = `{${mainKey.toUpperCase()}}`;
+  }

As per coding guidelines: Avoid nested ternaries — Use mapper objects or if/else for readability.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 355
- 360, The expression that computes k uses a nested ternary which reduces
readability; replace it with explicit branching using either an if/else block or
a small helper mapper so the logic is clear: first check WIN_KEYS for mainKey
(WIN_KEYS[mainKey]), then if not present check whether mainKey.length === 1 and
return mainKey, otherwise return the fallback `"{"+mainKey.toUpperCase()+"}"`;
update the code that assigns k (the constant computed from mainKey and WIN_KEYS)
and any caller that relies on that value to use the clearer branching
implementation.

137-142: Replace the nested ternary with a mapping table for readability.

♻️ Suggested refactor

-    const cmd =
-      button === 'right'
-        ? 'right click'
-        : button === 'double'
-          ? 'double click'
-          : 'click';
+    const CLICK_CMD: Record<'left' | 'right' | 'double', string> = {
+      left: 'click',
+      right: 'right click',
+      double: 'double click',
+    };
+    const cmd = CLICK_CMD[button];

As per coding guidelines: Avoid nested ternaries — Use mapper objects or if/else for readability.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 137
- 142, The nested ternary that sets cmd based on button should be replaced with
a simple mapping object or switch to improve readability: create a const mapping
(e.g., const clickMap = { right: 'right click', double: 'double click', default:
'click' } or similar) and use button to look up the value for cmd, falling back
to 'click' when not present; update the code that assigns cmd (the variable
named cmd that currently uses the nested ternary) to use this mapper instead.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 431-447: The Linux branch currently sets pid to the X11 window ID
(wids) instead of the process ID; update the mapping in the wids -> results
Promise.all block to call execFile('xdotool', ['getwindowpid', wid]) to retrieve
the real PID, parse it to a number, and set pid to undefined when
retrieval/parsing fails (keep the existing try/catch so windows without
_NET_WM_PID are handled). Locate the async map over wids (the block that
currently calls execFile('xdotool', ['getwindowname', wid]) and returns { name:
..., pid: Number(wid) }) and replace the pid logic to use getwindowpid and
conditional undefined on failure, preserving WindowInfo typing and filtering out
nulls as before.
- Around line 459-472: The PowerShell script written via tmpFile('focus', 'ps1')
and writeFile can return success even when no window matches because Write-Error
is non-terminating; update the here-doc body (the script string passed to
writeFile) so the else branch both writes the error and then calls exit 1 (e.g.,
after Write-Error "No window found matching: $Title" add exit 1) so the process
returns a non-zero exit code and execFile will surface the failure; locate the
ps1 script content where WinFocus and SetForegroundWindow are defined and modify
only the else branch to propagate the error.

---

Nitpick comments:
In `@packages/agent-core/mcp-tools/desktop-control/package.json`:
- Around line 12-16: The package "@modelcontextprotocol/sdk" is declared in
devDependencies but is required at runtime (imported in src/index.ts to create
the MCP server) so move it into "dependencies" in package.json; update the
package.json devDependencies/dev block to remove "@modelcontextprotocol/sdk" and
add it under "dependencies" (keeping the same version "^1.0.0") so production
installs and consumers get the runtime dependency, and ensure any bundler config
(bundle-skills.cjs) still works with the dependency present.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 355-360: The expression that computes k uses a nested ternary
which reduces readability; replace it with explicit branching using either an
if/else block or a small helper mapper so the logic is clear: first check
WIN_KEYS for mainKey (WIN_KEYS[mainKey]), then if not present check whether
mainKey.length === 1 and return mainKey, otherwise return the fallback
`"{"+mainKey.toUpperCase()+"}"`; update the code that assigns k (the constant
computed from mainKey and WIN_KEYS) and any caller that relies on that value to
use the clearer branching implementation.
- Around line 137-142: The nested ternary that sets cmd based on button should
be replaced with a simple mapping object or switch to improve readability:
create a const mapping (e.g., const clickMap = { right: 'right click', double:
'double click', default: 'click' } or similar) and use button to look up the
value for cmd, falling back to 'click' when not present; update the code that
assigns cmd (the variable named cmd that currently uses the nested ternary) to
use this mapper instead.

packages/agent-core/mcp-tools/desktop-control/src/index.ts

galElmalah · 2026-02-24T09:24:26Z

@RajdeepKushwaha5 can you attach a short video demonstrating these new capabilities?

…e paths with spaces On Windows, pty.spawn() with a raw .exe path works in ConPTY mode (Windows 10 1809+) but fails in WinPTY fallback mode when the installation path contains spaces. The intermediate cmd.exe session receives an unquoted path and splits it at every space, producing the error: '"C:\Users\Anish Maheshwari\...\opencode.exe"' is not recognized Fix buildPtySpawnArgs() on win32 to route through cmd.exe /s /c using the existing buildShellCommand() helper which individually quotes each token that contains spaces (Windows cmd.exe doubling convention). Also replace the two execSync calls in cli-resolver.ts and electron-options.ts with execFileSync(path, ['--version']) to eliminate cmd.exe shell involvement from version detection entirely. Tests: added 6 regression tests in adapter.test.ts (Issue accomplish-ai#596 describe block) and 1 test in cli-resolver.test.ts verifying graceful handling of paths with spaces. Fixes: accomplish-ai#596

…sh-ai#189) Add a new 'desktop-control' MCP tool that gives the agent native desktop automation capabilities beyond CLI and browser, operating on all three supported platforms without requiring any compiled native Node.js modules. ## Tools exposed - desktop_screenshot - capture primary display as PNG (no permission needed) - desktop_mouse_click - left/right/double click at screen coordinates - desktop_mouse_move - move cursor to screen coordinates - desktop_type_text - type arbitrary Unicode text at caret position - desktop_key_press - press key combinations (ctrl+c, Return, F5, ...) - desktop_list_windows - list visible application windows as JSON - desktop_focus_window - bring an application to the foreground ## Security - Permission fails CLOSED: catch {} returns false, never returns true - execFile() used for every subprocess - no shell interpolation, no injection - type_text uses clipboard-paste technique (pbcopy on macOS, PS clipboard on Windows, xdotool type on Linux) so user text is never embedded in a command string - AppleScript strings with user input are escaped with escapeAppleScriptString() - key_press validates input with regex before building platform commands ## Cross-platform implementation macOS: screencapture (screenshot), osascript/System Events (mouse+keyboard+ window management). mouse_move requires cliclick (brew install cliclick). Windows: parameterised PS1 scripts with P/Invoke for mouse/keyboard, System.Windows.Forms for screenshot, clipboard and SendKeys. Linux: scrot/gnome-screenshot (screenshot), xdotool for everything else. ## Integration - Registered in config-generator.ts with PERMISSION_API_PORT env var - Added to bundle-skills.cjs SKILLS_FULLY_BUNDLED list and bundles array - Desktop Automation capability added to agent system prompt - Unit tests updated: stub dist entry added to requiredMcpDistEntries; asserts server is registered with correct PERMISSION_API_PORT Closes accomplish-ai#189

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

packages/agent-core/mcp-tools/desktop-control/src/index.ts (2)

459-472: Windows focusWindow reports success when no window matches.

Write-Error is a non-terminating PowerShell error — it does not set a non-zero exit code. PowerShell exits 0 regardless, so execFile resolves successfully and the tool returns "Focused window: <name>" even when nothing was focused. Add exit 1 to propagate the failure.

🐛 Proposed fix

-else { Write-Error "No window found matching: $Title" }
+else { Write-Error "No window found matching: $Title"; exit 1 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 459
- 472, The PowerShell script written via tmpFile('focus','ps1') and writeFile
always exits 0 because Write-Error is non-terminating; change the script so that
when no process matches the Title it writes the error and then calls exit 1 (or
throws) so the spawned execFile call returns a non-zero exit code and the
focusWindow flow does not report success erroneously. Locate the script
construction in index.ts (the ps1 temp file created for focus) and append an
explicit exit 1 in the else branch (or replace Write-Error with Write-Error;
exit 1) so failures propagate to the caller.

437-447: Linux pid field is set to the X11 window ID, not the process ID.

xdotool search returns window IDs (arbitrary X11 identifiers). Storing them in the pid field of WindowInfo is semantically wrong and will confuse callers who attempt to signal or inspect the process. Use xdotool getwindowpid <wid> to retrieve the real PID; leave pid undefined when retrieval fails (e.g., for windows without _NET_WM_PID).

🐛 Proposed fix

-      wids.map(async (wid) => {
-        try {
-          const { stdout: name } = await execFile('xdotool', ['getwindowname', wid]);
-          return { name: name.trim(), pid: Number(wid) };
-        } catch {
-          return null;
-        }
-      }),
+      wids.map(async (wid) => {
+        try {
+          const { stdout: name } = await execFile('xdotool', ['getwindowname', wid]);
+          let pid: number | undefined;
+          try {
+            const { stdout: pidOut } = await execFile('xdotool', ['getwindowpid', wid]);
+            const parsed = Number(pidOut.trim());
+            if (Number.isFinite(parsed)) { pid = parsed; }
+          } catch {
+            // window does not set _NET_WM_PID — leave pid undefined
+          }
+          return { name: name.trim(), pid };
+        } catch {
+          return null;
+        }
+      }),

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 437
- 447, The code is currently treating X11 window IDs as process IDs by returning
pid: Number(wid) inside the wids mapping; instead call xdotool getwindowpid for
each wid to retrieve the real process ID and parse it to a number, and if that
command fails or returns non-numeric data leave pid undefined. Update the async
mapping in the Promise.all block (the lambda that calls execFile('xdotool',
['getwindowname', wid])) to run a second execFile('xdotool', ['getwindowpid',
wid]) after successfully getting the name, convert the stdout to Number when
valid, and return { name: name.trim(), pid: pidNumberOrUndefined } while still
returning null on failures so results.filter((r): r is WindowInfo => r !== null)
remains correct.

🧹 Nitpick comments (1)

packages/agent-core/tests/unit/opencode/adapter.test.ts (1)

167-272: escapeShellArgWin32Issue596 is a byte-for-byte duplicate of escapeShellArgWin32 from the Issue #354 block.

Both helper functions are identical, and buildShellCommandIssue596 is functionally the same as buildShellCommand. If the production escaping strategy changes, both sets of helpers need to be updated independently — silent drift risk. Consider hoisting a single shared helper to the outer describe('Shell escaping utilities') scope and referencing it from both inner blocks.

♻️ Suggested extraction

+  // Shared helper for both Issue `#354` and Issue `#596` tests
+  function escapeShellArgForCmd(arg: string): string {
+    if (arg.includes(' ') || arg.includes('"')) {
+      return `"${arg.replace(/"/g, '""')}"`;
+    }
+    return arg;
+  }

   describe('Windows cmd.exe /s /c quoting (Issue `#354`)', () => {
-    function escapeShellArgWin32(arg: string): string {
-      if (arg.includes(' ') || arg.includes('"')) {
-        return `"${arg.replace(/"/g, '""')}"`;
-      }
-      return arg;
-    }
-    function buildShellCommand(command: string, args: string[]): string {
-      const escapedCommand = escapeShellArgWin32(command);
-      const escapedArgs = args.map((arg) => escapeShellArgWin32(arg));
-      return [escapedCommand, ...escapedArgs].join(' ');
-    }
+    function buildShellCommand(command: string, args: string[]): string {
+      return [command, ...args].map(escapeShellArgForCmd).join(' ');
+    }
     ...
   });

   describe('Windows spawn via cmd.exe /s /c – Issue `#596` (paths with spaces)', () => {
-    function escapeShellArgWin32Issue596(arg: string): string {
-      if (arg.includes(' ') || arg.includes('"')) {
-        return `"${arg.replace(/"/g, '""')}"`;
-      }
-      return arg;
-    }
-    function buildShellCommandIssue596(command: string, args: string[]): string {
-      return [command, ...args].map(escapeShellArgWin32Issue596).join(' ');
-    }
+    function buildShellCommandIssue596(command: string, args: string[]): string {
+      return [command, ...args].map(escapeShellArgForCmd).join(' ');
+    }
     ...
   });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/tests/unit/opencode/adapter.test.ts` around lines 167 -
272, The test duplicates the escape logic (escapeShellArgWin32Issue596 /
buildShellCommandIssue596) already defined as escapeShellArgWin32 /
buildShellCommand; hoist a single shared helper into the outer describe (e.g. a
new "Shell escaping utilities" scope) and have both inner test blocks call that
shared escapeShellArgWin32 and buildShellCommand to eliminate byte-for-byte
duplication and prevent drift — update references in the Windows Issue `#596`
tests to use escapeShellArgWin32 and buildShellCommand (remove the duplicate
escapeShellArgWin32Issue596 / buildShellCommandIssue596 functions) so both test
suites rely on the same helpers.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 74-125: captureScreenshot currently takes full-screen images
without any permission check; add a session-level consent gate and audit
logging: when captureScreenshot is invoked, call the existing checkPermission
(or a new checkSessionConsent) to verify a session-scoped flag (e.g.,
allowDesktopScreenshots) and, if not set, prompt once for session consent before
proceeding; if consent is denied, throw or return a denied CallToolResult and do
not take the screenshot. After a successful capture, emit a structured audit
entry (e.g., via processLogger or the module's telemetry API) recording
timestamp, requesting agent id/session id, and that a screenshot occurred.
Ensure the consent flag is stored for the session scope so subsequent
captureScreenshot calls in the same session skip prompting, and keep the
existing tmp file cleanup behavior (unlink) unchanged.
- Around line 36-38: The tmpFile function currently builds filenames using
Date.now(), which can collide under concurrent async calls; update tmpFile to
include cryptographic randomness instead of (or in addition to) Date.now()—for
example use crypto.randomBytes(...) or crypto.randomUUID() to produce a unique
suffix and incorporate it into the generated filename while still calling
join(tmpdir(), ...); locate the tmpFile function and replace the Date.now()
usage with a crypto-based identifier (ensure you import Node's crypto),
preserving the prefix/ext parameters and filename format.

---

Duplicate comments:
In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 459-472: The PowerShell script written via tmpFile('focus','ps1')
and writeFile always exits 0 because Write-Error is non-terminating; change the
script so that when no process matches the Title it writes the error and then
calls exit 1 (or throws) so the spawned execFile call returns a non-zero exit
code and the focusWindow flow does not report success erroneously. Locate the
script construction in index.ts (the ps1 temp file created for focus) and append
an explicit exit 1 in the else branch (or replace Write-Error with Write-Error;
exit 1) so failures propagate to the caller.
- Around line 437-447: The code is currently treating X11 window IDs as process
IDs by returning pid: Number(wid) inside the wids mapping; instead call xdotool
getwindowpid for each wid to retrieve the real process ID and parse it to a
number, and if that command fails or returns non-numeric data leave pid
undefined. Update the async mapping in the Promise.all block (the lambda that
calls execFile('xdotool', ['getwindowname', wid])) to run a second
execFile('xdotool', ['getwindowpid', wid]) after successfully getting the name,
convert the stdout to Number when valid, and return { name: name.trim(), pid:
pidNumberOrUndefined } while still returning null on failures so
results.filter((r): r is WindowInfo => r !== null) remains correct.

---

Nitpick comments:
In `@packages/agent-core/tests/unit/opencode/adapter.test.ts`:
- Around line 167-272: The test duplicates the escape logic
(escapeShellArgWin32Issue596 / buildShellCommandIssue596) already defined as
escapeShellArgWin32 / buildShellCommand; hoist a single shared helper into the
outer describe (e.g. a new "Shell escaping utilities" scope) and have both inner
test blocks call that shared escapeShellArgWin32 and buildShellCommand to
eliminate byte-for-byte duplication and prevent drift — update references in the
Windows Issue `#596` tests to use escapeShellArgWin32 and buildShellCommand
(remove the duplicate escapeShellArgWin32Issue596 / buildShellCommandIssue596
functions) so both test suites rely on the same helpers.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10c8dc0 and b40dea0.

📒 Files selected for processing (12)

apps/desktop/scripts/bundle-skills.cjs
apps/desktop/src/main/opencode/electron-options.ts
packages/agent-core/mcp-tools/desktop-control/SKILL.md
packages/agent-core/mcp-tools/desktop-control/package.json
packages/agent-core/mcp-tools/desktop-control/src/index.ts
packages/agent-core/mcp-tools/desktop-control/tsconfig.json
packages/agent-core/src/internal/classes/OpenCodeAdapter.ts
packages/agent-core/src/opencode/cli-resolver.ts
packages/agent-core/src/opencode/config-generator.ts
packages/agent-core/tests/unit/opencode/adapter.test.ts
packages/agent-core/tests/unit/opencode/cli-resolver.test.ts
packages/agent-core/tests/unit/opencode/config-generator.test.ts

🚧 Files skipped from review as they are similar to previous changes (4)

packages/agent-core/mcp-tools/desktop-control/tsconfig.json
packages/agent-core/src/internal/classes/OpenCodeAdapter.ts
packages/agent-core/mcp-tools/desktop-control/package.json
packages/agent-core/tests/unit/opencode/config-generator.test.ts

packages/agent-core/mcp-tools/desktop-control/src/index.ts

- Replace cmd.exe /s /c approach in buildPtySpawnArgs() with writing a temp .ps1 file and running powershell.exe -File, fixing crashes when task text contains double-quotes or other shell-special characters - Add /desktop-permission route to permission API server so desktop-control MCP tool can request permissions without triggering file-op validation - Update desktop-control PERMISSION_API_URL to use /desktop-permission - Update adapter tests to reflect new PowerShell spawn behaviour

coderabbitai

Actionable comments posted: 5

♻️ Duplicate comments (4)

packages/agent-core/mcp-tools/desktop-control/src/index.ts (4)

38-40: tmpFile collision under concurrent calls — already flagged.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 38 -
40, The tmpFile function can produce filename collisions under concurrent calls
because it only uses Date.now(), so replace the uniqueness strategy in tmpFile
with a robust unique suffix (e.g., crypto.randomUUID() or crypto.randomBytes
hex, or use fs.mkdtemp/tmp library) to guarantee uniqueness; update the tmpFile
implementation (function name tmpFile) to append that random/UUID token (or use
a safe OS temporary-file API) to the generated filename and ensure the chosen
approach is cross-platform and collision-resistant.

461-474: Windows focusWindow Write-Error leaves exit code 0 on miss — already flagged.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 461
- 474, The current PowerShell script written to ps1 (via tmpFile and writeFile)
uses Write-Error when no window matches, which still returns exit code 0; update
the script so the non-match path causes a non-zero exit (e.g., replace or follow
Write-Error with an explicit exit 1 or throw) so the calling focusWindow logic
can detect failure; locate the ps1 generation where tmpFile('focus','ps1') and
writeFile(ps1, `...`) are used and change the else branch to emit a real failure
(exit 1 or throw) instead of only Write-Error.

76-127: captureScreenshot skips the permission gate — already flagged.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 76 -
127, The captureScreenshot function currently performs screen capture
unconditionally; add a permission check at the start of captureScreenshot (e.g.,
call the existing permission gate helper such as ensurePermission or
checkAgentPermission, or verify the relevant permission flag) and abort/throw a
clear permission error or return a CallToolResult error object if the permission
is not granted before creating any temp files or invoking platform commands;
ensure you still clean up any temp artifacts (png, ps1) in error paths and keep
the function signature and returned CallToolResult shape unchanged.

433-449: Linux listWindows assigns window IDs as PIDs — already flagged.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 433
- 449, The code in listWindows is treating the xdotool window ID (wid) as the
process id by doing Number(wid); instead call xdotool getwindowpid to fetch the
real PID for each wid. In the wids.map async block (the execFile call that runs
['getwindowname', wid] and returns { name: ..., pid: Number(wid) }), add a
second execFile call to run ['getwindowpid', wid], parse that stdout into a
Number and use it as pid (and optionally include the original wid as id if
WindowInfo supports it); update the returned shape to return the correct pid and
handle errors as before.

🧹 Nitpick comments (2)

packages/agent-core/tests/unit/opencode/adapter.test.ts (2)
184-196: Local replicas of production logic — tests won't catch drift.

buildPs1Script and buildPtySpawnArgsWin32 duplicate the adapter's private logic. If the production buildPtySpawnArgs changes (e.g., switches quoting strategy, adds flags), these helpers won't fail. Consider extracting the PS1 script-building into a small, exported pure function (e.g., buildPs1ScriptContent(command, args)) and importing it in the test. This way the tests exercise the real code.

That said, this is consistent with the existing pattern in the file (lines 86–101 replicate escapeShellArg and buildShellCommand), and the file header acknowledges the limitation of not mocking the PTY layer.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/tests/unit/opencode/adapter.test.ts` around lines 184 -
196, Tests duplicate the adapter's private PS1-building logic (buildPs1Script
and buildPtySpawnArgsWin32), which can drift from production; extract the PS1
script construction into a small exported pure function (e.g.,
buildPs1ScriptContent(command, args)) in the adapter module and have the test
import and call that exported function instead of the local buildPs1Script, then
update buildPtySpawnArgsWin32 in the test to use the exported
buildPs1ScriptContent for scriptContent so the tests exercise the real
production logic.
82-101: The getShellArgsWin32 helper and its tests now exercise a dead code path on Windows.

With the adapter's buildPtySpawnArgs now routing all win32 invocations through the PS1 wrapper, the cmd.exe /s /c wrapping tested by getShellArgsWin32 (line 99) and the tests in lines 103-164 no longer correspond to any reachable Windows code path. The escapeShellArgWin32 branch (line 86-91) also tests the win32 branch of escapeShellArg, which is now only reachable on non-Windows platforms (via buildShellCommand → used in the Unix shell path).

Consider either removing or annotating these tests to clarify they cover legacy/unreachable logic, so future maintainers don't mistakenly rely on them as validation of the current Windows behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/tests/unit/opencode/adapter.test.ts` around lines 82 -
101, The tests exercise unreachable Windows code paths: update the test file to
either remove or clearly mark the legacy helpers getShellArgsWin32 and
escapeShellArgWin32 (and the buildShellCommand-based tests) as covering
deprecated/unreachable logic now that buildPtySpawnArgs routes Win32 via the PS1
wrapper; specifically either delete the getShellArgsWin32/escapeShellArgWin32
helpers and associated specs or add a clear comment/skip/describe block noting
they validate legacy behavior only so maintainers won't assume they reflect
current Windows behavior driven by buildPtySpawnArgs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/desktop/src/main/permission-api.ts`:
- Around line 137-143: The synthesized uiData has a dead branch checking
rawData.details; because desktop-control spreads details into the body, details
is always undefined, so replace the ternary for isDesktopPermission to set
operation: String(rawData.operation) and filePath:
String(JSON.stringify(rawData)) (dropping the rawData.details check) and remove
the unnecessary "as unknown as FilePermissionRequestData" cast so the object
literal directly satisfies FilePermissionRequestData; update the uiData
assignment (and keep rawData: Record<string, unknown>) accordingly.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 265-285: The tmp file `txt` can leak if `writeFile(ps1)` throws
because the `finally` cleanup only runs after the `try` that starts after both
writes; update the flow so both writes are performed inside the same try/finally
that unlinks `txt` and `ps1` (i.e., move the first writeFile(txt, ...) into the
try block that currently surrounds execFile or wrap both writeFile calls and
execFile in a single try), ensuring the `finally` block in index.ts unlinks both
`txt` and `ps1` even if either writeFile throws; reference the tmp variables
`txt` and `ps1`, the writeFile calls, the execFile('powershell', ...) call, and
the unlink calls to locate and fix the code.
- Around line 315-322: WIN_KEYS is missing a mapping for "forwarddelete",
causing buildSendKeysString to fall back to an invalid {FORWARDDELETE} token on
Windows; add an entry "forwarddelete": '{DELETE}' to the WIN_KEYS Record so the
forward-delete key (mapped in APPLE_KEY_CODES as keycode 117) is translated to
the correct SendKeys token and will work on Windows.

In `@packages/agent-core/src/internal/classes/OpenCodeAdapter.ts`:
- Line 840: The temp PowerShell script filename uses Date.now() which can
collide; change the construction of scriptPath in OpenCodeAdapter (the const
scriptPath = path.join(os.tmpdir(), `accomplish-run-${Date.now()}.ps1`)) to use
crypto.randomUUID() for uniqueness (e.g.
`accomplish-run-${crypto.randomUUID()}.ps1`) so concurrent/rapid spawns don't
overwrite each other's files before writeFileSync/execute; ensure
crypto.randomUUID is used where scriptPath is created and no other code paths
still rely on Date.now() here.
- Around line 838-841: The temp PowerShell script file created in
startTask/spawnSessionResumption (scriptPath) is never removed causing a leak;
capture scriptPath on the instance (e.g., this._tempPsScriptPath) when writing
the file and ensure it is unlinked in the process exit/cleanup path by deleting
it from within the onExit/handleProcessExit or dispose routines
(handleProcessExit/dispose) — ensure deletion is attempted regardless of exit
reason and guard with try/catch so removal failures don't throw.

---

Duplicate comments:
In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 38-40: The tmpFile function can produce filename collisions under
concurrent calls because it only uses Date.now(), so replace the uniqueness
strategy in tmpFile with a robust unique suffix (e.g., crypto.randomUUID() or
crypto.randomBytes hex, or use fs.mkdtemp/tmp library) to guarantee uniqueness;
update the tmpFile implementation (function name tmpFile) to append that
random/UUID token (or use a safe OS temporary-file API) to the generated
filename and ensure the chosen approach is cross-platform and
collision-resistant.
- Around line 461-474: The current PowerShell script written to ps1 (via tmpFile
and writeFile) uses Write-Error when no window matches, which still returns exit
code 0; update the script so the non-match path causes a non-zero exit (e.g.,
replace or follow Write-Error with an explicit exit 1 or throw) so the calling
focusWindow logic can detect failure; locate the ps1 generation where
tmpFile('focus','ps1') and writeFile(ps1, `...`) are used and change the else
branch to emit a real failure (exit 1 or throw) instead of only Write-Error.
- Around line 76-127: The captureScreenshot function currently performs screen
capture unconditionally; add a permission check at the start of
captureScreenshot (e.g., call the existing permission gate helper such as
ensurePermission or checkAgentPermission, or verify the relevant permission
flag) and abort/throw a clear permission error or return a CallToolResult error
object if the permission is not granted before creating any temp files or
invoking platform commands; ensure you still clean up any temp artifacts (png,
ps1) in error paths and keep the function signature and returned CallToolResult
shape unchanged.
- Around line 433-449: The code in listWindows is treating the xdotool window ID
(wid) as the process id by doing Number(wid); instead call xdotool getwindowpid
to fetch the real PID for each wid. In the wids.map async block (the execFile
call that runs ['getwindowname', wid] and returns { name: ..., pid: Number(wid)
}), add a second execFile call to run ['getwindowpid', wid], parse that stdout
into a Number and use it as pid (and optionally include the original wid as id
if WindowInfo supports it); update the returned shape to return the correct pid
and handle errors as before.

---

Nitpick comments:
In `@packages/agent-core/tests/unit/opencode/adapter.test.ts`:
- Around line 184-196: Tests duplicate the adapter's private PS1-building logic
(buildPs1Script and buildPtySpawnArgsWin32), which can drift from production;
extract the PS1 script construction into a small exported pure function (e.g.,
buildPs1ScriptContent(command, args)) in the adapter module and have the test
import and call that exported function instead of the local buildPs1Script, then
update buildPtySpawnArgsWin32 in the test to use the exported
buildPs1ScriptContent for scriptContent so the tests exercise the real
production logic.
- Around line 82-101: The tests exercise unreachable Windows code paths: update
the test file to either remove or clearly mark the legacy helpers
getShellArgsWin32 and escapeShellArgWin32 (and the buildShellCommand-based
tests) as covering deprecated/unreachable logic now that buildPtySpawnArgs
routes Win32 via the PS1 wrapper; specifically either delete the
getShellArgsWin32/escapeShellArgWin32 helpers and associated specs or add a
clear comment/skip/describe block noting they validate legacy behavior only so
maintainers won't assume they reflect current Windows behavior driven by
buildPtySpawnArgs.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b40dea0 and b5db4ed.

📒 Files selected for processing (4)

apps/desktop/src/main/permission-api.ts
packages/agent-core/mcp-tools/desktop-control/src/index.ts
packages/agent-core/src/internal/classes/OpenCodeAdapter.ts
packages/agent-core/tests/unit/opencode/adapter.test.ts

coderabbitai · 2026-02-24T14:20:43Z

apps/desktop/src/main/permission-api.ts

+    const rawData = data as Record<string, unknown>;
+    const uiData: FilePermissionRequestData = isDesktopPermission
+      ? ({
+          operation: String(rawData.operation),
+          filePath: String(rawData.details ?? JSON.stringify(rawData)),
+        } as unknown as FilePermissionRequestData)
+      : (data as FilePermissionRequestData);


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Explore the file structure and locate relevant files find . -type f -name "permission-api.ts" | head -5 find . -type f -name "*desktop-control*" | head -10

Repository: accomplish-ai/accomplish

Length of output: 109

🏁 Script executed:

# Read the full permission-api.ts file to understand context wc -l apps/desktop/src/main/permission-api.ts

Repository: accomplish-ai/accomplish

Length of output: 111

🏁 Script executed:

# Look for FilePermissionRequestData type definition rg "FilePermissionRequestData" --type ts -B 3 -A 10

Repository: accomplish-ai/accomplish

Length of output: 16250

🏁 Script executed:

# Search for how buildFilePermissionRequest is implemented and used rg "buildFilePermissionRequest" --type ts -A 15 -B 2

Repository: accomplish-ai/accomplish

Length of output: 6181

🏁 Script executed:

# Check permission-api.ts imports to see exact type being used head -50 apps/desktop/src/main/permission-api.ts | grep -A 5 "import"

Repository: accomplish-ai/accomplish

Length of output: 267

🏁 Script executed:

# Look for FileOperation type definition rg "type FileOperation" --type ts -A 10

Repository: accomplish-ai/accomplish

Length of output: 956

🏁 Script executed:

# Search for desktop-control to see how it sends data find . -type f -name "*desktop-control*" | head -20

Repository: accomplish-ai/accomplish

Length of output: 50

🏁 Script executed:

# Search for where details field is set in desktop requests rg "details" --type ts | grep -i "desktop\|permission" | head -20

Repository: accomplish-ai/accomplish

Length of output: 1172

🏁 Script executed:

# Check what agent-core index exports as PermissionFileRequestData cat packages/agent-core/src/index.ts | grep -A 2 -B 2 "PermissionFileRequestData"

Repository: accomplish-ai/accomplish

Length of output: 291

🏁 Script executed:

# Look at the FILE_OPERATIONS definition to understand FileOperation values rg "FILE_OPERATIONS" --type ts -B 2 -A 15

Repository: accomplish-ai/accomplish

Length of output: 12953

🏁 Script executed:

# Check the full context around lines 130-150 in permission-api.ts sed -n '120,160p' apps/desktop/src/main/permission-api.ts

Repository: accomplish-ai/accomplish

Length of output: 1673

🏁 Script executed:

# Verify the exact type definition exported from agent-core grep -r "export.*FilePermissionRequestData" packages/agent-core/src/types/permission-handler.ts -A 10

Repository: accomplish-ai/accomplish

Length of output: 333

🏁 Script executed:

# Check if there are any other places that send requests with a 'details' field rg "details:" --type ts | grep -E "permission|file" | head -20

Repository: accomplish-ai/accomplish

Length of output: 50

🏁 Script executed:

# Verify validation is skipped for desktop permissions sed -n '110,125p' apps/desktop/src/main/permission-api.ts

Repository: accomplish-ai/accomplish

Length of output: 662

🏁 Script executed:

# Check what architecture.md says about spawning node/npx cat apps/desktop/src/main/architecture.md 2>/dev/null || echo "File not found, searching for it" find . -name "architecture.md" -type f

Repository: accomplish-ai/accomplish

Length of output: 123

🏁 Script executed:

# Verify that buildFilePermissionRequest doesn't validate the operation further sed -n '1,50p' packages/agent-core/src/services/permission-handler.ts | grep -A 30 "buildFilePermissionRequest"

Repository: accomplish-ai/accomplish

Length of output: 50

🏁 Script executed:

# Check if the PermissionRequest interface expects a FileOperation or string rg "fileOperation" packages/agent-core/src/common/types/permission.ts -A 2 -B 2

Repository: accomplish-ai/accomplish

Length of output: 202

🏁 Script executed:

# Confirm the exact content of buildFilePermissionRequest implementation sed -n '100,130p' packages/agent-core/src/services/permission-handler.ts

Repository: accomplish-ai/accomplish

Length of output: 1064

🏁 Script executed:

# Check if validateFilePermissionRequest is called for desktop permissions sed -n '100,115p' apps/desktop/src/main/permission-api.ts

Repository: accomplish-ai/accomplish

Length of output: 736

🏁 Script executed:

# Get the exact buildFilePermissionRequest implementation to see the full cast sed -n '130,150p' packages/agent-core/src/services/permission-handler.ts

Repository: accomplish-ai/accomplish

Length of output: 763

🏁 Script executed:

# Check if the renderer or downstream code validates fileOperation rg "fileOperation" --type ts -B 2 -A 5 | grep -A 10 "render\|ui\|display" | head -30

Repository: accomplish-ai/accomplish

Length of output: 2907

🏁 Script executed:

# Search for how fileOperation is used in the PermissionDialog or renderer rg "fileOperation" --type tsx -A 5 -B 2

Repository: accomplish-ai/accomplish

Length of output: 95

🏁 Script executed:

# Check if there's any validation or switching on fileOperation values in the renderer rg "FILE_OPERATIONS\|fileOperation" apps/web/src --type ts --type tsx -A 5 | head -50

Repository: accomplish-ai/accomplish

Length of output: 95

🏁 Script executed:

# Verify the exact line where buildFilePermissionRequest does the cast sed -n '55,75p' packages/agent-core/src/services/permission-handler.ts

Repository: accomplish-ai/accomplish

Length of output: 659

🏁 Script executed:

# Find the exact buildFilePermissionRequest implementation rg "buildFilePermissionRequest" packages/agent-core/src/services/permission-handler.ts -A 20

Repository: accomplish-ai/accomplish

Length of output: 637

🏁 Script executed:

# Look for PermissionDialog component to see how it uses fileOperation find . -name "PermissionDialog*" -type f

Repository: accomplish-ai/accomplish

Length of output: 131

🏁 Script executed:

# Check if any downstream code validates or restricts fileOperation values rg "fileOperation" --type ts -C 5 | grep -E "switch|case|if.*fileOperation|FILE_OPERATIONS" -A 3 -B 3

Repository: accomplish-ai/accomplish

Length of output: 50

🏁 Script executed:

# Check the PermissionDialog implementation cat apps/web/src/client/components/execution/PermissionDialog.tsx | head -150

Repository: accomplish-ai/accomplish

Length of output: 5830

🏁 Script executed:

# Search for where fileOperation is actually used/displayed in the UI rg "fileOperation" apps/web/src/client/components/execution/ -B 3 -A 5

Repository: accomplish-ai/accomplish

Length of output: 2239

Dead branch in filePath synthesis — remove unused rawData.details check.

desktop-control spreads details directly into the POST body ({ operation, ...details }), so rawData.details is always undefined. The conditional rawData.details ?? JSON.stringify(rawData) never takes the first branch; filePath always ends up as the full JSON.stringify(rawData), which redundantly includes the operation key. Simplify to:

const uiData: FilePermissionRequestData = isDesktopPermission ? { operation: String(rawData.operation), - filePath: String(rawData.details ?? JSON.stringify(rawData)), - } as unknown as FilePermissionRequestData + filePath: JSON.stringify( + Object.fromEntries( + Object.entries(rawData).filter(([k]) => k !== 'operation'), + ), + ), + } : (data as FilePermissionRequestData);

The as unknown as FilePermissionRequestData cast is unnecessary after this change; the object literal already satisfies the type.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/desktop/src/main/permission-api.ts` around lines 137 - 143, The synthesized uiData has a dead branch checking rawData.details; because desktop-control spreads details into the body, details is always undefined, so replace the ternary for isDesktopPermission to set operation: String(rawData.operation) and filePath: String(JSON.stringify(rawData)) (dropping the rawData.details check) and remove the unnecessary "as unknown as FilePermissionRequestData" cast so the object literal directly satisfies FilePermissionRequestData; update the uiData assignment (and keep rawData: Record<string, unknown>) accordingly.

packages/agent-core/mcp-tools/desktop-control/src/index.ts

packages/agent-core/src/internal/classes/OpenCodeAdapter.ts

desktop-control/src/index.ts: - Use crypto.randomUUID() in tmpFile() instead of Date.now() to prevent filename collisions under concurrent async calls - Add session-level consent gate to captureScreenshot(): prompt once per process lifetime via checkPermission, deny subsequent calls if the session was rejected, emit a structured audit log on each capture - Move both writeFile(txt) and writeFile(ps1) calls inside the try block in typeText() so txt is never leaked if writeFile(ps1) throws - Add forwarddelete: '{DELETE}' to WIN_KEYS so the forward-delete key doesn't silently misfire on Windows (mirroring APPLE_KEY_CODES 117) permission-api.ts: - Fix dead rawData.details branch in /desktop-permission synthesis; since desktop-control spreads all detail fields into the POST body rawData.details is always undefined — instead strip the 'operation' key from rawData and JSON-stringify the remainder for filePath OpenCodeAdapter.ts: - Switch Windows spawn from powershell.exe .ps1 wrapper to direct opencode.exe spawn: pass command+args directly to node-pty so the OS hands each element as a raw argv entry with no shell quoting needed; this also eliminates the temp-file leak and Date.now() collision concerns that the previous PS1 approach introduced - Update adapter unit tests to reflect the new direct-spawn contract

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

packages/agent-core/mcp-tools/desktop-control/src/index.ts (2)

455-465: Linux listWindows sets pid to the window id.
xdotool search yields window IDs, not process IDs. Use xdotool getwindowpid and leave pid undefined when unavailable.

Proposed fix

     const results = await Promise.all(
       wids.map(async (wid) => {
         try {
           const { stdout: name } = await execFile('xdotool', ['getwindowname', wid]);
-          return { name: name.trim(), pid: Number(wid) };
+          let pid: number | undefined;
+          try {
+            const { stdout: pidOut } = await execFile('xdotool', ['getwindowpid', wid]);
+            pid = Number(pidOut.trim());
+          } catch {
+            pid = undefined;
+          }
+          return { name: name.trim(), pid };
         } catch {
           return null;
         }
       }),
     );

xdotool getwindowpid documentation _NET_WM_PID behavior

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 455
- 465, In the listWindows implementation, the code incorrectly assigns the
window id to pid; change the per-window lookup (wids.map handler) to call
"xdotool getwindowpid <wid>" via execFile to obtain the real process id, parse
it to Number and set pid to that value, and if getwindowpid fails or returns
non-numeric output leave pid as undefined; keep the existing getwindowname call
for name, reference the mapped variable wid and the surrounding Promise.all
logic to locate where to update the behavior.

493-495: PowerShell focus script can report success on failure.
Write-Error is non-terminating; add a non-zero exit so execFile can surface the failure.

Proposed fix

-if($proc) { [WinFocus]::SetForegroundWindow($proc.MainWindowHandle) }
-else { Write-Error "No window found matching: $Title" }
+if($proc) { [WinFocus]::SetForegroundWindow($proc.MainWindowHandle) }
+else { Write-Error "No window found matching: $Title"; exit 1 }

PowerShell Write-Error exit code non-terminating error behavior

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 493
- 495, The PowerShell snippet that finds a process ($proc) and calls
[WinFocus]::SetForegroundWindow can still return a zero exit code when no window
is found because Write-Error is non-terminating; update the script emitted by
the code that builds this PowerShell string so that the failure branch uses a
non-zero exit (e.g., replace or follow Write-Error "No window found matching:
$Title" with either "Write-Error ...; exit 1" or "throw 'No window found
matching: $Title'") and optionally add an explicit "exit 0" after
[WinFocus]::SetForegroundWindow to guarantee success; ensure this change is made
where the PowerShell string containing $proc, [WinFocus]::SetForegroundWindow,
and Write-Error is constructed so execFile can detect failures.

🧹 Nitpick comments (4)

packages/agent-core/tests/unit/opencode/adapter.test.ts (1)

179-180: Comment restates what the code already says — remove or replace with a why comment.

// mirrors buildPtySpawnArgs for win32: command → file, args → args describes what the one-liner does, which the return statement itself makes obvious. Per coding guidelines, comments must explain why, not what.

✂️ Proposed fix

-    function buildPtySpawnArgsWin32(command: string, args: string[]) {
-      // mirrors buildPtySpawnArgs for win32: command → file, args → args
-      return { file: command, args };
-    }
+    function buildPtySpawnArgsWin32(command: string, args: string[]) {
+      // stub that documents the expected contract: direct spawn with no shell wrapping
+      return { file: command, args };
+    }

As per coding guidelines: "No unnecessary comments — Don't add comments that restate what the code does. Comments should explain why, not what."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/tests/unit/opencode/adapter.test.ts` around lines 179 -
180, Remove the redundant comment "// mirrors buildPtySpawnArgs for win32:
command → file, args → args" that only restates the return statement and either
delete it or replace it with a brief "why" comment explaining the platform
mapping rationale (e.g., why win32 expects {file, args} vs other platforms) so
readers understand the reason for mirroring buildPtySpawnArgs; locate the return
{ file: command, args } in the test around the buildPtySpawnArgs behavior and
update the comment accordingly.

apps/desktop/src/main/permission-api.ts (1)

73-110: Remove comments that just restate the control flow.
The comments at Line 73-75 and Line 98-109 simply narrate the conditionals; consider removing them to keep comments focused on “why”.

Suggested cleanup

-    // Handle both file permissions (/permission) and desktop permissions (/desktop-permission)
     const isFilePermission = req.method === 'POST' && req.url === '/permission';
     const isDesktopPermission = req.method === 'POST' && req.url === '/desktop-permission';
     if (!isFilePermission && !isDesktopPermission) {
       res.writeHead(404, { 'Content-Type': 'application/json' });
       res.end(JSON.stringify({ error: 'Not found' }));
       return;
     }
...
-    // For file permissions, validate the payload strictly.
-    // For desktop permissions, only require an `operation` field (no filePath needed).
     if (isFilePermission) {
       const validation = permissionHandler.validateFilePermissionRequest(data);
       if (!validation.valid) {
         res.writeHead(400, { 'Content-Type': 'application/json' });
         res.end(JSON.stringify({ error: validation.error }));
         return;
       }
     } else {
-      // desktop-permission: just require an operation string
       const d = data as Record<string, unknown>;
       if (!d.operation || typeof d.operation !== 'string') {
         res.writeHead(400, { 'Content-Type': 'application/json' });
         res.end(JSON.stringify({ error: 'operation is required for desktop permissions' }));
         return;
       }
     }

As per coding guidelines: No unnecessary comments — Don't add comments that restate what the code does. Comments should explain why, not what.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/main/permission-api.ts` around lines 73 - 110, Remove the
redundant “what the code does” comments around the request-type checks and
validation branches: delete the comment that states the control flow near the
isFilePermission / isDesktopPermission declarations and delete the comment that
explains the difference between file and desktop permissions above the
isFilePermission conditional; if you need a comment, replace them with a concise
“why” note (e.g., why desktop-permission omits filePath) or a short reference to
the validation behavior for permissionHandler.validateFilePermissionRequest, but
do not keep comments that merely restate the conditions or the HTTP handling
logic.

packages/agent-core/mcp-tools/desktop-control/src/index.ts (2)

160-165: Replace the nested ternary with a small map.
Keeps the intent readable and aligns with the no-nested-ternary guideline.

Refactor

-    const cmd =
-      button === 'right'
-        ? 'right click'
-        : button === 'double'
-          ? 'double click'
-          : 'click';
+    const cmdMap: Record<'left' | 'right' | 'double', string> = {
+      left: 'click',
+      right: 'right click',
+      double: 'double click',
+    };
+    const cmd = cmdMap[button];

As per coding guidelines: Avoid nested ternaries — Use mapper objects or if/else for readability.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 160
- 165, The nested ternary that builds cmd from button (const cmd = button ===
'right' ? 'right click' : button === 'double' ? 'double click' : 'click') should
be replaced with a small mapping object; create a map (e.g., const CLICK_MAP = {
right: 'right click', double: 'double click' }) and compute cmd =
CLICK_MAP[button] || 'click' so the intent is clearer and avoids nested
ternaries (refer to the cmd variable and the button value in index.ts).

375-383: Avoid nested ternary in buildSendKeysString.
This is a straightforward if/else and reads clearer.

Refactor

-  const k =
-    mainKey in WIN_KEYS
-      ? WIN_KEYS[mainKey]
-      : mainKey.length === 1
-        ? mainKey
-        : `{${mainKey.toUpperCase()}}`;
+  let k: string;
+  if (mainKey in WIN_KEYS) {
+    k = WIN_KEYS[mainKey];
+  } else if (mainKey.length === 1) {
+    k = mainKey;
+  } else {
+    k = `{${mainKey.toUpperCase()}}`;
+  }

As per coding guidelines: Avoid nested ternaries — Use mapper objects or if/else for readability.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts` around lines 375
- 383, The nested ternary inside buildSendKeysString (which computes k using
WIN_KEYS and mainKey) should be replaced with a clear if/else to improve
readability: check if mainKey is in WIN_KEYS and set k = WIN_KEYS[mainKey]; else
if mainKey.length === 1 set k = mainKey; otherwise set k =
`{${mainKey.toUpperCase()}}`. Update the function to use this explicit
conditional flow and keep the existing prefix computation with WIN_MODIFIERS and
parseKeyParts unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 526-529: The tool metadata for the 'desktop_screenshot' tool still
says "No permission required" despite the implementation now requesting
session-level consent; update the description string in the desktop_screenshot
tool definition to state that session consent is required (e.g., "Requires user
session consent to capture a screenshot; returns a base64-encoded PNG of the
primary display") so the description matches the behavior in the consent flow.

In `@packages/agent-core/tests/unit/opencode/adapter.test.ts`:
- Around line 177-229: The test stub buildPtySpawnArgsWin32 lacks the production
.exe validation present in OpenCodeAdapter.ts (the .exe suffix check in
buildPtySpawnArgs), so add that validation to the test helper or use the real
function: either import/require the actual buildPtySpawnArgs from
OpenCodeAdapter (preferred) and call the win32 branch, or re-implement the same
`.endsWith('.exe')` check in buildPtySpawnArgsWin32 and add a negative test
asserting an error is thrown for non-.exe commands; also replace the redundant
comment with one that documents this block as a contract/regression test and
explicitly calls out which production behaviour (the .exe validation) is being
verified or cannot be verified if importing is impossible.

---

Duplicate comments:
In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 455-465: In the listWindows implementation, the code incorrectly
assigns the window id to pid; change the per-window lookup (wids.map handler) to
call "xdotool getwindowpid <wid>" via execFile to obtain the real process id,
parse it to Number and set pid to that value, and if getwindowpid fails or
returns non-numeric output leave pid as undefined; keep the existing
getwindowname call for name, reference the mapped variable wid and the
surrounding Promise.all logic to locate where to update the behavior.
- Around line 493-495: The PowerShell snippet that finds a process ($proc) and
calls [WinFocus]::SetForegroundWindow can still return a zero exit code when no
window is found because Write-Error is non-terminating; update the script
emitted by the code that builds this PowerShell string so that the failure
branch uses a non-zero exit (e.g., replace or follow Write-Error "No window
found matching: $Title" with either "Write-Error ...; exit 1" or "throw 'No
window found matching: $Title'") and optionally add an explicit "exit 0" after
[WinFocus]::SetForegroundWindow to guarantee success; ensure this change is made
where the PowerShell string containing $proc, [WinFocus]::SetForegroundWindow,
and Write-Error is constructed so execFile can detect failures.

---

Nitpick comments:
In `@apps/desktop/src/main/permission-api.ts`:
- Around line 73-110: Remove the redundant “what the code does” comments around
the request-type checks and validation branches: delete the comment that states
the control flow near the isFilePermission / isDesktopPermission declarations
and delete the comment that explains the difference between file and desktop
permissions above the isFilePermission conditional; if you need a comment,
replace them with a concise “why” note (e.g., why desktop-permission omits
filePath) or a short reference to the validation behavior for
permissionHandler.validateFilePermissionRequest, but do not keep comments that
merely restate the conditions or the HTTP handling logic.

In `@packages/agent-core/mcp-tools/desktop-control/src/index.ts`:
- Around line 160-165: The nested ternary that builds cmd from button (const cmd
= button === 'right' ? 'right click' : button === 'double' ? 'double click' :
'click') should be replaced with a small mapping object; create a map (e.g.,
const CLICK_MAP = { right: 'right click', double: 'double click' }) and compute
cmd = CLICK_MAP[button] || 'click' so the intent is clearer and avoids nested
ternaries (refer to the cmd variable and the button value in index.ts).
- Around line 375-383: The nested ternary inside buildSendKeysString (which
computes k using WIN_KEYS and mainKey) should be replaced with a clear if/else
to improve readability: check if mainKey is in WIN_KEYS and set k =
WIN_KEYS[mainKey]; else if mainKey.length === 1 set k = mainKey; otherwise set k
= `{${mainKey.toUpperCase()}}`. Update the function to use this explicit
conditional flow and keep the existing prefix computation with WIN_MODIFIERS and
parseKeyParts unchanged.

In `@packages/agent-core/tests/unit/opencode/adapter.test.ts`:
- Around line 179-180: Remove the redundant comment "// mirrors
buildPtySpawnArgs for win32: command → file, args → args" that only restates the
return statement and either delete it or replace it with a brief "why" comment
explaining the platform mapping rationale (e.g., why win32 expects {file, args}
vs other platforms) so readers understand the reason for mirroring
buildPtySpawnArgs; locate the return { file: command, args } in the test around
the buildPtySpawnArgs behavior and update the comment accordingly.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5db4ed and 2e130eb.

📒 Files selected for processing (4)

apps/desktop/src/main/permission-api.ts
packages/agent-core/mcp-tools/desktop-control/src/index.ts
packages/agent-core/src/internal/classes/OpenCodeAdapter.ts
packages/agent-core/tests/unit/opencode/adapter.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

packages/agent-core/src/internal/classes/OpenCodeAdapter.ts

packages/agent-core/mcp-tools/desktop-control/src/index.ts

packages/agent-core/tests/unit/opencode/adapter.test.ts

buildPtySpawnArgsWin32 was a bare pass-through that gave zero regression protection for the .exe suffix guard in the production buildPtySpawnArgs (which throws immediately on non-.exe commands). Changes: - Re-implement the .exe check inside the local stub used by the test block - Add a dedicated negative test: asserts Error is thrown for both a path without .exe and a Unix-style path — covering the exact production guard - Expand block comment to document why this is a contract/regression test and to call out that buildPtySpawnArgs is private (cannot be imported) - Remove the redundant 'mirrors...' inline comment (restated what, not why) Addresses CodeRabbitAI review comment on PR accomplish-ai#633.

…ailures When the Gemini API has no context cache (fresh session after a rate-limit reset or app restart), the model ignored the MCP-registered tool names and tried to call 'start_task' / 'complete_task' literally from the system prompt text. OpenCode registers these as 'start-task_start_task' and 'complete-task_complete_task', so the calls hit the 'invalid' tool handler and the task failed immediately before any desktop automation ran. Root cause confirmed via opencode log: MCP servers DID connect successfully (toolCount=7 for desktop-control, toolCount=1 for start-task/complete-task), but the first Gemini inference call — with cache:{read:0} — used the literal names from the prompt rather than the registered MCP names. Fix: replace every 'start_task' and 'complete_task' call-site in the system prompt with 'start-task_start_task' and 'complete-task_complete_task'. Update the config-generator tests to match.

RajdeepKushwaha5 · 2026-02-24T17:43:22Z

@RajdeepKushwaha5 can you attach a short video demonstrating these new capabilities?

Sure

final_fix_working.mp4

galElmalah · 2026-02-26T07:28:43Z

@RajdeepKushwaha5 I don't see the side effect i.e the notepad being written into

…top-control

…tests - Wrap server startup in main() with .catch() per MCP tool convention - Windows focusWindow: escape wildcard chars via WildcardPattern::Escape; add exit 1 to make Write-Error terminating - Linux listWindows: use xdotool getwindowpid to return real process IDs instead of X11 window IDs - Update screenshot tool description to reflect session-level consent gate - typeText (Windows): fix temp-file leak — move both writeFile calls inside try so finally always cleans up txt and ps1 - tests: update 5 system-prompt assertions to use full MCP tool names (complete-task_complete_task, start-task_start_task) to match last commit - tests: fix 2 pre-existing Windows path-separator failures by using path.join('dist', 'index.mjs') in assertions

- Add windowsHide:true to all Windows PowerShell execFile calls so the console window no longer steals keyboard focus from the target app - In typeText: capture foreground window handle before writing to clipboard, then restore it with SetForegroundWindow before SendKeys so paste lands in the correct window even if PS spawning shifts focus - In focusWindow: add 400ms sleep after SetForegroundWindow to let the OS fully activate the window before callers proceed to type - In captureScreenshot: scale bitmap to max 1280px wide (HighQuality- Bicubic) so base64 payload stays within model vision input limits

SetForegroundWindow silently fails from hidden/background processes due to Windows foreground locking rules. Replace it with WScript.Shell.AppActivate which activates windows by title from any process, then use WScript.Shell .SendKeys to paste -- guaranteeing the keystrokes reach the correct window.

…ext targeting focusWindow now uses WScript.Shell.AppActivate (works from background processes) and writes the exact window title to dc-focused-window.txt in TEMP. typeText reads that file to re-activate the correct window before pasting, eliminating the race where Electron regains focus between the two tool calls.

…ndows - focusWindow now persists process PID (not title) to dc-focused-window.txt, making it immune to title changes (e.g. Notepad prepends '*' on unsaved edits) - typeText adopts a three-layer strategy on Windows: 1. UI Automation ValuePattern: programmatic append with no focus requirement, works from any background process (Notepad, WordPad, Win32/WinUI 3 editors) 2. UIA SetFocus + clipboard paste: for editors that support UIA focus but not ValuePattern 3. AppActivate(pid) + clipboard paste: immune fallback using stored PID - Addresses code-review feedback: fixes the invisible side-effect issue where typeText failed silently after Notepad window title changed post-focusWindow

RajdeepKushwaha5 · 2026-03-01T10:54:37Z

@RajdeepKushwaha5 I don't see the side effect i.e the notepad being written into

Hi Sir @galElmalah ! Here's a video walkthrough of the recent updates — Windows spawn fixes, desktop automation features, and code review improvements. Let me know if you have any questions!

accomplish_change.mp4

mavishay · 2026-03-10T08:37:09Z

Jira ticket: https://accomplish-ai.atlassian.net/browse/ENG-688

…top-control

Copilot AI review requested due to automatic review settings February 22, 2026 18:51

Copilot started reviewing on behalf of RajdeepKushwaha5 February 22, 2026 18:52 View session

Copilot AI reviewed Feb 22, 2026

View reviewed changes

coderabbitai bot reviewed Feb 22, 2026

View reviewed changes

packages/agent-core/mcp-tools/desktop-control/src/index.ts Show resolved Hide resolved

packages/agent-core/mcp-tools/desktop-control/src/index.ts Outdated Show resolved Hide resolved

RajdeepKushwaha5 added 2 commits February 24, 2026 18:01

RajdeepKushwaha5 force-pushed the feat/issue-189-desktop-control branch from 10c8dc0 to b40dea0 Compare February 24, 2026 12:32

coderabbitai bot reviewed Feb 24, 2026

View reviewed changes

packages/agent-core/mcp-tools/desktop-control/src/index.ts Show resolved Hide resolved

packages/agent-core/mcp-tools/desktop-control/src/index.ts Show resolved Hide resolved

coderabbitai bot reviewed Feb 24, 2026

View reviewed changes

packages/agent-core/mcp-tools/desktop-control/src/index.ts Show resolved Hide resolved

packages/agent-core/tests/unit/opencode/adapter.test.ts Show resolved Hide resolved

galElmalah added the feb_hackathon label Feb 24, 2026

RajdeepKushwaha5 added 2 commits February 24, 2026 21:22

RajdeepKushwaha5 force-pushed the feat/issue-189-desktop-control branch from 35517a0 to 19f95e5 Compare February 24, 2026 17:33

Merge branch 'main' into feat/issue-189-desktop-control

2a4f196

RajdeepKushwaha5 added 6 commits March 1, 2026 12:53

Merge remote-tracking branch 'upstream/main' into feat/issue-189-desk…

6dfdb6c

…top-control

Merge remote-tracking branch 'upstream/main' into feat/issue-189-desk…

739cff9

…top-control

		const transport = new StdioServerTransport();
		await server.connect(transport);

	$proc = Get-Process \| Where-Object {$_.MainWindowTitle -like "$Title"} \| Select-Object -First 1
	$escapedTitle = [System.Management.Automation.WildcardPattern]::Escape($Title)
	$proc = Get-Process \| Where-Object {$_.MainWindowTitle -like "$escapedTitle"} \| Select-Object -First 1

	Capture the entire primary display. Returns a base64-encoded PNG image. No permission required.
	Capture the entire primary display. Returns a base64-encoded PNG image. Permission required; each capture must be explicitly approved by the user (via the same permission model used for other desktop tools).

Conversation

RajdeepKushwaha5 commented Feb 22, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Tools

Security design

Platform implementation

What's different from existing PRs

Integration

Testing

Summary by CodeRabbit

Uh oh!

orcaman commented Feb 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Snyk checks have passed. No issues have been found so far.

Uh oh!

coderabbitai bot commented Feb 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

galElmalah commented Feb 24, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

RajdeepKushwaha5 commented Feb 24, 2026

Uh oh!

galElmalah commented Feb 26, 2026

Uh oh!

RajdeepKushwaha5 commented Mar 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mavishay commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

RajdeepKushwaha5 commented Feb 22, 2026 •

edited by coderabbitai bot

Loading

orcaman commented Feb 22, 2026 •

edited

Loading

coderabbitai bot commented Feb 22, 2026 •

edited

Loading

RajdeepKushwaha5 commented Mar 1, 2026 •

edited

Loading