Getting started

This guide goes from a clean checkout to a complete agent-driven session against a real Electron app: launch, read the UI, interact, assert, capture, stop. The walkthrough drives the bundled examples/minimal-app, a ~30-line form app. Every call shown below belongs to the same flow its scripted scenario automates, and a gated repository test executes this exact sequence against real Electron so the tutorial stays runnable.

Prerequisites

Node.js 24 or newer (see engines in package.json; the server has no native dependencies).
pnpm via Corepack: corepack enable.

Install and build

git clone https://github.com/electron-stagewright/electron-stagewright.git
cd electron-stagewright
pnpm install
pnpm build

The MCP server entry is now at packages/core/dist/cli.js. It speaks the Model Context Protocol over stdio: an MCP host spawns it and exchanges JSON-RPC frames over the child's stdin/stdout.
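To make the framing concrete, here is a minimal TypeScript sketch of what a host writes to the server's stdin for a single tool call. It assumes the standard MCP stdio transport (one JSON-RPC 2.0 message per line) and the standard tools/call method. encodeToolCall is a hypothetical helper for illustration, not part of this repository.

```typescript
// Hypothetical helper (not part of electron-stagewright): builds one
// newline-delimited JSON-RPC 2.0 frame for an MCP tools/call request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function encodeToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): string {
  const request: ToolCallRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
  // JSON.stringify without indentation never emits a raw newline,
  // so a trailing "\n" unambiguously terminates the frame.
  return JSON.stringify(request) + "\n";
}

// The frame a host would write for the snapshot call used later in this guide.
const frame = encodeToolCall(1, "electron_snapshot");
```

A real host also performs the MCP initialize handshake before any tools/call. In practice an MCP client library handles both the handshake and the framing, so this sketch is only meant to show what travels over the pipe.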

Connect a client

Option A — an MCP host (Claude Desktop, Cursor, or any MCP-capable agent host). Register the server with the host's MCP configuration; the shape is the same everywhere:

{
  "mcpServers": {
    "electron-stagewright": {
      "command": "node",
      "args": ["/absolute/path/to/electron-stagewright/packages/core/dist/cli.js"]
    }
  }
}

Useful server flags (append to args): --screenshot-dir <dir> for a stable screenshot location, --allow-eval to register the JavaScript-evaluation tools (off by default; grant the narrowest target with --allow-eval=renderer or --allow-eval=main), --plugin <name> to load a plugin. The full list is in the tool reference.
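As a rough illustration of "append to args", the TypeScript sketch below generates the host configuration with two of the flags named above. buildHostConfig is a hypothetical helper, and passing the --screenshot-dir value as its own array element is an assumption inferred from the `--screenshot-dir <dir>` notation.

```typescript
// Hypothetical helper: builds the "mcpServers" entry with extra server flags.
// Assumption: each flag, and each separate flag value, is its own element of
// args, because the host spawns the process without a shell.
function buildHostConfig(cliPath: string, flags: string[] = []) {
  return {
    mcpServers: {
      "electron-stagewright": {
        command: "node",
        args: [cliPath, ...flags],
      },
    },
  };
}

const config = buildHostConfig(
  "/absolute/path/to/electron-stagewright/packages/core/dist/cli.js",
  ["--screenshot-dir", "/absolute/path/for/artifacts", "--allow-eval=renderer"],
);

// Paste the resulting JSON into the host's MCP configuration.
const configJson = JSON.stringify(config, null, 2);
```

Generating the entry is optional; editing the JSON by hand works just as well. The point of the sketch is that flags follow the cli.js path inside the same args array.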

That node …/dist/cli.js form points at this cloned checkout, which is what runs the bundled example. To wire the published package into your client and drive your own app instead — with the npx/global command forms and per-client config — see Connect your MCP client.

Option B — the scripted scenario, no host required. It connects a real MCP client over stdio and prints a transcript of every call:

pnpm --filter @electron-stagewright/example-minimal-app scenario

The rest of this guide walks the same steps one call at a time, as an agent would make them.

1. Launch the app

electron_launch { "main": "/absolute/path/to/examples/minimal-app/main.js" }

{ "ok": true, "session_id": "…", "transport": "playwright-electron", "windows": [...], "renderer_ready": true }

main must be an absolute path. The call waits (up to readyTimeoutMs, default 5000 ms) for the renderer to finish its initial render, so the very next read sees a populated app. renderer_ready: false means the wait expired; the session is still usable, so retry the read or wait for a known element. Keep the session_id: every later call accepts it, and may omit it while this is the only session.

2. Read the UI — snapshot

electron_snapshot {}

The snapshot is the agent's eyes: the renderer's accessibility tree as a flat list of entries — role, accessible name, state, bbox, and a stable numbered ref for every interactive element. For the minimal app that includes the Your name textbox, the Subscribe to updates checkbox, the Plan select, and the Greet button. Refs are tagged onto the DOM (data-sw-ref="N"), survive re-renders of the same element, and are what interaction tools accept — no CSS guessing. On later reads, pass since: "last" to get only what changed instead of the whole tree.

3. Find an element the agent-native way

CSS selectors work everywhere a ref does, but the declarative path needs no DOM knowledge:

electron_find { "role": "button", "name_contains": "Greet" }

{ "ok": true, "matches": [{ "ref": 4, "role": "button", "name": "Greet", "bbox": {...} }], "count": 1, "renderer_reloaded": false }
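One way to picture the declarative query is as a filter over snapshot entries. The TypeScript sketch below is a conceptual model only, not the server's implementation: the SnapshotEntry shape is reduced to three fields, case-sensitive substring matching is an assumption, and every sample entry except the Greet button's ref is illustrative.

```typescript
// Conceptual model of a role + name_contains query (assumed semantics).
interface SnapshotEntry {
  ref: number;
  role: string;
  name: string;
}

interface FindQuery {
  role?: string;
  name_contains?: string;
}

function findEntries(entries: SnapshotEntry[], query: FindQuery): SnapshotEntry[] {
  return entries.filter(
    (entry) =>
      // An omitted criterion matches everything.
      (query.role === undefined || entry.role === query.role) &&
      (query.name_contains === undefined ||
        entry.name.includes(query.name_contains)),
  );
}

// Illustrative data: roles and refs other than the Greet button's are made up.
const sampleEntries: SnapshotEntry[] = [
  { ref: 1, role: "textbox", name: "Your name" },
  { ref: 2, role: "checkbox", name: "Subscribe to updates" },
  { ref: 3, role: "combobox", name: "Plan" },
  { ref: 4, role: "button", name: "Greet" },
];

const matches = findEntries(sampleEntries, { role: "button", name_contains: "Greet" });
```

The query names what the element is and what it is called, which is why the agent needs no knowledge of ids, classes, or DOM structure.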

4. Interact

Fill the form (selector-addressed writes), then click the found button by ref:

electron_type { "selector": "#name", "text": "Ada Lovelace" }
electron_check { "selector": "#subscribe" }
electron_select_option { "selector": "#plan", "values": ["pro"] }
electron_click { "ref": 4 }

Every interaction returns { ok, ... } or a structured error — e.g. clicking a ref that a re-render invalidated returns REF_NOT_FOUND with similar_refs (candidates that look like the element you meant) so the agent recovers in one step instead of re-scanning blind.
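The one-step recovery could look roughly like the TypeScript sketch below. The response shape is an assumption: this guide names REF_NOT_FOUND and similar_refs but not the exact envelope (see the tool reference for that), and pickRetryRef is a hypothetical client-side helper.

```typescript
// Assumed shapes, for illustration only.
interface Candidate {
  ref: number;
  role: string;
  name: string;
}

interface ToolResponse {
  ok: boolean;
  code?: string;
  similar_refs?: Candidate[];
}

// Decides which ref to retry with after a failed click, or null when the
// failure is not a stale ref or no candidate was offered.
function pickRetryRef(response: ToolResponse, expectedName: string): number | null {
  if (response.ok || response.code !== "REF_NOT_FOUND") return null;
  const candidates = response.similar_refs ?? [];
  // Prefer a candidate whose accessible name matches the intended element,
  // otherwise fall back to the first suggestion.
  const best = candidates.find((c) => c.name === expectedName) ?? candidates[0];
  return best?.ref ?? null;
}

// A made-up failure response for a click on a ref that a re-render invalidated.
const staleClick: ToolResponse = {
  ok: false,
  code: "REF_NOT_FOUND",
  similar_refs: [
    { ref: 7, role: "button", name: "Cancel" },
    { ref: 9, role: "button", name: "Greet" },
  ],
};

const retryRef = pickRetryRef(staleClick, "Greet");
```

With a non-null result the agent can reissue electron_click with the new ref straight away. A null result is the signal to fall back to a fresh electron_snapshot or electron_find.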

5. Assert the outcome — one call, not a read-compare-retry loop

electron_expect_text { "selector": "#status", "contains": "Hello, Ada Lovelace" }

{ "ok": true, "matched": true, "actual": "Hello, Ada Lovelace! Plan: pro." }

expect_text polls server-side until the predicate holds or timeoutMs expires — the read, the comparison, and the retry loop collapse into a single MCP round-trip. On failure it returns EXPECTATION_FAILED carrying both expected and actual, so the agent sees what really happened without a follow-up read. The whole expect_* family works this way — Assert UI state covers it.
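For intuition, the read-compare-retry loop that expect_text absorbs can be sketched in TypeScript as follows. This is a generic polling loop under stated assumptions, not the server's code: readText stands in for reading the element's text, the default timeout and interval are invented for the sketch, and the failure object's shape is assumed from the field names mentioned above.

```typescript
// Assumed result shape, modeled on the fields this guide mentions.
interface ExpectResult {
  ok: boolean;
  matched: boolean;
  actual: string;
  code?: "EXPECTATION_FAILED";
  expected?: string;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function expectTextContains(
  readText: () => Promise<string> | string, // hypothetical element reader
  expected: string,
  timeoutMs = 5000, // sketch default, not the tool's documented default
  intervalMs = 50,
): Promise<ExpectResult> {
  const deadline = Date.now() + timeoutMs;
  let actual = await readText();
  // Re-read until the predicate holds or the time budget is spent.
  while (!actual.includes(expected) && Date.now() < deadline) {
    await sleep(intervalMs);
    actual = await readText();
  }
  if (actual.includes(expected)) {
    return { ok: true, matched: true, actual };
  }
  // Report both sides so the caller needs no follow-up read.
  return { ok: false, matched: false, actual, code: "EXPECTATION_FAILED", expected };
}
```

Running this loop inside the server means each retry is a local read rather than another MCP round-trip, which is where the savings come from.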

6. Capture evidence

electron_screenshot { "dir": "/absolute/path/for/artifacts" }
electron_console_logs { "match": "greeted" }

The screenshot is written on the server host and the call returns its path (pass dir, or start the server with --screenshot-dir, to keep artifacts out of the OS temp dir). Console output is captured continuously from launch; match / type / since filter it at read time. Capture diagnostics goes deeper, including session traces.

7. Stop

electron_stop {}

Always stop — even on failure paths — so no app process outlives the session. If the app ignores the graceful close, the stop escalates to SIGKILL after a bounded budget and reports escalated: true; the process is never orphaned.
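On the client side, "always stop" maps naturally onto try/finally. The TypeScript sketch below is one possible wrapper: CallTool stands in for however your MCP client invokes a tool, withSession is a hypothetical helper, and passing session_id as an argument of that name is an assumption based on the launch response.

```typescript
// Stand-in for your MCP client's tool invocation (an assumption).
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<unknown>;

// Launches the app, runs the body, and stops the session on every path,
// including when the body throws.
async function withSession<T>(
  callTool: CallTool,
  main: string,
  body: (sessionId: string) => Promise<T>,
): Promise<T> {
  const launched = (await callTool("electron_launch", { main })) as {
    session_id: string;
  };
  try {
    return await body(launched.session_id);
  } finally {
    // Runs after success and after failure, so no app process outlives the session.
    await callTool("electron_stop", { session_id: launched.session_id });
  }
}
```

A failing assertion inside the body still propagates to the caller, but only after electron_stop has been sent.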

Where next

Connect your MCP client — wire the published package into Claude Desktop, Cursor, or any MCP host, and confirm it connected.
Launch, attach, or inject — driving YOUR app, including apps that are already running.
Assert UI state — the assertion and wait toolbox.
Capture diagnostics — screenshots, console, dialogs, traces.
Security model — read before you enable --allow-eval or expose the server: the trust model and the controls behind it.
TOOL-REFERENCE.md — the full tool contracts.

Design background: numbered refs and the snapshot schema are ADR-005; the response envelope and error-code registry are ADR-006; the agent-native principles behind find and the expect family are ADR-007.