Security model and threat model

This page is the canonical threat model for the Electron Stagewright MCP server. It states what the server can touch, who it trusts, what stops misuse, and what risk remains. If you are deciding whether to point an agent at the server, read this first. The posture summarised here is recorded as a decision in ADR-014; to report a vulnerability see SECURITY.md.

The one-line model

The server is a privileged local tool, not a sandbox. It runs with your OS privileges, drives a real desktop app, and — when you enable the --allow-eval policy — runs arbitrary JavaScript inside that app. Treat it the way you would treat a shell: only let a trusted agent host invoke it. The default transport is stdio (a local child process), so the trust boundary stays local unless you deliberately put a network in front of it.

Assets

What an attacker would want, in rough order of value:

The host machine. The server can launch processes and read files within its launch surface, with the operator's privileges.
The target app's runtime. Under the --allow-eval policy, an agent can run arbitrary code in the main process, the renderer process, or both; without it, the granular tools still drive the app (click, type, navigate).
Captured data. Screenshots, console logs, and session traces can contain secrets the app displayed; IPC capture can record channel payloads; network capture can record request/response headers (and, when opted in, bodies).
Code-signing identity. The production_validate tool reads signed .app bundles and their updater feeds, and may return bounded evidence such as a signing authority in its local tool result.

Trust boundaries

Agent host → server. The agent supplies every tool input. Inputs are treated as untrusted and possibly hostile (a hallucinating or prompt-injected agent).
Server → target app. The server drives the app and, under eval, runs code in it. The app is assumed at least semi-trusted (it is the thing under test).
Server → host filesystem. Launch paths, screenshot output, and trace artifacts touch disk.

Threat actors

A misbehaving agent — hallucinated or prompt-injected tool calls. The primary actor the controls below target.
A malicious app under test — could try to abuse the driving channel. Out of primary scope (you chose to test it), but the server avoids handing it the protocol channel or unbounded waits.
A local reader of artifacts — anyone who can read the trace/screenshot output directory.

Controls (threats × mitigations)

Threat	Control	Residual
Arbitrary code execution via eval	`electron_eval_main` / `electron_eval_renderer` are unregistered unless `--allow-eval` permits their target (per-target least privilege — `--allow-eval=renderer` grants only the renderer); payloads pass a keyword blocklist and a structural AST check; calls are audited to stderr (length + a content hash, never the payload); results are size-capped	The eval checks are defence-in-depth, bypassable by a determined payload — see below
A plugin running main-process code behind the operator's back	Any plugin using the eval seam (`transport.evaluate('main')`) re-asserts the main eval opt-in (`--allow-eval=main`, or bare `--allow-eval`) at its own tool boundary; today that covers `ipc_capture_start`, `ipc_captured`, `ipc_capture_stop`, `ipc_invoke`, and `ipc_stub` (ADR-010)	—
Over-broad IPC capture / injection	`ipc_capture_start` requires an explicit channel allowlist; `ipc_stub` is allowlist-bound; `ipc_invoke` has an optional allowlist; `redact` drops named fields	Captured payloads are not redacted unless `redact` is configured
Secret headers or bodies via over-broad network capture	`network_capture_start` requires an explicit URL allowlist (no capture-everything); `authorization` / `cookie` / `set-cookie` are redacted by default (`redactHeaders` adds more); bodies are opt-in (`captureBodies`, off by default) and, when on, bounded by a byte cap + a text-ish content-type gate, and droppable to size-only or `redactBodies` (ADR-016)	A careless allowlist with `redactSecureDefaults: false` can still surface header values; an opted-in `captureBodies` surfaces body content (not value-redacted unless `redactBodies`); renderer page-target traffic only (Playwright launch-mode and CDP attach-mode), not the main process's `net` module
App input altered by network stubbing	`network_stub` MODIFIES what the app receives (fulfill/abort), so it is bounded the same way: an explicit URL allowlist (no stub-everything), the `canIntercept` capability, and a first-party, operator-loaded plugin; it runs no app JavaScript and is not `--allow-eval` gated (ADR-016)	A loaded plugin can alter allowlisted responses; the operator chose to load it. Renderer page-target traffic only (Playwright launch-mode and CDP attach-mode)
App behaviour altered by clock control	`clock_*` MODIFIES the time the app sees (install / freeze / advance the fake clock), so it is bounded by the `canControlClock` capability and a first-party, operator-loaded plugin; it runs no app JavaScript and is not `--allow-eval` gated, and is not a secret surface (ADR-017)	A loaded plugin can drive the app's clock; the operator chose to load it. Playwright launch transport only
Cookie secrets via the storage read paths	`storage_cookies` / `storage_snapshot` redact cookie values by default (replaced with `[redacted]`; names/domains/paths/flags are kept); only `revealValues: true` surfaces them. Bounded by the `canAccessStorage` capability and a first-party, operator-loaded plugin; runs no app JavaScript and is not `--allow-eval` gated (ADR-018)	With `revealValues: true` the agent sees cookie values verbatim (a session/auth token can be one); `localStorage` snapshot values are NOT redacted (app state — treat the snapshot as sensitive if your app stores tokens there); cookies + the visited origins' `localStorage` snapshot only (Playwright launch full; CDP attach cookies full, `localStorage` best-effort)
App state altered by storage writes	`storage_set_cookie` / `storage_clear_cookies` MODIFY app state (seed/clear a cookie), so they are bounded the same way: the `canAccessStorage` capability and a first-party, operator-loaded plugin; they run no app JavaScript and are not `--allow-eval` gated (ADR-018)	A loaded plugin can seed or clear cookies; the operator chose to load it. Per-key `localStorage` / `sessionStorage` and IndexedDB writes are the renderer-eval rows below
Per-key Web Storage via renderer eval	`storage_local_*` / `storage_session_*` (get/set/remove/keys/clear) read and mutate a single `localStorage` / `sessionStorage` key by running a fixed renderer body (the agent supplies op/scope/key/value as DATA, never code), so they are renderer-eval gated: unregistered unless `--allow-eval=renderer` (or bare `--allow-eval`) permits the renderer target (the dispatcher hides them otherwise) AND re-asserted at the tool boundary (`storage.EVAL_REQUIRED`); also bounded by the `supportsRendererEval` capability and a first-party, operator-loaded plugin (ADR-018)	A loaded plugin under a renderer-eval grant can read or mutate Web Storage; the operator chose both. Web Storage values are NOT redacted (app state; treat reads as sensitive if the app stores tokens there). Playwright launch + CDP attach (`supportsRendererEval`); the injector returns `storage.UNSUPPORTED`. IndexedDB is the row below
IndexedDB read/write via renderer eval	`storage_idb_*` (schema/get/keys/count/set/delete/clear) read and mutate records in existing databases / object stores via a fixed async renderer body (the agent supplies database/store/key/value as DATA, never code), renderer-eval gated exactly like the Web Storage row (registration gate + `storage.EVAL_REQUIRED` re-assert + `supportsRendererEval` + operator-loaded plugin); the body opens databases WITHOUT a version so it never creates or upgrades a schema, refusing a missing one (`storage.NOT_FOUND`) (ADR-018)	A loaded plugin under a renderer-eval grant can read or mutate IndexedDB records; the operator chose both. IndexedDB record values are returned verbatim by default (opt-in `redactValues` masks them; treat reads as sensitive if the app stores tokens there); structured-clone values that are not JSON (Blob/ArrayBuffer/circular) are returned as a typed placeholder. No schema creation/upgrade. Playwright launch + CDP attach; the injector returns `storage.UNSUPPORTED`
Native UI read via the menu seam	`native_menu` / `native_menu_item` READ the application menu via a fixed main-process serializer over `Menu.getApplicationMenu()` (data fields only — the items' `click` handlers and internal refs are never read), bounded by the `canAccessNativeUI` capability and a first-party, operator-loaded plugin; they run no agent JavaScript and are not `--allow-eval` gated (ADR-019)	Observation of app chrome, not a modify and not a secret surface (menu labels are no more sensitive than the DOM text a snapshot already exposes). Playwright launch transport only; tray read requires the launch-time instrumentation row below; tray event invocation is the modify row below
App behaviour altered by menu invocation	`native_menu_invoke` MODIFIES app behaviour: it fires the app's own menu `click` handler (the native-UI analog of `electron_click` firing a DOM handler), bounded by the `canAccessNativeUI` capability and a first-party, operator-loaded plugin; the agent supplies a path (data), not code, so it runs no agent JavaScript and is not `--allow-eval` gated; a disabled item is refused and a built-in role item is not invokable (ADR-019)	A loaded plugin can trigger app-defined menu actions; the operator chose to load it. Playwright launch transport only; role-based items cannot be invoked (press the accelerator)
App behaviour altered by tray event invocation	`native_tray_invoke` MODIFIES app behaviour: it fires the app's own `tray.on(event, …)` handler by `emit`ting a `click` / `right-click` / `double-click` (or platform `mouse-*` / `balloon-click`) event on the live `Tray` from the launch-time registry (the tray analog of the menu-invocation row), bounded by the `canAccessNativeUI` capability + the `instrumentNative` launch opt-in + a first-party, operator-loaded plugin; the agent supplies a tray id + an event name (data), not code, so it is not `--allow-eval` gated; a tray with no listener for the event is refused, not faked (ADR-019)	A loaded plugin can trigger app-defined tray actions; the operator chose to load it AND to launch with `instrumentNative`. Playwright launch transport only; firing `right-click` runs the handler but does not auto-open the native context menu
Native notification capture	`native_notifications_*` OBSERVE the notifications the app shows by patching `Notification.prototype.show` in the main process (recording only the data fields title/body/subtitle/silent/urgency, never handlers or refs), either at arm time or at launch t=0 when `instrumentNative` installed the fixed hook, bounded by the `canAccessNativeUI` capability and a first-party, operator-loaded plugin; the agent supplies only arm/read/stop (no executable input), so it is NOT `--allow-eval` gated (ADR-019)	An observe surface (user-facing notification text the app already displays); a loaded plugin can read shown notifications, the operator chose to load it. Playwright launch transport only. Notifications shown before capture is armed are missed UNLESS the session was launched with `instrumentNative` (the launch shim installs the same hook at t=0, so startup notifications are captured and tagged `beforeArm`); under a `titleContains` filter the buffer records all and filters at read, so a very noisy app could evict matching startup ones past the cap
App main entry wrapped by launch-time instrumentation	`electron_launch { main, instrumentNative: true }` (default OFF) wraps the app's main with fixed, transport-owned hooks that install the Tray registry and the startup-notification recorder before the app runs, then loads the real main; the hook bodies are fixed source strings (no agent code), the real-main path is the operator's own preflighted entry (JSON-escaped into a file Electron runs, never `eval`), and the shim is removed on stop (ADR-020)	A launch-mechanism opt-in the operator sets per session (not implied by loading the plugin), bounding the shim's blast radius to opt-in sessions; executablePath-only launches cannot be instrumented; the wrapped main sees `process.argv[1]` pointing at the shim. Playwright launch transport only
Path traversal / arbitrary process launch	`--app-root` confines `main` / `executablePath` / `cwd` and blocks `..` escape; runtime-altering env vars (`NODE_OPTIONS`, `LD_*`, `DYLD_*`, …) are refused	Without `--app-root`, launch paths are unconstrained (local-tool model)
Protocol-channel corruption	stdout is JSON-RPC only; all diagnostics go to stderr, enforced by a CI gate	—
Denial of service via a hung app	A per-operation timeout backstop (ADR-011) abandons a non-settling handler and returns a retryable error	The abandoned op dies with the session
Secret exfiltration via captured data and artifacts	Trace and IPC captures support `redact` for structured argument/payload fields; screenshots and trace artifacts are written only where the operator points them	Screenshots, console output, tool results, and unredacted payloads can contain secrets
Prototype-pollution via untrusted string lookups	Lookups keyed by tool input guard against inherited `Object.prototype` members	—
Catastrophic-backtracking regex (ReDoS) in `expect`/`assert` predicates	Predicate flags are validated as defence-in-depth	Not a complete decision procedure
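
The prototype-pollution row above describes a general technique that a short sketch can make concrete. The table and function names below are hypothetical and are not taken from the server's source; the sketch only shows why a lookup keyed by agent input should accept own properties only.

```typescript
// Hypothetical dispatch table; the server's real tables are not shown on this page.
type Handler = (target: string) => string;

const handlers: Record<string, Handler> = {
  click: (target) => `clicked ${target}`,
  type: (target) => `typed into ${target}`,
};

// Unguarded: a plain index also resolves inherited members such as
// "constructor" or "toString" from Object.prototype.
function unsafeLookup(name: string): unknown {
  return handlers[name];
}

// Guarded: accept the key only when it is an own property of the table.
function safeLookup(name: string): Handler | undefined {
  return Object.prototype.hasOwnProperty.call(handlers, name)
    ? handlers[name]
    : undefined;
}

// typeof unsafeLookup("constructor")  -> "function" (inherited, not a tool)
// safeLookup("constructor")           -> undefined
// safeLookup("click")                 -> the click handler
```

An alternative with the same effect is to build the table with `Object.create(null)` or a `Map`, so there is no prototype chain to inherit from.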

The eval checks, precisely

A substring blocklist scans eval source for: process.exit, require(, eval(, Function(, __proto__, child_process. It is intentionally minimal — it catches the obvious foot-guns that should stay blocked even when the eval tools are visible.
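
A minimal sketch of such a substring scan, assuming nothing beyond the six substrings listed above. The constant name, function name, and return shape are illustrative, not the server's actual code.

```typescript
// Substrings copied from the list above.
const EVAL_BLOCKLIST: readonly string[] = [
  "process.exit",
  "require(",
  "eval(",
  "Function(",
  "__proto__",
  "child_process",
];

// Returns the first blocked substring found in the source, or null if none match.
function findBlockedKeyword(source: string): string | null {
  for (const needle of EVAL_BLOCKLIST) {
    if (source.includes(needle)) return needle;
  }
  return null;
}

// findBlockedKeyword("process.exit(1)")            -> "process.exit"
// findBlockedKeyword("require('fs')")              -> "require("
// findBlockedKeyword("document.title")             -> null
// A plain text scan is easy to sidestep with spacing or aliasing:
// findBlockedKeyword("process . exit(1)")          -> null
// findBlockedKeyword("const f = Function; f('1')") -> null
```

The last two cases show why a text scan alone can only catch the obvious forms.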

Structural inspection. Beyond the substring scan, each payload is parsed and walked as an AST, so the same dangerous constructs are matched in the parse tree even when formatting or computed access hides them from a text scan: process . exit, process['exit'], eval ('…'), the constructor-Function escape ([].constructor.constructor('…')()), and dynamic import(). A hit returns EVAL_BLOCKED_CONSTRUCT, carrying the matched construct and the same code_hash that the audit log records. If the payload does not parse, the AST pass makes no decision and leaves the payload to the substring scan and the remote eval, so the outcome is never worse than with the blocklist alone.

What the checks do NOT catch. Both passes are static and conservative. A key built at runtime (globalThis['pro'+'cess']), an aliased reference (const f = Function; f('…')), or a payload assembled from strings still gets through. This is deliberate: an honest, narrow check beats a broad one that over-claims and false-positives on legitimate code. The --allow-eval opt-in plus the "privileged local tool" trust boundary stay the real controls — the checks raise the floor, they are not a wall.

The gate is per-target. --allow-eval accepts targets: bare --allow-eval enables both, while --allow-eval=main and --allow-eval=renderer each enable only the named target. Each eval tool registers only when its target is permitted, so a renderer-only automation never exposes the main-process surface (full Node/Electron). A plugin that reaches the main process through the eval seam (IPC capture) is gated on the main target too, so it is unavailable under a renderer-only policy.
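
The per-target gate can be sketched roughly as follows. The helper names are hypothetical, and two behaviours are assumptions rather than documented facts: that repeated flags accumulate, and that an unrecognised target is rejected. Only the three flag forms and the two tool names come from this page.

```typescript
type EvalTarget = "main" | "renderer";

// Hypothetical parser for the three flag forms described above.
function parseAllowEval(argv: readonly string[]): Set<EvalTarget> {
  const allowed = new Set<EvalTarget>();
  for (const arg of argv) {
    if (arg === "--allow-eval") {
      // Bare flag: both targets.
      allowed.add("main");
      allowed.add("renderer");
    } else if (arg.startsWith("--allow-eval=")) {
      const value = arg.slice("--allow-eval=".length);
      if (value === "main" || value === "renderer") {
        allowed.add(value);
      } else {
        // Assumption: unknown targets are an error, not silently ignored.
        throw new Error(`unknown --allow-eval target: ${value}`);
      }
    }
  }
  return allowed;
}

// Each eval tool is registered only when its own target is permitted.
function registeredEvalTools(allowed: ReadonlySet<EvalTarget>): string[] {
  const tools: string[] = [];
  if (allowed.has("main")) tools.push("electron_eval_main");
  if (allowed.has("renderer")) tools.push("electron_eval_renderer");
  return tools;
}

// registeredEvalTools(parseAllowEval([]))                        -> []
// registeredEvalTools(parseAllowEval(["--allow-eval=renderer"])) -> ["electron_eval_renderer"]
// registeredEvalTools(parseAllowEval(["--allow-eval"]))          -> both tools
```

Under this model a tool that is not permitted is never registered at all, which is stronger than registering it and refusing the call.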

Every eval is audited. A stderr breadcrumb records each call: tool, target, session, code length, and a code_hash (an FNV-1a of the payload, never the payload itself). An EVAL_BLOCKED_KEYWORD error carries the same code_hash, so a rejected payload can be correlated with the logs without ever being recorded.
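
For readers unfamiliar with FNV-1a, here is a minimal sketch of a content-hash audit record. The 32-bit width, the UTF-8 encoding, the hex formatting, and the record's field names other than code_hash are assumptions; this page states only that the hash is an FNV-1a of the payload.

```typescript
// 32-bit FNV-1a over the UTF-8 bytes of the text (width and encoding assumed).
function fnv1a32(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let hash = 0x811c9dc5; // 32-bit FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical breadcrumb: length and hash of the payload, never the payload.
function auditRecord(tool: string, target: string, session: string, code: string) {
  return {
    tool,
    target,
    session,
    code_length: code.length,
    code_hash: fnv1a32(code),
  };
}

// fnv1a32("")  -> "811c9dc5"
// fnv1a32("a") -> "e40c292c"
```

FNV-1a is a fast non-cryptographic hash, so it suits log correlation; it should not be read as proof that two payloads are identical.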

Residual risks and recommendations

No static check of eval is sound. Per-target authorization, the content-hash audit, and structural (AST) inspection have all shipped, but a payload built from runtime strings or dynamic access still defeats both the blocklist and the AST pass. Treat the eval checks as defence-in-depth, not a guarantee — the --allow-eval opt-in and the trust boundary are the controls that matter.
Do not expose the server to an untrusted agent host, and do not put a network transport in front of it. The supported model is a local stdio child process driven by a host you trust.
Configure redact for structured trace arguments and IPC payload fields that can carry credentials, tokens, or PII before capturing. It is not a screenshot, console-output, or arbitrary-result scrubber. See Capture diagnostics.
Keep network-capture redaction on. network_capture_start is bounded to a URL allowlist and redacts authorization / cookie / set-cookie by default; only turn redactSecureDefaults off when you genuinely need those headers, and add app-specific secret headers via redactHeaders. Bodies are captured only when you set captureBodies; leave it off (or use captureBodies: "size" for length-only, or redactBodies to drop content) unless you genuinely need payload content.
Keep cookie-value redaction on. storage_cookies / storage_snapshot redact cookie values by default; only set revealValues: true when you genuinely need a value (a cookie value can be a session/auth token). Cookie names, domains, paths, and flags are always shown. Note that localStorage snapshot values are not redacted (they are app state); treat the snapshot output as sensitive if your app stores tokens in localStorage. IndexedDB record values are also returned verbatim by default; set the storage plugin's redactValues: true when you need record presence/shape but not the values.
Set --app-root when launching untrusted or agent-chosen app paths, to confine the launch surface.
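
The network-capture recommendation above can be illustrated with a small sketch of default header redaction. The option names redactSecureDefaults and redactHeaders and the three default header names come from this page; the function itself, the case-insensitive matching, and the "[redacted]" placeholder for headers are assumptions made for illustration.

```typescript
// Headers this page says are redacted by default.
const SECURE_DEFAULT_HEADERS: readonly string[] = ["authorization", "cookie", "set-cookie"];

interface HeaderRedactOptions {
  redactSecureDefaults?: boolean; // treated as on unless explicitly false
  redactHeaders?: string[];       // extra app-specific secret headers
}

// Hypothetical helper: returns a copy with secret header values masked.
function redactHeaderValues(
  headers: Record<string, string>,
  options: HeaderRedactOptions = {},
): Record<string, string> {
  const secret = new Set<string>();
  if (options.redactSecureDefaults !== false) {
    for (const name of SECURE_DEFAULT_HEADERS) secret.add(name);
  }
  for (const name of options.redactHeaders ?? []) secret.add(name.toLowerCase());

  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    // HTTP header names are case-insensitive, so compare in lower case.
    out[name] = secret.has(name.toLowerCase()) ? "[redacted]" : value;
  }
  return out;
}

// redactHeaderValues({ Authorization: "Bearer abc", Accept: "text/html" })
//   -> { Authorization: "[redacted]", Accept: "text/html" }
```

Turning redactSecureDefaults off in this sketch leaves only the names in redactHeaders masked, which is why the recommendation is to keep the defaults on and add to them.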

Deploying safely — checklist

Run the server as a local stdio child of a trusted host. Do not expose it on a network.
Leave --allow-eval off unless a flow genuinely needs it; prefer the granular tools. When you do need it, grant the narrowest target: --allow-eval=renderer for page-state flows, and --allow-eval=main only when a flow truly needs Node-level access in the app's main process.
Set --app-root to the project you are testing.
Configure redact for sensitive channels/traces; write artifacts to a directory you control.
Treat the agent's tool inputs as untrusted — the server does, but your host should not relay inputs from an untrusted source.

The full per-tool contracts (including which tools require --allow-eval) are in the generated TOOL-REFERENCE.md.