Security model and threat model
This page is the canonical threat model for the Electron Stagewright MCP server. It states what the server can touch, who it trusts, what stops misuse, and what risk remains. If you are deciding whether to point an agent at the server, read this first. The posture summarised here is recorded as a decision in ADR-014; to report a vulnerability see SECURITY.md.
The one-line model
The server is a privileged local tool, not a sandbox. It runs with your OS
privileges, drives a real desktop app, and — when you enable the --allow-eval
policy — runs arbitrary JavaScript inside that app. Treat it the way you would treat
a shell: only let a trusted agent host invoke it. The default transport is stdio
(a local child process), so the trust boundary stays local unless you deliberately
put a network in front of it.
Assets
What an attacker would want, in rough order of value:
- The host machine. The server can launch processes and read files within its launch surface, with the operator's privileges.
- The target app's runtime. Under the --allow-eval policy, arbitrary main- and/or renderer-process code; without it, the granular tools still drive the app (click, type, navigate).
- Captured data. Screenshots, console logs, and session traces can contain secrets the app displayed; IPC capture can record channel payloads; network capture can record request/response headers (and, when opted in, bodies).
- Code-signing identity. The production_validate tool reads signed .app bundles and their updater feeds, and may return bounded evidence such as a signing authority in its local tool result.
Trust boundaries
- Agent host → server. The agent supplies every tool input. Inputs are treated as untrusted and possibly hostile (a hallucinating or prompt-injected agent).
- Server → target app. The server drives the app and, under eval, runs code in it. The app is assumed at least semi-trusted (it is the thing under test).
- Server → host filesystem. Launch paths, screenshot output, and trace artifacts touch disk.
Threat actors
- A misbehaving agent — hallucinated or prompt-injected tool calls. The primary actor the controls below target.
- A malicious app under test — could try to abuse the driving channel. Out of primary scope (you chose to test it), but the server avoids handing it the protocol channel or unbounded waits.
- A local reader of artifacts — anyone who can read the trace/screenshot output directory.
Controls (threats × mitigations)
| Threat | Control | Residual |
|---|---|---|
| Arbitrary code execution via eval | electron_eval_main / electron_eval_renderer are unregistered unless --allow-eval permits their target (per-target least privilege: --allow-eval=renderer grants only the renderer); payloads pass a keyword blocklist and a structural AST check; calls are audited to stderr (length + a content hash, never the payload); results are size-capped | The eval checks are defence-in-depth, bypassable by a determined payload; see below |
| A plugin running main-process code behind the operator's back | Any plugin using the eval seam (transport.evaluate('main')) re-asserts the main eval opt-in (--allow-eval=main, or bare --allow-eval) at its own tool boundary; today that covers ipc_capture_start, ipc_captured, ipc_capture_stop, ipc_invoke, and ipc_stub (ADR-010) | — |
| Over-broad IPC capture / injection | ipc_capture_start requires an explicit channel allowlist; ipc_stub is allowlist-bound; ipc_invoke has an optional allowlist; redact drops named fields | Capture defaults are not redacted unless configured |
| Secret headers or bodies via over-broad network capture | network_capture_start requires an explicit URL allowlist (no capture-everything); authorization / cookie / set-cookie are redacted by default (redactHeaders adds more); bodies are opt-in (captureBodies, off by default) and, when on, bounded by a byte cap + a text-ish content-type gate, and droppable to size-only or redactBodies (ADR-016) | A careless allowlist with redactSecureDefaults: false can still surface header values; an opted-in captureBodies surfaces body content (not value-redacted unless redactBodies); renderer page-target traffic only (Playwright launch-mode and CDP attach-mode), not the main process's net module |
| App input altered by network stubbing | network_stub MODIFIES what the app receives (fulfill/abort), so it is bounded the same way: an explicit URL allowlist (no stub-everything), the canIntercept capability, and a first-party, operator-loaded plugin; it runs no app JavaScript and is not --allow-eval gated (ADR-016) | A loaded plugin can alter allowlisted responses; the operator chose to load it. Renderer page-target traffic only (Playwright launch-mode and CDP attach-mode) |
| App behaviour altered by clock control | clock_* MODIFIES the time the app sees (install / freeze / advance the fake clock), so it is bounded by the canControlClock capability and a first-party, operator-loaded plugin; it runs no app JavaScript and is not --allow-eval gated, and is not a secret surface (ADR-017) | A loaded plugin can drive the app's clock; the operator chose to load it. Playwright launch transport only |
| Cookie secrets via the storage read paths | storage_cookies / storage_snapshot redact cookie values by default (replaced with [redacted]; names/domains/paths/flags are kept); only revealValues: true surfaces them. Bounded by the canAccessStorage capability and a first-party, operator-loaded plugin; runs no app JavaScript and is not --allow-eval gated (ADR-018) | With revealValues: true the agent sees cookie values verbatim (a session/auth token can be one); localStorage snapshot values are NOT redacted (app state; treat the snapshot as sensitive if your app stores tokens there); cookies + the visited origins' localStorage snapshot only (Playwright launch full; CDP attach cookies full, localStorage best-effort) |
| App state altered by storage writes | storage_set_cookie / storage_clear_cookies MODIFY app state (seed/clear a cookie), so they are bounded the same way: the canAccessStorage capability and a first-party, operator-loaded plugin; they run no app JavaScript and are not --allow-eval gated (ADR-018) | A loaded plugin can seed or clear cookies; the operator chose to load it. Per-key localStorage / sessionStorage and IndexedDB writes are the renderer-eval rows below |
| Per-key Web Storage via renderer eval | storage_local_* / storage_session_* (get/set/remove/keys/clear) read and mutate a single localStorage / sessionStorage key by running a fixed renderer body (the agent supplies op/scope/key/value as DATA, never code), so they are renderer-eval gated: unregistered unless --allow-eval=renderer (or bare --allow-eval) permits the renderer target (the dispatcher hides them otherwise) AND re-asserted at the tool boundary (storage.EVAL_REQUIRED); also bounded by the supportsRendererEval capability and a first-party, operator-loaded plugin (ADR-018) | A loaded plugin under a renderer-eval grant can read or mutate Web Storage; the operator chose both. Web Storage values are NOT redacted (app state; treat reads as sensitive if the app stores tokens there). Playwright launch + CDP attach (supportsRendererEval); the injector returns storage.UNSUPPORTED. IndexedDB is the row below |
| IndexedDB read/write via renderer eval | storage_idb_* (schema/get/keys/count/set/delete/clear) read and mutate records in existing databases / object stores via a fixed async renderer body (the agent supplies database/store/key/value as DATA, never code), renderer-eval gated exactly like the Web Storage row (registration gate + storage.EVAL_REQUIRED re-assert + supportsRendererEval + operator-loaded plugin); the body opens databases WITHOUT a version so it never creates or upgrades a schema, refusing a missing one (storage.NOT_FOUND) (ADR-018) | A loaded plugin under a renderer-eval grant can read or mutate IndexedDB records; the operator chose both. IndexedDB record values are returned verbatim by default (opt-in redactValues masks them; treat reads as sensitive if the app stores tokens there); structured-clone values that are not JSON (Blob/ArrayBuffer/circular) are returned as a typed placeholder. No schema creation/upgrade. Playwright launch + CDP attach; the injector returns storage.UNSUPPORTED |
| Native UI read via the menu seam | native_menu / native_menu_item READ the application menu via a fixed main-process serializer over Menu.getApplicationMenu() (data fields only; the items' click handlers and internal refs are never read), bounded by the canAccessNativeUI capability and a first-party, operator-loaded plugin; they run no agent JavaScript and are not --allow-eval gated (ADR-019) | Observation of app chrome, not a modify and not a secret surface (menu labels are no more sensitive than the DOM text a snapshot already exposes). Playwright launch transport only; tray read requires the launch-time instrumentation row below; tray event invocation is the modify row below |
| App behaviour altered by menu invocation | native_menu_invoke MODIFIES app behaviour: it fires the app's own menu click handler (the native-UI analog of electron_click firing a DOM handler), bounded by the canAccessNativeUI capability and a first-party, operator-loaded plugin; the agent supplies a path (data), not code, so it runs no agent JavaScript and is not --allow-eval gated; a disabled item is refused and a built-in role item is not invokable (ADR-019) | A loaded plugin can trigger app-defined menu actions; the operator chose to load it. Playwright launch transport only; role-based items cannot be invoked (press the accelerator) |
| App behaviour altered by tray event invocation | native_tray_invoke MODIFIES app behaviour: it fires the app's own tray.on(event, …) handler by emitting a click / right-click / double-click (or platform mouse-* / balloon-click) event on the live Tray from the launch-time registry (the tray analog of the menu-invocation row), bounded by the canAccessNativeUI capability + the instrumentNative launch opt-in + a first-party, operator-loaded plugin; the agent supplies a tray id + an event name (data), not code, so it is not --allow-eval gated; a tray with no listener for the event is refused, not faked (ADR-019) | A loaded plugin can trigger app-defined tray actions; the operator chose to load it AND to launch with instrumentNative. Playwright launch transport only; firing right-click runs the handler but does not auto-open the native context menu |
| Native notification capture | native_notifications_* OBSERVE the notifications the app shows by patching Notification.prototype.show in the main process (recording only the data fields title/body/subtitle/silent/urgency, never handlers or refs), either at arm time or at launch t=0 when instrumentNative installed the fixed hook, bounded by the canAccessNativeUI capability and a first-party, operator-loaded plugin; the agent supplies only arm/read/stop (no executable input), so it is NOT --allow-eval gated (ADR-019) | An observe surface (user-facing notification text the app already displays); a loaded plugin can read shown notifications, the operator chose to load it. Playwright launch transport only. Notifications shown before capture is armed are missed UNLESS the session was launched with instrumentNative (the launch shim installs the same hook at t=0, so startup notifications are captured and tagged beforeArm); under a titleContains filter the buffer records all and filters at read, so a very noisy app could evict matching startup ones past the cap |
| App main entry wrapped by launch-time instrumentation | electron_launch { main, instrumentNative: true } (default OFF) wraps the app's main with fixed, transport-owned hooks that install the Tray registry and the startup-notification recorder before the app runs, then loads the real main; the hook bodies are fixed source strings (no agent code), the real-main path is the operator's own preflighted entry (JSON-escaped into a file Electron runs, never eval), and the shim is removed on stop (ADR-020) | A launch-mechanism opt-in the operator sets per session (not implied by loading the plugin), bounding the shim's blast radius to opt-in sessions; executablePath-only launches cannot be instrumented; the wrapped main sees process.argv[1] pointing at the shim. Playwright launch transport only |
| Path traversal / arbitrary process launch | --app-root confines main / executablePath / cwd and blocks .. escape; runtime-altering env vars (NODE_OPTIONS, LD_*, DYLD_*, …) are refused | Without --app-root, launch paths are unconstrained (local-tool model) |
| Protocol-channel corruption | stdout is JSON-RPC only; all diagnostics go to stderr, enforced by a CI gate | — |
| Denial of service via a hung app | A per-operation timeout backstop (ADR-011) abandons a non-settling handler and returns a retryable error | The abandoned op dies with the session |
| Secret exfiltration via captured data and artifacts | Trace and IPC captures support redact for structured argument/payload fields; screenshots and trace artifacts are written only where the operator points them | Screenshots, console output, tool results, and unredacted payloads can contain secrets |
| Prototype-pollution via untrusted string lookups | Lookups keyed by tool input guard against inherited Object.prototype members | — |
| Catastrophic-backtracking regex (ReDoS) in expect/assert predicates | Predicate flags are validated as defence-in-depth | Not a complete decision procedure |
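The prototype-pollution row can be illustrated with a minimal TypeScript sketch: a lookup keyed by untrusted input checks for an own property before reading, so inherited names such as constructor or toString never resolve. The handlers table and the resolveHandler name are hypothetical and are not taken from the server's code.

```typescript
// Hypothetical dispatch table; the server's real tool registry is not shown on this page.
type Handler = () => string;

const handlers: Record<string, Handler> = {
  click: () => "clicked",
  type: () => "typed",
};

// Resolve a handler for an untrusted name without reaching Object.prototype.
function resolveHandler(name: string): Handler | undefined {
  // The own-property check rejects inherited members ("constructor", "toString", "__proto__").
  if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
    return undefined;
  }
  return handlers[name];
}
```

Without the own-property check, handlers["constructor"] would return the inherited Object constructor, which is exactly the kind of value a hostile tool input should never reach.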
The eval checks, precisely
A substring blocklist scans eval source for: process.exit, require(, eval(,
Function(, __proto__, child_process. It is intentionally minimal — it catches the
obvious foot-guns that should stay blocked even when the eval tools are visible.
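A substring scan of this kind could look like the sketch below. Only the six blocked strings come from this page; the function name findBlockedKeyword and its return shape are assumptions made for illustration.

```typescript
// The six substrings listed above.
const BLOCKLIST: string[] = [
  "process.exit",
  "require(",
  "eval(",
  "Function(",
  "__proto__",
  "child_process",
];

// Return the first blocked substring found in the payload, or null when none match.
// Hypothetical helper: the real check may report hits differently.
function findBlockedKeyword(source: string): string | null {
  for (const keyword of BLOCKLIST) {
    if (source.indexOf(keyword) !== -1) {
      return keyword;
    }
  }
  return null;
}
```

Because this is a plain text match, findBlockedKeyword("process.exit(0)") reports a hit while "process . exit(0)" passes, and in this sketch a harmless call such as retrieval(x) is flagged because it contains eval(.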
Structural inspection. Beyond the substring scan, each payload is parsed and walked
as an AST, so the same dangerous constructs are matched in the parse tree even when
formatting or computed access hides them from a text scan: process . exit,
process['exit'], eval ('…'), the constructor-Function escape
([].constructor.constructor('…')()), and dynamic import(). A hit is
EVAL_BLOCKED_CONSTRUCT, carrying the construct and the same code_hash. If the payload
does not parse, the AST pass defers to the substring scan and the remote eval — never
worse than the blocklist alone.
What the checks do NOT catch. Both passes are static and conservative. A key built at
runtime (globalThis['pro'+'cess']), an aliased reference (const f = Function; f('…')),
or a payload assembled from strings still gets through. This is deliberate: an honest,
narrow check beats a broad one that over-claims and false-positives on legitimate code.
The --allow-eval opt-in plus the "privileged local tool" trust boundary stay the real
controls — the checks raise the floor, they are not a wall.
The gate is per-target. --allow-eval accepts targets: bare --allow-eval
enables both, while --allow-eval=main or --allow-eval=renderer enable only one.
Each eval tool registers only when its target is permitted, so a renderer-only
automation never exposes the main-process surface (full Node/Electron). A plugin that
reaches the main process through the eval seam (IPC capture) is gated on the main
target too, so it is unavailable under a renderer-only policy.
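The per-target gate can be sketched as two small functions: one that turns the flag into a set of permitted targets, and one that decides which eval tools register. The flag forms and tool names come from this page; parseAllowEval and registeredEvalTools are hypothetical names, and whether the real CLI accepts other spellings is not stated here.

```typescript
type EvalTarget = "main" | "renderer";

// Parse process arguments into the set of permitted eval targets.
// Assumption: only the three forms named on this page are recognised.
function parseAllowEval(argv: string[]): Set<EvalTarget> {
  const allowed = new Set<EvalTarget>();
  for (const arg of argv) {
    if (arg === "--allow-eval") {
      allowed.add("main");
      allowed.add("renderer");
    } else if (arg === "--allow-eval=main") {
      allowed.add("main");
    } else if (arg === "--allow-eval=renderer") {
      allowed.add("renderer");
    }
  }
  return allowed;
}

// Each eval tool is listed only when its own target is permitted.
function registeredEvalTools(allowed: Set<EvalTarget>): string[] {
  const tools: string[] = [];
  if (allowed.has("main")) tools.push("electron_eval_main");
  if (allowed.has("renderer")) tools.push("electron_eval_renderer");
  return tools;
}
```

With no flag the list is empty, and with --allow-eval=renderer only electron_eval_renderer appears, so the main-process surface is never advertised to the agent.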
Every eval is audited. A stderr breadcrumb records each call — tool, target,
session, code length, and a code_hash (an FNV-1a of the payload, never the payload
itself). A blocked EVAL_BLOCKED_KEYWORD error carries the same code_hash, so a
rejected payload can be correlated with the logs without ever being recorded.
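The audit breadcrumb can be sketched as follows. This page names FNV-1a but not its width or input encoding, so the 32-bit variant over UTF-8 bytes is an assumption, as are the auditRecord helper and the code_length field name.

```typescript
// 32-bit FNV-1a over the UTF-8 bytes of the payload (width and encoding are assumptions).
function fnv1a32(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 unsigned bits
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Build the audit record: metadata only, never the payload itself.
// Hypothetical shape; only tool, target, session, code length, and code_hash are named on this page.
function auditRecord(tool: string, target: string, session: string, code: string) {
  return {
    tool,
    target,
    session,
    code_length: code.length,
    code_hash: fnv1a32(code),
  };
}
```

Because the same hash can be attached to a rejection error, an operator can match a blocked call to its stderr line without the payload ever being written to a log.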
Residual risks and recommendations
- No static check of eval is sound. Per-target authorization, the content-hash audit, and structural (AST) inspection have all shipped, but a payload built from runtime strings or dynamic access still defeats both the blocklist and the AST pass. Treat the eval checks as defence-in-depth, not a guarantee: the --allow-eval opt-in and the trust boundary are the controls that matter.
- Do not expose the server to an untrusted agent host, and do not put a network transport in front of it. The supported model is a local stdio child process driven by a host you trust.
- Configure redact for structured trace arguments and IPC payload fields that can carry credentials, tokens, or PII before capturing. It is not a screenshot, console-output, or arbitrary-result scrubber. See Capture diagnostics.
- Keep network-capture redaction on. network_capture_start is bounded to a URL allowlist and redacts authorization/cookie/set-cookie by default; only turn redactSecureDefaults off when you genuinely need those headers, and add app-specific secret headers via redactHeaders. Bodies are captured only when you set captureBodies; leave it off (or use captureBodies: "size" for length-only, or redactBodies to drop content) unless you genuinely need payload content.
- Keep cookie-value redaction on. storage_cookies/storage_snapshot redact cookie values by default; only set revealValues: true when you genuinely need a value (a cookie value can be a session/auth token). Cookie names, domains, paths, and flags are always shown. Note that localStorage snapshot values are not redacted (they are app state); treat the snapshot output as sensitive if your app stores tokens in localStorage. IndexedDB record values are also returned verbatim by default; set the storage plugin's redactValues: true when you need record presence/shape but not the values.
- Set --app-root when launching untrusted or agent-chosen app paths, to confine the launch surface.
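To make the header-redaction recommendation concrete, here is a minimal sketch of how default and extra redaction could combine. The option names redactSecureDefaults and redactHeaders and the three default headers come from this page; the RedactOptions shape, the redactHeaderValues function, case-insensitive matching, and the "[redacted]" marker for headers are assumptions.

```typescript
// Headers this page says are redacted by default.
const SECURE_DEFAULTS: string[] = ["authorization", "cookie", "set-cookie"];

// Hypothetical option shape; only the two option names come from this page.
interface RedactOptions {
  redactSecureDefaults?: boolean; // treated as true unless explicitly false
  redactHeaders?: string[]; // extra app-specific header names
}

function redactHeaderValues(
  headers: Record<string, string>,
  options: RedactOptions = {},
): Record<string, string> {
  const names = new Set<string>();
  if (options.redactSecureDefaults !== false) {
    for (const name of SECURE_DEFAULTS) names.add(name);
  }
  for (const name of options.redactHeaders || []) names.add(name.toLowerCase());

  const out: Record<string, string> = {};
  for (const key of Object.keys(headers)) {
    // Header names are case-insensitive, so compare in lower case.
    out[key] = names.has(key.toLowerCase()) ? "[redacted]" : headers[key];
  }
  return out;
}
```

Calling it with redactHeaders: ["x-api-key"] masks that header in addition to the defaults, while redactSecureDefaults: false lets an Authorization value through, which is why the flag should stay on unless those headers are genuinely needed.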
Deploying safely — checklist
- Run the server as a local stdio child of a trusted host. Do not expose it on a network.
- Leave --allow-eval off unless a flow genuinely needs it; prefer the granular tools. When you do need it, grant the narrowest target: --allow-eval=renderer for page-state flows, and --allow-eval=main only when a flow truly needs Node-level access in the app's main process.
- Set --app-root to the project you are testing.
- Configure redact for sensitive channels/traces; write artifacts to a directory you control.
- Treat the agent's tool inputs as untrusted. The server does, but your host should not relay inputs from an untrusted source.
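As an illustration of what confining launch paths to --app-root means, the sketch below does a purely lexical containment check on POSIX-style paths. It is not the server's implementation: isInsideRoot and normalizeSegments are hypothetical, the root is assumed to be absolute, and symlinks and Windows paths are not handled.

```typescript
// Collapse "." and ".." segments of a POSIX-style path into a list of segments.
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return out;
}

// True when candidate, resolved against an absolute root, stays inside that root.
function isInsideRoot(root: string, candidate: string): boolean {
  const rootSegments = normalizeSegments(root);
  // Relative candidates are resolved against the root; absolute ones are taken as-is.
  const full = candidate.charAt(0) === "/" ? candidate : root + "/" + candidate;
  const segments = normalizeSegments(full);
  if (segments.length < rootSegments.length) return false;
  // Compare whole segments so "/projx" is not mistaken for a child of "/proj".
  return rootSegments.every((segment, i) => segments[i] === segment);
}
```

With a root of /proj, src/main.js is accepted while ../etc/passwd and /proj/../etc are refused, which is the kind of .. escape the flag is described as blocking.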
The full per-tool contracts (including which tools require --allow-eval) are in the
generated TOOL-REFERENCE.md.