Electron Stagewright docs

ADR-003: Transport Abstraction

Context

Electron Stagewright is an MCP server that drives Electron desktop applications. Before any tool can be written (click, type, snapshot, eval, etc.), the project needs a settled answer to "how does the server actually talk to the running Electron process?"

Three viable mechanisms exist, each with distinct trade-offs:

  1. Playwright _electron: Microsoft's experimental wrapper around the Chromium driver. Convenient and well-typed, but marked experimental and at risk of deprecation. See Playwright MCP PR #1291 for the upstream-deprecation signal that informs this ADR.
  2. Raw Chrome DevTools Protocol (CDP): the stable public protocol that Chrome DevTools itself uses. Lower-level and requires hand-rolling WebSocket + JSON-RPC plumbing, but does not depend on any upstream wrapper that could go away.
  3. Injecting the Node Inspector into a running process: the best ergonomics for developers (no pre-flag required; attaches to an app that is already running), but experimental, platform-dependent, and reliant on process._debugProcess, whose behaviour on Windows is less reliable than on POSIX.

If the server's tools hardcode any single mechanism, they inherit its limitations forever. If they leave the choice ad-hoc per tool, the plumbing duplicates N times and inconsistencies between tools surface as confusing failures. This ADR locks the contract that every tool dispatches through.

Decision

1. Single contract: ITransport

export interface ITransport {
  readonly id: TransportId // 'playwright-electron' | 'cdp' | 'injector'
  readonly capabilities: TransportCapabilities

  launch(opts: LaunchOptions): Promise<TransportSession>
  attach(opts: AttachOptions): Promise<TransportSession>
  inject(opts: InjectOptions): Promise<TransportSession>
  stop(session: TransportSession, opts?: StopOptions): Promise<void>
  forceKill(session: TransportSession): Promise<void>
}

export interface TransportSession {
  readonly id: SessionId
  readonly transport: TransportId

  evaluate<T>(target: 'main' | 'renderer', body: string, arg?: unknown): Promise<T>
  screenshot(target: WindowRef, opts?: ScreenshotOptions): Promise<Buffer>
  windowsList(): Promise<readonly WindowDescriptor[]>
  readonly ipc: IpcChannel
  readonly console: ConsoleStream

  /** Idempotent — calling twice does not throw, does not double-free. */
  dispose(): Promise<void>
}

Every tool dispatches through ITransport. The transport implementation can change without touching tools, plugins, or examples; the contract is the seam.

2. Capability matrix

Every transport declares its capabilities up front via a TransportCapabilities record. The dispatcher inspects the matrix BEFORE invoking a method and refuses unsupported operations with TRANSPORT_UNSUPPORTED (registered code from the central error registry) instead of crashing partway through the SDK with a vague Playwright/CDP error.

export interface TransportCapabilities {
  readonly canLaunch: boolean
  readonly canAttach: boolean
  readonly canInject: boolean
  readonly canIntercept: boolean // network / IPC mid-flight modification
  readonly canControlClock: boolean
  readonly supportsMainEval: boolean
  readonly supportsRendererEval: boolean
}

A helper assertCapability(transport, capability) is exported alongside the interface so tool handlers can refuse-when-unsupported with a single line.
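A minimal sketch of how such a helper could look, assuming StagewrightError takes a code and a message (the ADR names the class and the code but does not show their definitions). CapabilityHolder and handleAttach are hypothetical names used only for illustration.

```typescript
// Hypothetical minimal shapes. The ADR names StagewrightError and
// TRANSPORT_UNSUPPORTED but does not show their definitions.
type ErrorCode = 'TRANSPORT_UNSUPPORTED' | 'NOT_IMPLEMENTED'

class StagewrightError extends Error {
  readonly code: ErrorCode
  constructor(code: ErrorCode, message: string) {
    super(message)
    this.name = 'StagewrightError'
    this.code = code
  }
}

interface TransportCapabilities {
  readonly canLaunch: boolean
  readonly canAttach: boolean
  readonly canInject: boolean
  readonly canIntercept: boolean
  readonly canControlClock: boolean
  readonly supportsMainEval: boolean
  readonly supportsRendererEval: boolean
}

// Only the two members the helper reads; the real ITransport has more.
interface CapabilityHolder {
  readonly id: string
  readonly capabilities: TransportCapabilities
}

// Throws TRANSPORT_UNSUPPORTED when the declared matrix says "no".
function assertCapability(
  transport: CapabilityHolder,
  capability: keyof TransportCapabilities,
): void {
  if (!transport.capabilities[capability]) {
    throw new StagewrightError(
      'TRANSPORT_UNSUPPORTED',
      `Transport '${transport.id}' does not declare ${capability}`,
    )
  }
}

// Hypothetical tool handler: refuse before touching the transport at all.
function handleAttach(transport: CapabilityHolder): string {
  assertCapability(transport, 'canAttach')
  return 'attach may proceed'
}
```

Because the check is a plain property read, it costs nothing to run on every dispatch, and the error a tool sees is always the registered code rather than whatever the underlying driver would have thrown.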

3. Three implementations

Transport canLaunch canAttach canInject canIntercept canControlClock supportsMainEval supportsRendererEval
PlaywrightElectronTransport
CDPTransport
InjectorTransport

PlaywrightElectronTransport ships as the only fully-implemented transport in this slice. It uses Playwright's experimental _electron.launch() API, loaded via dynamic import so the playwright peer dependency stays optional: consumers can install @electron-stagewright/core without playwright and still import the package. Only a launch() call made without playwright installed fails, and it does so with a TRANSPORT_UNSUPPORTED error carrying a clear remediation hint.

Deviation from the original scope: the design draft assumed PlaywrightElectronTransport would also support canAttach: true. After investigation, Playwright's _electron does NOT expose a public attach API — it only exposes launch. Attach behaviour is delegated to CDPTransport (which connects to an already-running app via its CDP endpoint). The Playwright transport now declares canAttach: false and its attach() method rejects with TRANSPORT_UNSUPPORTED. The capability matrix above reflects this corrected reality.

CDPTransport and InjectorTransport ship as stubs in this slice. Their constructors succeed, their capability matrices are declared honestly, and every method rejects with a registered error code (TRANSPORT_UNSUPPORTED when the capability matrix already refuses; NOT_IMPLEMENTED when the capability is claimed but the body is deferred). The point of shipping the stubs now is to force every downstream slice to honour the contract — tools cannot reach into transport-specific behaviour because there is no transport-specific behaviour to reach into yet.
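The two-code rule for stubs can be sketched as below. This is an illustration under assumptions, not the shipped stub: stubRejection and CdpTransportStub are hypothetical names, the matrix is reduced to the two flags this ADR states for CDP (canLaunch: false, canAttach: true), and the StagewrightError constructor shape is assumed.

```typescript
type ErrorCode = 'TRANSPORT_UNSUPPORTED' | 'NOT_IMPLEMENTED'

// Assumed constructor shape; the ADR does not show the class definition.
class StagewrightError extends Error {
  readonly code: ErrorCode
  constructor(code: ErrorCode, message: string) {
    super(message)
    this.name = 'StagewrightError'
    this.code = code
  }
}

// Reduced to two flags for brevity; the real matrix has more.
interface StubCapabilities {
  readonly canLaunch: boolean
  readonly canAttach: boolean
}

// Picks the error a stub method should reject with:
// - capability declared false -> TRANSPORT_UNSUPPORTED (the matrix already refuses)
// - capability declared true  -> NOT_IMPLEMENTED (claimed, body deferred)
function stubRejection(
  transportId: string,
  capabilities: StubCapabilities,
  capability: keyof StubCapabilities,
  method: string,
): StagewrightError {
  return capabilities[capability]
    ? new StagewrightError('NOT_IMPLEMENTED', `${transportId}.${method}() is declared but its body is deferred`)
    : new StagewrightError('TRANSPORT_UNSUPPORTED', `${transportId} does not support ${method}()`)
}

class CdpTransportStub {
  readonly id = 'cdp'
  readonly capabilities: StubCapabilities = { canLaunch: false, canAttach: true }

  launch(): Promise<never> {
    return Promise.reject(stubRejection(this.id, this.capabilities, 'canLaunch', 'launch'))
  }

  attach(): Promise<never> {
    return Promise.reject(stubRejection(this.id, this.capabilities, 'canAttach', 'attach'))
  }
}
```

Deriving the code from the matrix, rather than hardcoding it per method, keeps the stub honest: flipping a flag changes the rejection without touching the method body.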

4. CDP connection pool (design captured, implementation deferred)

When CDPTransport's body lands in a future slice, the connection pool design adopted from prior art (laststance/electron-mcp-server/src/utils/cdp-pool.ts) is:

None of this ships in the current slice; the design lives here so the CDP-implementation slice has a contract to honour.

5. Eval payload validation lives in the dispatcher, not the transport

The transport's evaluate() method does NOT validate the body string against the eval blocklist (see ADR-006). The dispatcher invokes routeByOperationType(operationType, payload) BEFORE calling transport.evaluate(), and operationType: 'eval' flows through validateEvalContent which screens the keyword blocklist. Direct callers (tests, application code) that bypass the dispatcher inherit responsibility for validating untrusted payloads.

The current evaluate() implementation wraps the body in a function string using positional parameter names (async (electronApp, arg) => { ${body} } for main, async (arg) => { ${body} } for renderer). A malicious or malformed body string CAN break out of the wrapper. The robust protocol (AST inspection, structured eval messages instead of string concatenation) lands with the eval-tool ADR and the threat-model ADR. The string wrapper here is intentionally minimal.
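To make the break-out risk concrete, here is a small sketch that reproduces only the string wrapping described above. wrapEvalBody is a hypothetical name, and the hostile body is one illustrative input, not an exhaustive attack catalogue.

```typescript
type EvalTarget = 'main' | 'renderer'

// Mirrors the wrapper shapes quoted in this section:
//   main:     async (electronApp, arg) => { ${body} }
//   renderer: async (arg) => { ${body} }
function wrapEvalBody(target: EvalTarget, body: string): string {
  const params = target === 'main' ? 'electronApp, arg' : 'arg'
  return `async (${params}) => { ${body} }`
}

// A well-behaved body stays inside the function.
const benign = wrapEvalBody('renderer', 'return arg + 1')
// benign === 'async (arg) => { return arg + 1 }'

// A body that closes the wrapper early places its own statement outside it.
const hostile = wrapEvalBody('renderer', '} ; globalThis.escaped = true ; async () => {')
// hostile === 'async (arg) => { } ; globalThis.escaped = true ; async () => { }'
```

The hostile result is no longer a single function expression: the assignment in the middle sits outside both arrow functions. This is why the section assigns validation to the dispatcher and defers structured eval messages to a later ADR.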

Rationale

Why three implementations behind one interface

A single implementation locks the project to one vendor's roadmap. The Playwright deprecation signal makes this concrete: if Microsoft removes _electron, the project either rewrites every tool against CDP or dies. With three implementations behind ITransport, the dispatcher can swap implementations transparently. The cost is one interface definition + capability matrix; the benefit is multi-year survivability.

Why a capability matrix instead of dynamic feature detection

Boot-time matrix inspection is cheap (a property read) and lets the dispatcher refuse-when-unsupported at the first opportunity. Dynamic feature detection (try the call, catch the error, fall back) burns at least one round-trip per failure and surfaces transport-specific exceptions to tools. The matrix is also self-documenting: a contributor reading CDPTransport sees canLaunch: false in the constructor and immediately understands why launch() rejects.

Why dynamic await import('playwright')

playwright is declared as an OPTIONAL peer dependency. A consumer installing @electron-stagewright/core and NEVER using PlaywrightElectronTransport should not be forced to install Playwright. Static import at module-load time crashes the package import for those consumers; dynamic await import('playwright') defers the failure until the first launch() call, at which point the failure is structured (TRANSPORT_UNSUPPORTED with an install-instruction hint) instead of a raw module-not-found crash.
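A sketch of the deferred-import pattern, assuming a StagewrightError with a code and a message. loadPlaywright, the injectable Importer parameter (added here so the failure path can be exercised without uninstalling anything), and the hint wording are all hypothetical.

```typescript
type ErrorCode = 'TRANSPORT_UNSUPPORTED' | 'NOT_IMPLEMENTED'

// Assumed constructor shape; the ADR does not show the class definition.
class StagewrightError extends Error {
  readonly code: ErrorCode
  constructor(code: ErrorCode, message: string) {
    super(message)
    this.name = 'StagewrightError'
    this.code = code
  }
}

type Importer = (specifier: string) => Promise<unknown>

// Called from launch(), never at module load, so importing the package
// itself cannot fail just because playwright is absent.
async function loadPlaywright(
  importer: Importer = (specifier) => import(specifier),
): Promise<unknown> {
  try {
    return await importer('playwright')
  } catch {
    // Simplification: any import failure is reported as a missing dependency.
    throw new StagewrightError(
      'TRANSPORT_UNSUPPORTED',
      'PlaywrightElectronTransport needs the optional peer dependency. Install it with: npm install playwright',
    )
  }
}
```

The caller sees one registered code with an actionable hint, whichever module-resolution error the runtime actually raised.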

Why ship the CDP / Injector stubs now

Two reasons:

  1. The capability matrix becomes load-bearing immediately. Downstream slices that need attach (the future "attach-without-restart" Brecha A work) will read cdp.capabilities.canAttach === true and consume that as a contract. Shipping the stubs now means slices can be planned against the real capability matrix instead of pseudo-code.
  2. The seams are the security surface, not the bodies. Once routeByOperationType + assertCapability exist as the single entry points, tool implementations cannot accidentally bypass them. Shipping the bodies as stubs that throw NOT_IMPLEMENTED is more honest than not shipping the classes at all — the capability matrix lies if it claims canAttach: true and the class doesn't exist.

Why dispose() is idempotent

The dispatcher may call dispose() during normal shutdown AND during error recovery. A non-idempotent dispose() produces double-free crashes in the recovery path. The contract is documented at the interface level; the Playwright session honours it via a disposed: boolean flag; the test fake demonstrates the same shape.
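The flag-based shape can be sketched as follows. DisposableSession and closeUnderlying are hypothetical names; the real session closes a Playwright app rather than calling an injected callback.

```typescript
class DisposableSession {
  private disposed = false
  private readonly closeUnderlying: () => Promise<void>

  constructor(closeUnderlying: () => Promise<void>) {
    this.closeUnderlying = closeUnderlying
  }

  async dispose(): Promise<void> {
    // Second and later calls return immediately: no throw, no double-free.
    if (this.disposed) return
    // Set the flag before awaiting so an overlapping call also short-circuits.
    this.disposed = true
    await this.closeUnderlying()
  }
}
```

With this shape the shutdown path and the error-recovery path can both call dispose() without coordinating with each other.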

Alternatives considered

Alternative | Why rejected
Hardcode Playwright _electron everywhere | One-vendor risk. Microsoft has signalled the API is experimental; rewriting every tool against CDP after they ship is far more expensive than defining the interface up front.
Hardcode raw CDP from day 1 | ~1500 LOC of WebSocket + JSON-RPC + types just to type a single click(). Playwright wraps the same surface in well-tested helpers. Burn the cost only when forced to.
Per-tool transport choice (no abstraction) | Plumbing duplicates N times across N tools. Inconsistencies between tools surface as confusing failures for agents ("why does click work but scroll fail?").
chrome-remote-interface library inside CDPTransport | Considered; adds a dependency we may not need long-term. The CDP-implementation slice will spike chrome-remote-interface vs hand-rolling against the pool design above and decide then. Not decided in this slice.
Make the capability matrix dynamic (computed per-session) | Boot-time matrix is enough for the dispatcher's needs. A dynamic matrix would require every consumer to wait for session creation before knowing what the transport can do, which defeats the purpose of cheap upfront refusal. If a future use case requires per-session capability variance, we add it as additive metadata; the static matrix remains the baseline contract.
Replace NotImplementedError class with StagewrightError('NOT_IMPLEMENTED', …) | Adopted. The project's error infrastructure already has StagewrightError keyed on ErrorCode. Inventing a parallel class fragments the error hierarchy and confuses the mirror test.

Consequences

Amendment (2026-05-28): Interaction surface

The original contract covered observation (evaluate, screenshot, windowsList) but no real user input. Driving an Electron app requires click/type/hover/drag/scroll, so the contract is extended additively. The seam is unchanged — tools still dispatch through ITransport / TransportSession; this amendment adds methods, it does not alter the existing ones.

New capability flag

TransportCapabilities gains an eighth flag:

/**
 * The transport can perform real user input (click, type, hover, drag, …) on a
 * renderer element. A transport declaring this `false` rejects those methods.
 */
readonly supportsInteraction: boolean

Per-transport values:

Transport | supportsInteraction | Notes
PlaywrightElectronTransport | true | Fully implemented against Playwright's Page action API.
CDPTransport | false | Historical value for this amendment; the 2026-06-10 status update records the current true implementation.
InjectorTransport | false | Node Inspector has no renderer-input surface on its own.

Adding a capability flag is a backwards-incompatible change for implementers of TransportCapabilities (every transport must update its matrix in the same change), exactly as the original Consequences section anticipated. It remains additive for tools, which only read the matrix.

New session methods

TransportSession gains nine methods, all operating on the active/default window with real user input: click, fill, hover, press, selectOption, setChecked, setInputFiles, dragTo, and scroll. Three option types support them:

No-match must be observable

Eight of the nine methods delegate to Playwright actions that already reject when the selector matches nothing. scroll's into-view path runs in the renderer (the minimal page surface intentionally does not expose Playwright locators), so it explicitly reports whether the element was found and rejects with SELECTOR_NO_MATCH on a miss. A silent success here would let the tool layer report a phantom scroll it cannot diagnose — every interaction method surfaces a missing target uniformly.
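One way the observable no-match rule for scroll could be sketched, assuming the renderer body follows the evaluate(target, body, arg) convention from the original contract. scrollIntoView, RendererEvaluate, and the body string are hypothetical; the shipped implementation may differ.

```typescript
type ErrorCode = 'SELECTOR_NO_MATCH'

// Assumed constructor shape; the ADR does not show the class definition.
class StagewrightError extends Error {
  readonly code: ErrorCode
  constructor(code: ErrorCode, message: string) {
    super(message)
    this.name = 'StagewrightError'
    this.code = code
  }
}

// Stand-in for the renderer half of TransportSession.evaluate.
type RendererEvaluate = <T>(body: string, arg?: unknown) => Promise<T>

// Runs in the renderer and reports whether the element existed,
// instead of silently doing nothing on a miss.
const SCROLL_INTO_VIEW_BODY =
  'const el = document.querySelector(arg.selector); ' +
  'if (!el) return false; ' +
  'el.scrollIntoView(); ' +
  'return true'

async function scrollIntoView(evaluate: RendererEvaluate, selector: string): Promise<void> {
  const found = await evaluate<boolean>(SCROLL_INTO_VIEW_BODY, { selector })
  if (!found) {
    throw new StagewrightError('SELECTOR_NO_MATCH', `No element matches selector: ${selector}`)
  }
}
```

The selector travels as an argument rather than being spliced into the body, so the renderer-side source stays a fixed string.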

Scope of this slice

This amendment ships the contract plus the Playwright implementation only. The agent-facing interaction tools (and the ref[data-sw-ref="…"] resolution they perform before reaching the transport) land in the following slice; the CDP/Injector bodies remain deferred and continue to reject with NOT_IMPLEMENTED once their sessions exist.

Follow-up (tool layer): two additive surface refinements

The tool-layer slice that builds on this amendment added two backwards-compatible methods to the interaction surface, both implemented in PlaywrightElectronTransport and recorded by the test fake:

Both are additive — existing callers are unaffected, and supportsInteraction already gates them. The CDP/Injector stubs declare supportsInteraction: false and gain no method bodies.

The tool layer also established a shared resolver (ref/selector → one selector), a bounded per-action timeout (default 5s, clamp 30s), a raw-throw → registered-code classifier (mirroring the launch-error diagnoser, e.g. a Playwright "element is not enabled" message → ELEMENT_DISABLED), a ref-freshness guard against the stored snapshot, and similar_refs candidates sourced from a fresh live walk on a miss.
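Two of these pieces, the bounded timeout and the raw-throw classifier, are small enough to sketch. The function names are hypothetical, the fallback for missing or non-positive input is an assumption, and only the one message-to-code mapping named above is included.

```typescript
const DEFAULT_ACTION_TIMEOUT_MS = 5000
const MAX_ACTION_TIMEOUT_MS = 30000

// Default 5s, clamped to 30s. Falling back to the default for missing,
// non-finite, or non-positive input is an assumption of this sketch.
function resolveActionTimeout(requestedMs?: number): number {
  if (requestedMs === undefined || !Number.isFinite(requestedMs) || requestedMs <= 0) {
    return DEFAULT_ACTION_TIMEOUT_MS
  }
  return Math.min(requestedMs, MAX_ACTION_TIMEOUT_MS)
}

// Maps a raw driver message to a registered code. Returning undefined
// for unrecognised messages is also an assumption of this sketch.
function classifyInteractionError(message: string): string | undefined {
  if (/element is not enabled/i.test(message)) return 'ELEMENT_DISABLED'
  return undefined
}
```

For example, resolveActionTimeout(60000) yields 30000, and a message containing "element is not enabled" classifies as ELEMENT_DISABLED.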

Amendment (2026-05-30): Console-output buffer

The observation tool slice (electron_screenshot + electron_console_logs) needed the transport to surface renderer console output. Screenshots already had a method (screenshot(window, opts)); console output did not, because console messages are events — they arrive asynchronously while the app runs, and a query-time pull cannot retroactively observe a console.log that already fired. So the contract gains a small capture buffer.

New session method

interface ConsoleEntry {
  readonly type: string // 'log' | 'info' | 'warning' | 'error' | 'debug' | ...
  readonly text: string
  readonly timestamp: number // epoch ms
  readonly location?: { url?: string; line?: number; column?: number }
}

interface ConsoleLogsResult {
  readonly entries: readonly ConsoleEntry[]
  readonly overflowed: number // count of older entries the buffer dropped
}

// on TransportSession:
consoleLogs(): Promise<ConsoleLogsResult>
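A bounded buffer that would produce a ConsoleLogsResult of this shape might look like the sketch below. ConsoleBuffer and its capacity parameter are hypothetical; this section does not state the real capacity or eviction details.

```typescript
interface ConsoleEntry {
  readonly type: string
  readonly text: string
  readonly timestamp: number
  readonly location?: { url?: string; line?: number; column?: number }
}

interface ConsoleLogsResult {
  readonly entries: readonly ConsoleEntry[]
  readonly overflowed: number
}

class ConsoleBuffer {
  private readonly entries: ConsoleEntry[] = []
  private overflowed = 0
  private readonly capacity: number

  constructor(capacity: number) {
    this.capacity = capacity
  }

  // Called from the console event listener as messages arrive.
  push(entry: ConsoleEntry): void {
    this.entries.push(entry)
    if (this.entries.length > this.capacity) {
      this.entries.shift() // drop the oldest entry
      this.overflowed += 1 // and remember that something was lost
    }
  }

  // What consoleLogs() could resolve with: a copy, so callers cannot
  // mutate the live buffer.
  snapshot(): ConsoleLogsResult {
    return { entries: this.entries.slice(), overflowed: this.overflowed }
  }
}
```

The overflowed counter lets a tool tell an agent that earlier output existed but was dropped, which an empty or truncated list alone cannot convey.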

Capture model

Scope and deferrals

dialog_handler (the other event-driven surface originally bundled with this slice) is deferred to its own follow-up so this console-buffer amendment ships isolated from a page.on('dialog') amendment.

Amendment (2026-05-31): Dialog handling

The deferred follow-up to the console-buffer amendment. Native JS dialogs (alert / confirm / prompt / beforeunload) block the renderer until something answers, so — like console output — they cannot be observed by a query-time pull, and unlike console they require an active response. The contract gains a small forward-looking auto-responder plus a capture buffer.

New types and session methods

type DialogAction = 'accept' | 'dismiss'
type DialogType = 'alert' | 'confirm' | 'prompt' | 'beforeunload'

interface DialogPolicy {
  readonly action: DialogAction // default for any unmatched dialog
  readonly promptText?: string // submitted to prompt() when its effective action is accept
  readonly perType?: Partial<Record<DialogType, DialogAction>> // per-kind overrides; falls back to action
  readonly oneShot?: boolean // resolve exactly one dialog, then revert to the dismiss default
}

interface DialogEvent {
  readonly type: string
  readonly message: string
  readonly action: DialogAction // how the responder resolved it
  readonly defaultValue?: string // prompt()'s default, when non-empty
  readonly promptText?: string // text submitted to a prompt() accept
  readonly timestamp: number // epoch ms
}

interface DialogEventsResult {
  readonly entries: readonly DialogEvent[]
  readonly overflowed: number
  readonly policy: DialogPolicy // the policy currently in effect
}

// on TransportSession:
setDialogPolicy(policy: DialogPolicy): Promise<void>
dialogEvents(opts?: { clear?: boolean }): Promise<DialogEventsResult>
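The field comments above imply a small resolution rule, sketched here under assumptions: resolveDialog, DialogResolution, and DISMISS_DEFAULT are hypothetical names, and the reverted policy is taken to be a bare { action: 'dismiss' }.

```typescript
type DialogAction = 'accept' | 'dismiss'
type DialogType = 'alert' | 'confirm' | 'prompt' | 'beforeunload'

interface DialogPolicy {
  readonly action: DialogAction
  readonly promptText?: string
  readonly perType?: Partial<Record<DialogType, DialogAction>>
  readonly oneShot?: boolean
}

// Assumed shape of "the dismiss default" that a oneShot policy reverts to.
const DISMISS_DEFAULT: DialogPolicy = { action: 'dismiss' }

interface DialogResolution {
  readonly action: DialogAction
  readonly promptText?: string
  readonly nextPolicy: DialogPolicy
}

function resolveDialog(policy: DialogPolicy, type: DialogType): DialogResolution {
  // Per-kind override wins; otherwise fall back to the policy-wide action.
  const override = policy.perType ? policy.perType[type] : undefined
  const action = override !== undefined ? override : policy.action
  // promptText only applies to a prompt() that is being accepted.
  const promptText = type === 'prompt' && action === 'accept' ? policy.promptText : undefined
  // A oneShot policy is consumed by the first dialog it resolves.
  const nextPolicy = policy.oneShot ? DISMISS_DEFAULT : policy
  return promptText === undefined ? { action, nextPolicy } : { action, promptText, nextPolicy }
}
```

Returning the next policy instead of mutating state keeps the rule testable on its own; a session would store nextPolicy after each dialog.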

Capture and response model

Scope and deferrals

References

Status Update — 2026-06-10

The CDP transport's connection pool and the Injector transport's Node-inspector attach/inject paths are now implemented. Earlier sections remain the historical record of the contract-first slice that shipped the stubs; the current implementation status is below.

Status Update — 2026-06-16: Network capture seam (canIntercept's first consumer)

The canIntercept capability, dormant since this ADR reserved it, gains its first consumer (see ADR-016). TransportSession is extended with an ARMED network-capture seam — startNetworkCapture, networkEvents, stopNetworkCapture — alongside the always-on console/dialog buffers.

So the plugin's capability gate refuses both CDP and injector sessions with network.UNSUPPORTED (naming the Playwright transport), while NOT_IMPLEMENTED remains the contract-level signal for a direct caller that ignores the capability. Capture rides this seam rather than the eval seam (ADR-010's approach) because protocol-level network is invisible to evaluate, and so it is NOT --allow-eval gated.

Status Update — 2026-06-18: canIntercept's second consumer (CDP network seam)

The seam reserved above is now wired on the CDP transport too, so canIntercept flips from false to true on CDP and the capability is honest on both attach-mode and launch-mode (see ADR-016).

The canIntercept capability now has TWO honest implementers; it is the per-transport gate the network plugin reads, and a transport advertises it only once the whole seam is wired.

Status Update — 2026-06-19: canControlClock's first consumer (clock seam)

The canControlClock capability, reserved in this ADR's matrix and previously declared without a consumer (the CDP transport even advertised it aspirationally true), gains its first consumer: a clock seam on TransportSession (installClock / setFixedTime / setSystemTime / advanceClock / runClockFor / pauseClockAt / resumeClock) driven by the clock plugin (see ADR-017).

This is the same lesson canIntercept taught, applied proactively: a capability is advertised true only where the whole seam is honestly wired, and an aspirational true (CDP's) is corrected to false the moment the capability gains a real consumer.

Status Update — 2026-06-19: canAccessStorage's first consumers (storage seam)

A canAccessStorage capability is added to TransportCapabilities and immediately consumed by a storage seam on TransportSession (getCookies / setCookie / clearCookies / storageSnapshot, plus the types StorageCookie / CookieFilter / StorageOrigin / StorageSnapshot) driven by the storage plugin (see ADR-018). Unlike canControlClock, this capability has TWO honest implementers from the start:

This is the inverse of the clock lesson and the reason the matrix carries canIntercept-style nuance: where the clock seam could only be honestly wired on Playwright, the storage seam is honestly wired on both observe-capable transports, so both advertise true — and the one partial (CDP localStorage) is documented rather than hidden behind a rejecting method.

Status Update — 2026-06-19: canAccessNativeUI's first consumer (native-UI seam)

A canAccessNativeUI capability is added to TransportCapabilities and consumed by a native-UI read seam on TransportSession (getApplicationMenu(), plus the types NativeMenu / NativeMenuItem) driven by the native-UI plugin (see ADR-019). Like the clock seam, it is Playwright-only:

The lesson is the same the matrix keeps teaching: a capability is advertised true only where the whole seam is honestly wired. The application menu lives in the main-process Node context, which only the Playwright electronApp.evaluate path reaches, so only it advertises true.

Status Update — 2026-06-19: LaunchOptions.instrumentNative

LaunchOptions gains an optional instrumentNative flag (default off). When set on the Playwright launch transport, the transport wraps the app's main entry with fixed hooks installed before the app runs, so native state created at startup (the system Tray, and notifications shown in app.whenReady()) is observable — see ADR-020 for the mechanism and threat reasoning. It requires a main/appPath entry (executablePath-only launches cannot be wrapped). It is a launch-transport-only opt-in; CDP/injector cannot wrap a running app's entry, so the consuming seam (getTrays and invokeTrayEvent) rejects NOT_IMPLEMENTED there. The notification-capture seam adopts the launch-installed t=0 hook when present, so on an instrumented session it also returns startup notifications (tagged beforeArm).

The native-UI seam also gains invokeTrayEvent(id, event) (the tray analog of invokeApplicationMenuItem, returning a TrayInvokeResult): it acts on the same launch-time tray registry, so like getTrays it resolves null on a session launched without instrumentNative and is Playwright-only (CDP/injector reject NOT_IMPLEMENTED). See ADR-019's tray-invocation Status Update.

Status Update — 2026-06-22: supportsRendererEval gains a plugin consumer (per-key storage)

supportsRendererEval previously had a single consumer — the core electron_eval_renderer tool. The storage plugin's new per-key localStorage / sessionStorage tools (ADR-018 Status Update) become its first plugin consumer: they reach TransportSession.evaluate('renderer', …) directly with a fixed source string, so they require supportsRendererEval (and the operator's --allow-eval=renderer grant). No new seam method or capability is added — this consumes the existing evaluate seam and supportsRendererEval flag. Both observe-capable transports already advertise it: the Playwright transport via page.evaluate, the CDP transport via Runtime.evaluate against a page target; the injector keeps supportsRendererEval: false, so the per-key tools return storage.UNSUPPORTED there — the same two-implementers-plus-injector shape the storage seam itself has.
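A sketch of what "a fixed source string" buys, assuming the key travels as the evaluate argument rather than being interpolated into the body. getStorageItem, GET_ITEM_BODY, RendererEvalSession, and the result shape are hypothetical, and the --allow-eval=renderer grant check is left out.

```typescript
type StorageArea = 'localStorage' | 'sessionStorage'

// The one TransportSession member this sketch needs.
interface RendererEvalSession {
  evaluate<T>(target: 'main' | 'renderer', body: string, arg?: unknown): Promise<T>
}

type GetItemResult =
  | { readonly ok: true; readonly value: string | null }
  | { readonly ok: false; readonly code: 'storage.UNSUPPORTED' }

// Fixed at build time: no caller input is ever concatenated into it.
const GET_ITEM_BODY = 'return window[arg.area].getItem(arg.key)'

async function getStorageItem(
  session: RendererEvalSession,
  supportsRendererEval: boolean,
  area: StorageArea,
  key: string,
): Promise<GetItemResult> {
  // Capability gate first: the injector declares supportsRendererEval: false.
  if (!supportsRendererEval) return { ok: false, code: 'storage.UNSUPPORTED' }
  const value = await session.evaluate<string | null>('renderer', GET_ITEM_BODY, { area, key })
  return { ok: true, value }
}
```

Since the body never varies, a hostile key cannot alter the code that runs in the renderer; it can only change which entry is looked up.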