Electron Stagewright docs

ADR-019: Native UI plugin via a transport native-UI seam

Status: Accepted (application-menu read + invoke, notification capture incl. startup/t=0, and tray read + event invocation on the Playwright transport)

Context

An agent driving an Electron app can read and click the web content, but it is blind to the app's native chrome — the application menu (the macOS menu bar: File / Edit / View / Window…), where a large share of desktop-app behaviour lives. There is no way today to ask "is the Save item enabled?", "did Dark Mode get checked under View?", or "does the Edit menu have a Paste item?" — that state lives in the Electron main process (Menu.getApplicationMenu()), outside the DOM snapshot the agent already reads.

The transport capability matrix has had no native-UI surface. This is the native-UI analog of the network (ADR-016), clock (ADR-017), and storage (ADR-018) plugins: the same "a transport seam + a capability gate + a plugin that drives it" shape. The application menu is the first surface because Electron exposes it globally (Menu.getApplicationMenu() returns the live menu), unlike trays and notifications, which have no registry and would need constructor-hook instrumentation.

Decision

1. A dedicated native-UI read seam on the transport, gated by canAccessNativeUI

TransportSession gains a native-UI read seam — getApplicationMenu(): Promise<NativeMenu | null> — plus the types NativeMenu and NativeMenuItem, gated by a new TransportCapabilities.canAccessNativeUI.

The Playwright transport (the default launch transport) implements it: electronApp.evaluate(...) runs a fixed, self-contained serializer in the Electron main process over Menu.getApplicationMenu(), walking the menu tree and returning only the data fields (label, role, type, accelerator, enabled, visible, checked, nested submenu). With this implementation, the Playwright transport's canAccessNativeUI flips from false to true. role is surfaced so that role-based items (e.g. quit, paste, reload), which carry no explicit label until rendered, stay findable.
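The serializer's tree walk can be sketched as a pure function. This is a minimal sketch, not the transport's actual code: the NativeMenuItem and NativeMenu shapes are assumptions inferred from the field list above, LiveMenuItem is a hypothetical structural stand-in for Electron's MenuItem so the example runs without Electron, and the function names are illustrative.

```typescript
// Assumed shape: the ADR names NativeMenuItem and lists its fields,
// but does not give the exact type definition.
interface NativeMenuItem {
  label: string;
  role?: string;
  type: string;
  accelerator?: string;
  enabled: boolean;
  visible: boolean;
  checked: boolean;
  submenu?: NativeMenuItem[];
}

// Assumed wrapper; the real NativeMenu type may differ.
interface NativeMenu {
  items: NativeMenuItem[];
}

// Hypothetical stand-in for a live Electron MenuItem: same data fields,
// plus a non-serializable click handler that must not leak into the result.
interface LiveMenuItem {
  label: string;
  role?: string;
  type: string;
  accelerator?: string;
  enabled: boolean;
  visible: boolean;
  checked: boolean;
  submenu?: { items: LiveMenuItem[] };
  click?: () => void;
}

// Copy only the data fields, recursing into nested submenus.
function serializeMenuItem(item: LiveMenuItem): NativeMenuItem {
  const out: NativeMenuItem = {
    label: item.label,
    type: item.type,
    enabled: item.enabled,
    visible: item.visible,
    checked: item.checked,
  };
  if (item.role !== undefined) out.role = item.role;
  if (item.accelerator !== undefined) out.accelerator = item.accelerator;
  if (item.submenu !== undefined) {
    out.submenu = item.submenu.items.map((child) => serializeMenuItem(child));
  }
  return out;
}

// Mirrors the seam's NativeMenu | null result: null when no menu is set.
function serializeMenu(menu: { items: LiveMenuItem[] } | null): NativeMenu | null {
  if (menu === null) return null;
  return { items: menu.items.map((item) => serializeMenuItem(item)) };
}
```

Because the walk copies a fixed list of fields rather than spreading the live object, handlers and other main-process references never cross the evaluate boundary.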

@electron-stagewright/plugin-native-ui drives that seam: native_menu (the full menu tree) and native_menu_item { path } (resolve one item by a label/role path, e.g. ["View","Dark Mode"]). The plugin keeps the orchestration (the gate, the path walk, error envelopes) in TypeScript; the transport owns the main-process read.
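The path walk behind native_menu_item can be sketched as follows. This is a minimal sketch under stated assumptions: MenuNode and resolveMenuPath are hypothetical names, each path segment is matched exactly against either the label or the role, and an empty path resolves to nothing. The plugin's real matching rules (for example case sensitivity) are not specified here.

```typescript
// Hypothetical, trimmed-down item shape: only the fields the walk needs.
interface MenuNode {
  label: string;
  role?: string;
  submenu?: MenuNode[];
}

// Resolve one item by a label/role path such as ["View", "Dark Mode"].
// Returns null when any segment fails to match (or the path is empty).
function resolveMenuPath(items: MenuNode[], path: string[]): MenuNode | null {
  let level: MenuNode[] = items;
  let match: MenuNode | null = null;
  for (const segment of path) {
    const next = level.find(
      (item) => item.label === segment || item.role === segment,
    );
    if (next === undefined) return null;
    match = next;
    // Descend; an item without a submenu has no children to match against.
    level = next.submenu ?? [];
  }
  return match;
}
```

Matching on role as well as label is what keeps role-based items findable when their label is empty in the serialized tree.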

The read is a seam rather than an eval call because it runs a fixed serializer that the transport owns, not agent-supplied JavaScript. It is a bounded capability like the storage cookie read, so it should not inherit the eval threat model or the --allow-eval opt-in.

2. Gated by canAccessNativeUI, not by --allow-eval: a read-only, non-secret surface

Rationale

Alternatives considered

Consequences

References

Status Update — 2026-06-19: menu invocation (read → read+act)

The native-UI seam, read-only at acceptance, gains a second method, TransportSession.invokeApplicationMenuItem(path), driven by a new native_menu_invoke tool. An agent can now not only read the application menu but also trigger a menu action ("click File → Save") without simulating a keyboard accelerator. The decision keeps the seam's shape.

Menu invocation flips the native-UI surface from read-only to read+act. Tray read/invoke and startup notification capture follow in the Status Updates below.
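One way menu invocation could work is to resolve the item with a label/role walk over the live menu and then call its click handler. The sketch below assumes exactly that; the ADR does not describe the mechanism. ClickableItem, InvokeResult, invokeByPath, and the reason strings are all hypothetical, and refusing disabled items is an illustrative policy rather than documented behaviour.

```typescript
// Hypothetical stand-in for a live menu item with a click handler.
interface ClickableItem {
  label: string;
  role?: string;
  enabled: boolean;
  submenu?: { items: ClickableItem[] };
  click?: () => void;
}

// Hypothetical result shape; the plugin's real error envelope is not shown here.
type InvokeResult =
  | { ok: true }
  | { ok: false; reason: "not-found" | "disabled" | "no-handler" };

function invokeByPath(root: { items: ClickableItem[] }, path: string[]): InvokeResult {
  let level: ClickableItem[] = root.items;
  let target: ClickableItem | undefined;
  for (const segment of path) {
    target = level.find(
      (item) => item.label === segment || item.role === segment,
    );
    if (target === undefined) return { ok: false, reason: "not-found" };
    level = target.submenu !== undefined ? target.submenu.items : [];
  }
  if (target === undefined) return { ok: false, reason: "not-found" }; // empty path
  // Assumption: a disabled item is refused, as a real user could not click it.
  if (!target.enabled) return { ok: false, reason: "disabled" };
  if (typeof target.click !== "function") return { ok: false, reason: "no-handler" };
  target.click();
  return { ok: true };
}
```

Calling the handler directly exercises the same code path as a real click on the item, which is why no keyboard accelerator needs to be simulated.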

Status Update — 2026-06-19: notification capture (a native-event capture model)

The native-UI plugin gains a capture mechanism alongside the menu read/invoke seam: TransportSession gains startNotificationCapture(filter?) / capturedNotifications() / stopNotificationCapture() (plus the types NativeNotification / NotificationCaptureFilter), driven by three native_notifications_* tools, so an agent can assert "the app showed a Saved notification". Unlike the menu's one-shot read, this is an arm → read → stop capture model (like the network and IPC plugins): native.ALREADY_CAPTURING / native.NOT_CAPTURING gate the lifecycle.
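The arm → read → stop lifecycle can be sketched as a small state machine. This is a minimal sketch, assuming that arming twice raises native.ALREADY_CAPTURING and that reading or stopping while unarmed raises native.NOT_CAPTURING; the ADR names the two codes but not which call raises which. The NotificationCapture class, the record method, the notification fields, and the titleIncludes filter field are hypothetical.

```typescript
// Assumed shapes: the ADR names these types but does not list their fields.
interface NativeNotification {
  title: string;
  body: string;
}
interface NotificationCaptureFilter {
  titleIncludes?: string; // hypothetical filter field
}

// Build an Error carrying a machine-readable code.
function captureError(code: string): Error & { code: string } {
  return Object.assign(new Error(code), { code });
}

class NotificationCapture {
  // null means "not armed"; an array means "capturing".
  private buffer: NativeNotification[] | null = null;
  private filter: NotificationCaptureFilter = {};

  start(filter: NotificationCaptureFilter = {}): void {
    if (this.buffer !== null) throw captureError("native.ALREADY_CAPTURING");
    this.buffer = [];
    this.filter = filter;
  }

  // Called by the (not shown) hook each time the app shows a notification.
  record(notification: NativeNotification): void {
    if (this.buffer === null) return; // not armed: nothing is kept
    const needle = this.filter.titleIncludes;
    if (needle !== undefined && !notification.title.includes(needle)) return;
    this.buffer.push(notification);
  }

  captured(): NativeNotification[] {
    if (this.buffer === null) throw captureError("native.NOT_CAPTURING");
    return this.buffer.slice(); // copy, so callers cannot mutate the buffer
  }

  stop(): void {
    if (this.buffer === null) throw captureError("native.NOT_CAPTURING");
    this.buffer = null;
  }
}
```

In this sketch a notification shown before start() is silently dropped, which is the arm-then-observe limitation that the later startup-capture update addresses.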

Tray read/invoke and startup notification capture are covered by the Status Updates below.

Status Update — 2026-06-19: system-tray read (on launch-time instrumentation)

The native-UI plugin gains a tray read, built on the launch-time instrumentation foundation (ADR-020): TransportSession.getTrays() + a native_trays tool return the app's system-tray icons — each with its tooltip, title, whether it has an icon image (hasImage, never pixels), and its context menu (serialised with the same field set as the application menu). A stable per-tray id is included so a future context-menu invocation has a handle.

Status Update — 2026-06-20: tray event invocation (tray read → read+act)

The tray surface gains its act half, mirroring how menu read grew a menu invoke. The seam gains TransportSession.invokeTrayEvent(id, event) and a native_tray_invoke { id, event } tool (plugin 0.4.0 → 0.5.0). The agent names a tray by the id from native_trays and fires a click, right-click, or double-click event (or a platform mouse-* or balloon-click event) so that the app's own tray.on(event, …) handler runs. This is the deterministic way to drive tray behaviour without a real mouse. The invoke rides the same launch-time instrumentation registry (ADR-020), which already holds the live Tray instance next to its serialised record, so it is a pure addition: a fixed, self-contained electronApp.evaluate body finds the tray by id, synthesizes the (event, bounds, position) arguments a real tray click carries (using the tray's own getBounds() when available), and emits the event on the live instance.
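The find, synthesize, and emit steps can be sketched as follows. This is a minimal sketch under stated assumptions: LiveTray is a hypothetical stand-in exposing only emit and an optional getBounds, the registry is modelled as a plain Map keyed by tray id, and the synthetic event object, the zero-rectangle fallback, and the centre-of-bounds position are illustrative choices rather than the documented behaviour.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }
interface Point { x: number; y: number; }

// Hypothetical stand-in for a live Tray: an emitter with optional bounds.
interface LiveTray {
  emit(event: string, ...args: unknown[]): boolean;
  getBounds?(): Rect;
}

// Returns false when no tray with that id is registered.
function invokeTrayEvent(
  registry: Map<string, LiveTray>,
  id: string,
  event: string,
): boolean {
  const tray = registry.get(id);
  if (tray === undefined) return false;

  // Use the tray's own bounds when available, else an empty rectangle.
  const bounds: Rect = tray.getBounds
    ? tray.getBounds()
    : { x: 0, y: 0, width: 0, height: 0 };

  // Assumption: report the click at the centre of the tray icon.
  const position: Point = {
    x: bounds.x + Math.floor(bounds.width / 2),
    y: bounds.y + Math.floor(bounds.height / 2),
  };

  // Assumption: a click with no modifier keys held.
  const syntheticEvent = { altKey: false, shiftKey: false, ctrlKey: false, metaKey: false };

  tray.emit(event, syntheticEvent, bounds, position);
  return true;
}
```

Emitting on the live instance means the app's own listeners run with arguments shaped like a real click, so handlers that read bounds or position do not need a test-only code path.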

Status Update — 2026-06-20: notification capture at t=0 (startup notifications)

Notification capture was arm-then-observe: the hook patched Notification.prototype.show only AFTER launch, so a notification the app fires at startup (app.whenReady()) was gone before the agent could arm. This retrofit catches it, on the same launch-time instrumentation as the tray hook (ADR-020). When a session is launched with instrumentNative, the shim installs a second fixed hook (NOTIFICATION_HOOK_BODY) BEFORE the app's main runs, so every shown notification is buffered from t=0. The plugin API is unchanged — the agent still arms, drives, and reads — but on an instrumented session the read now includes the startup notifications too.
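The effect of installing the hook before the app's main runs can be sketched with a stand-in class. This is a minimal sketch, not NOTIFICATION_HOOK_BODY itself: StandInNotification is a hypothetical substitute for Electron's Notification so the example runs without Electron, and installShowHook and the buffered record shape are illustrative names.

```typescript
interface BufferedNotification {
  title: string;
  body: string;
}

// Hypothetical stand-in for Electron's Notification class.
class StandInNotification {
  title: string;
  body: string;
  constructor(title: string, body: string) {
    this.title = title;
    this.body = body;
  }
  show(): void {
    // The real class would display a native notification here.
  }
}

// Wrap prototype.show so every shown notification is buffered first.
// Returns a function that restores the original method.
function installShowHook(buffer: BufferedNotification[]): () => void {
  const proto = StandInNotification.prototype;
  const original = proto.show;
  proto.show = function (this: StandInNotification): void {
    buffer.push({ title: this.title, body: this.body });
    original.call(this);
  };
  return () => {
    proto.show = original;
  };
}

// Launch order on an instrumented session, in miniature:
const startupBuffer: BufferedNotification[] = [];
const uninstallHook = installShowHook(startupBuffer); // 1. hook first
new StandInNotification("Welcome", "App started").show(); // 2. app's startup code
// 3. An agent that arms only now can still read "Welcome" from startupBuffer.
```

Patching the prototype rather than individual instances is what lets a hook installed at t=0 see notifications created by code that has not run yet.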