Electron Stagewright docs

ADR-011: Operation-timeout backstop at the dispatch boundary

Status: Accepted

Context

The server drives a real app over a transport. Most per-tool operations are already bounded: interaction actions clamp a Playwright timeoutMs (≤ 30s), the wait family self-bounds its poll in the renderer (≤ 60s), and eval surfaces a transport timeout as EVAL_TIMEOUT. One class remains unbounded: a transport call that never settles. A frozen renderer makes page.evaluate (the basis of snapshot / find / read / expect) wait indefinitely, because Playwright's evaluate has no implicit timeout. A tool whose handler awaits such a call hangs the dispatch forever, and the agent is stranded with no envelope and no path to recovery.

The resilience/chaos review surfaced this as a real gap (a "hung app" has no bound) and explicitly deferred it pending a policy decision rather than faking a fix.

Decision

Add a dispatch-level backstop timeout: the dispatcher races each handler against a configurable budget. If the handler does not settle within the budget, the dispatch resolves with a registered, retryable OPERATION_TIMEOUT envelope (details.timeout_ms carries the budget) instead of hanging.
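
The race described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual dispatcher: the function name dispatchWithBackstop, the Envelope shape, and the ok / retryable field names are hypothetical. Only the OPERATION_TIMEOUT code and details.timeout_ms are taken from this ADR.

```typescript
// Hypothetical envelope shape. Only the OPERATION_TIMEOUT code and
// details.timeout_ms come from the ADR; the other fields are assumptions.
type Envelope =
  | { ok: true; result: unknown }
  | {
      ok: false;
      error: {
        code: "OPERATION_TIMEOUT";
        retryable: true;
        details: { timeout_ms: number };
      };
    };

// Race a handler against a budget. If the handler has not settled when the
// budget elapses, resolve with a timeout envelope instead of hanging.
async function dispatchWithBackstop(
  handler: () => Promise<unknown>,
  budgetMs: number,
): Promise<Envelope> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<Envelope>((resolve) => {
    timer = setTimeout(() => {
      resolve({
        ok: false,
        error: {
          code: "OPERATION_TIMEOUT",
          retryable: true,
          details: { timeout_ms: budgetMs },
        },
      });
    }, budgetMs);
  });

  // Wrap a successful handler result; a handler rejection propagates as-is.
  const work = handler().then((result): Envelope => ({ ok: true, result }));

  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so a fast handler leaves no pending timeout.
    clearTimeout(timer);
  }
}
```

Under these assumptions, a handler that never settles, such as `() => new Promise(() => {})`, would resolve with the OPERATION_TIMEOUT envelope once budgetMs has elapsed. Note that Promise.race does not cancel the losing promise: the hung handler's underlying call stays pending, so the backstop bounds the dispatch rather than the operation itself.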

Rationale

Alternatives considered

Consequences

References