Architecture

docsxai is a small family of packages around one deterministic core. The engine owns the flow-file parser, the Playwright-backed runtime, the calibration aids (lint, diagnose, flow-tree, style), the plugin runtime, and the target-site auth strategies. Everything else orchestrates around it: the Claude Code plugin and the standalone MCP server are invocation surfaces over the engine, the backend persists doc packs, the viewer renders them. None of the satellites adds browser primitives of its own.

Calibration is AI-assisted and rare. A host agent - through the Claude Code plugin or the MCP server - drives discovery against the live app (that part is browxai’s surface), picks one canonical locator per step, and commits the result as a flow-file. The engine helps at write-time, not run-time: lint catches authoring mistakes statically, diagnose packages halt context into typed recommendations, flow-tree visualises the extends graph, and the actionable() probe says whether a selector is clickable before the step is ever written down.

Execution is deterministic and continuous. docsxai run replays the flow through headless Chromium with no agent and no MCP in the loop. The environment block (frozen clock, pinned locale, timezone, viewport, color scheme) makes the same flow against the same target state produce byte-identical screenshots; a keystone test enforces that against real Chromium on every change to the runtime.
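As a rough illustration of how such an environment block could be pinned down, the sketch below maps a hypothetical `EnvironmentBlock` onto browser settings. The block's field names and the `resolveEnvironment` helper are assumptions, not the engine's actual schema. The target keys `locale`, `timezoneId`, `viewport`, and `colorScheme` are real Playwright `browser.newContext()` options. The frozen clock is returned separately because Playwright applies it through the page clock API, not through context options.

```typescript
// Hypothetical shape of a flow's environment block; field names are assumptions.
interface EnvironmentBlock {
  clock: string; // ISO-8601 instant at which the page clock is frozen
  locale: string;
  timezone: string;
  viewport: { width: number; height: number };
  colorScheme: "light" | "dark";
}

// The subset of Playwright's browser.newContext() options used in this sketch.
interface ContextOptions {
  locale: string;
  timezoneId: string;
  viewport: { width: number; height: number };
  colorScheme: "light" | "dark";
}

interface ResolvedEnvironment {
  contextOptions: ContextOptions;
  fixedTime: Date;
}

// Turn the declarative block into concrete settings, rejecting an unparseable
// clock up front so a bad value cannot silently fall back to wall-clock time.
function resolveEnvironment(env: EnvironmentBlock): ResolvedEnvironment {
  const fixedTime = new Date(env.clock);
  if (Number.isNaN(fixedTime.getTime())) {
    throw new Error(`invalid clock value: ${env.clock}`);
  }
  return {
    contextOptions: {
      locale: env.locale,
      timezoneId: env.timezone,
      viewport: { ...env.viewport },
      colorScheme: env.colorScheme,
    },
    fixedTime,
  };
}

const resolved = resolveEnvironment({
  clock: "2024-01-15T09:00:00Z",
  locale: "en-US",
  timezone: "UTC",
  viewport: { width: 1280, height: 720 },
  colorScheme: "light",
});
```

With Playwright, the result would typically be consumed as `browser.newContext(resolved.contextOptions)` followed by `page.clock.setFixedTime(resolved.fixedTime)`. Every input that can change pixels is stated explicitly, so nothing is inherited from the machine running the flow.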

The engine has no model-provider SDK anywhere in its dependency tree, and the project treats adding one as a contract violation. Calibration-time inference is supplied by whatever host agent you already run; execution-time inference does not exist. Two consequences worth internalising:

  • Halts are a feature. When a locator or success check fails, the run halts with a [cause: ...] prefix instead of asking a model to guess a fallback. Drift is a signal to recalibrate, not to absorb silently.
  • The cost story stays honest. A doc refresh costs one headless browser session, so running it per commit is no more exotic than running your Playwright suite.
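The halt-instead-of-guess behaviour can be sketched as follows. This is a minimal illustration, not the engine's implementation: the cause names, the `Halt` shape, and `runFlow` are all hypothetical, and only the `[cause: ...]` prefix comes from the description above.

```typescript
// Hypothetical cause vocabulary; the real engine's causes may differ.
type HaltCause = "locator-not-found" | "success-check-failed";

interface Halt {
  cause: HaltCause;
  step: string;
  detail: string;
}

type StepOutcome = { status: "ok" } | { status: "halted"; halt: Halt };

interface FlowStep {
  id: string;
  run: () => StepOutcome;
}

// Render a halt with the [cause: ...] prefix so the failure class is
// machine-readable from the first characters of the message.
function formatHalt(halt: Halt): string {
  return `[cause: ${halt.cause}] step "${halt.step}": ${halt.detail}`;
}

// Replay steps in order and stop at the first halt. There is deliberately
// no fallback branch that retries with a guessed selector.
function runFlow(steps: FlowStep[]): StepOutcome {
  for (const step of steps) {
    const outcome = step.run();
    if (outcome.status === "halted") {
      return outcome;
    }
  }
  return { status: "ok" };
}
```

Because a halt is returned as data, not swallowed, a caller can print `formatHalt(outcome.halt)` and hand the same structure to a diagnose step.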

The runtime is written against a thin BrowserDriver interface, not against Playwright directly: goto, click, fill, the wait primitives, the success-check reads, screenshot, boundingBox, and the write-time actionable(selector) probe. The one Playwright integration point (PlaywrightDriver) stays small and is the engine’s single Playwright import site. This seam is what lets browxai slot in as the model-agnostic discovery driver during calibration while execution keeps its own raw Playwright sessions - and it keeps the runtime testable without a browser at all.
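To make the seam concrete, here is a hedged sketch of what a thin driver interface and a browser-free test double might look like. The method names `goto`, `click`, `fill`, `screenshot`, `boundingBox`, and `actionable` come from the description above. Their signatures, the `Step` type, `FakeDriver`, and `replay` are assumptions, and the wait primitives and success-check reads are omitted because their names are not given.

```typescript
interface Box { x: number; y: number; width: number; height: number }

// Assumed signatures; only the method names are taken from the text above.
interface BrowserDriver {
  goto(url: string): Promise<void>;
  click(selector: string): Promise<void>;
  fill(selector: string, value: string): Promise<void>;
  screenshot(): Promise<Uint8Array>;
  boundingBox(selector: string): Promise<Box | null>;
  actionable(selector: string): Promise<boolean>;
}

// In-memory double: records calls and treats a fixed set of selectors as present.
class FakeDriver implements BrowserDriver {
  readonly calls: string[] = [];
  constructor(private readonly known: Set<string>) {}

  private require(selector: string): void {
    if (!this.known.has(selector)) {
      throw new Error(`no element matches ${selector}`);
    }
  }
  async goto(url: string): Promise<void> {
    this.calls.push(`goto ${url}`);
  }
  async click(selector: string): Promise<void> {
    this.require(selector);
    this.calls.push(`click ${selector}`);
  }
  async fill(selector: string, value: string): Promise<void> {
    this.require(selector);
    this.calls.push(`fill ${selector}=${value}`);
  }
  async screenshot(): Promise<Uint8Array> {
    return new Uint8Array(0); // no pixels in the fake
  }
  async boundingBox(selector: string): Promise<Box | null> {
    return this.known.has(selector) ? { x: 0, y: 0, width: 10, height: 10 } : null;
  }
  async actionable(selector: string): Promise<boolean> {
    return this.known.has(selector);
  }
}

// Hypothetical step shape, just enough to drive the interface.
type Step =
  | { action: "goto"; url: string }
  | { action: "click"; selector: string }
  | { action: "fill"; selector: string; value: string };

// The replay loop depends only on BrowserDriver, never on Playwright.
async function replay(driver: BrowserDriver, steps: Step[]): Promise<void> {
  for (const step of steps) {
    switch (step.action) {
      case "goto": await driver.goto(step.url); break;
      case "click": await driver.click(step.selector); break;
      case "fill": await driver.fill(step.selector, step.value); break;
    }
  }
}
```

In this arrangement a Playwright-backed class would implement the same interface by delegating to a `Page`, while unit tests pass a `FakeDriver` and assert on `calls`.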

Package            Role
engine             Parser, runtime, CLI, auth strategies, plugin runtime, exporters.
plugin             Claude Code plugin: calibrate + diagnose skills, deterministic commands.
mcp                Stdio MCP server for any host: orchestration + doc-pack introspection.
backend            Doc-pack persistence: revisions, blobs, OAuth 2.1, GitHub webhook.
viewer             Interactive viewer, browser-free burn renderer, Starlight emitter.
plugin-confluence  Publisher plugin: idempotent Confluence Cloud push.
plugin-starlight   Renderer plugin: production Starlight docs site.

Every dependency in that table points inward: surfaces wrap the engine, the engine wraps BrowserDriver, and nothing on the execution path knows an agent exists.

Made by Kalebtec · GitHub · Apache-2.0 licensed