OpenClaw Architecture Proposal source snapshot 8b0eac79 / 2026-06-15
Implementation strategy proposal

Gateway-Independent OpenClaw Core

OpenClaw should keep conversations, crons, heartbeats, tasks, and local control working when every channel is disconnected or the channel Gateway is restarting. The public openclaw gateway experience stays intact; its internal ownership model changes.

0 config edits: required for normal upgrades
6 phases: logical separation before process split
2 plugin lanes: legacy co-located + connector v2
1 core owner: Host is the single writer and executor

Executive decision

Does this roadmap make sense for the product?

Yes, when the goal is expressed as product resilience and local autonomy rather than process purity. The right product promise is simple: channel connectivity is optional infrastructure, not the condition under which OpenClaw can think, schedule, or manage its own state.

Proceed with a phased compatibility-first extraction.

Keep openclaw gateway, gateway.*, the current port, authentication model, Control UI URL, config shape, and existing plugins working. Internally, make a new Host the owner of core lifecycle and state. Move the channel edge behind explicit ingress and delivery contracts. Split processes only after the ownership boundary is proven in-process.

Product case

Why detaching the Gateway matters

The current Gateway is both the channel edge and the composition root for almost every always-on system. As a result, a transport restart, channel fault, or Gateway deployment has a wider blast radius than it needs. Detachment turns channel connectivity into a replaceable edge while preserving a stable, local OpenClaw runtime.

01

Core work survives channel outages

Crons, heartbeats, active turns, task queues, and local conversations continue while Discord, Telegram, or another transport reconnects.

02

Local-first is a real product mode

The TUI and Control UI can talk to a stable Host even when no channel is configured, authenticated, enabled, or healthy.

03

Smaller failure blast radius

A malformed native event, channel SDK failure, or transport memory leak cannot take down the scheduler and agent runtime with it.

04

Smaller config-change blast radius

A change to a port, TLS, plugin, or channel setting restarts only the runtime that owns it instead of interrupting unrelated core work.

05

Clearer security boundary

Transport credentials, channel SDKs, and untrusted native payloads can be isolated from agent state, schedules, and durable core data.

06

Explicit plugin capabilities

A detachable connector contract forces channel plugins to declare portable ingress and delivery behavior instead of relying on broad in-process helpers.

07

Better testing and ownership

Host-only proof becomes possible. Channel conformance can be tested separately. Core no longer needs Gateway RPC as an internal service locator.

08

Future deployment flexibility

Independent channel scaling or a remote edge becomes possible later, without making that operational complexity part of the initial product.

Why this is important: OpenClaw is an agent runtime with channel integrations, not a channel relay that happens to run agents. Its reliability boundary should reflect that product identity.
Target architecture

Separate ownership before separating processes

The architecture has four internal roles. The compatibility supervisor preserves today's public launcher. The Host owns core execution and state. The Control Server preserves the current protocol for UIs and clients. Channel Gateways own only transport concerns.

Figure: Current-to-target ownership shift (logical split first, process split later).
Today: one Gateway owns everything, so a transport fault is a core lifecycle fault. The Gateway composition root (HTTP, WebSocket, auth, methods, lifecycle, shutdown) owns Channels (connect + route), Agents (turns + tools), Control UI (Gateway RPC), Cron (scheduled work), Heartbeat (wake + deliver), and State (sessions + tasks). Result: restarting channel infrastructure interrupts core systems.
Target: openclaw gateway becomes a compatibility supervisor with explicit owners and the same public command: the Host (single owner of core execution and durable state: agents, sessions, cron, tasks, heartbeat, delivery intent), the Control Server (current protocol and auth for UI, TUI, CLI, and nodes), and one or more Channel Gateways (connect, parse, ack, send; a restartable transport edge). Result: the Host stays alive while connectors restart or remain absent.

Host

Owns agents, conversations, sessions, crons, heartbeats, tasks, routing policy, durable delivery intent, and canonical state. It must run correctly with zero Channel Gateways.

Owner | Owns | Must not own | Failure behavior
Host | Core lifecycle, agent turns, sessions, cron, heartbeat, tasks, policy, canonical state, durable delivery intent | Native channel SDKs, reconnect loops, channel credentials, transport-specific payloads | Continues with channels absent; graceful local control remains available
Control Server | Existing Gateway protocol, authentication, subscriptions, UI/TUI/CLI/node access | Core scheduling or channel transport lifecycle | Restartable without cancelling Host work
Channel Gateway | Connection, auth to channel, native parse/normalize, acknowledgements, send, typing, receipts, transport health | Product commands, provider policy, agent state, cron, heartbeat scheduling | Can restart independently; resumes from Host delivery intent
Compatibility supervisor | Current launcher, process supervision, lifecycle ordering, health aggregation, legacy plugin placement | Business logic or durable state | Preserves the current operator experience and hides internal topology

Concrete value proof

Config changes should restart owners, not OpenClaw

The current system already validates, diffs, and hot-applies most configuration changes. The architectural problem is the fallback boundary: when a change is startup-bound or lacks a safe scoped reload contract, the planner restarts the one Gateway process that also owns agents, cron, heartbeat, channels, plugins, and control connections.

Today: process-wide fallback

  • The config watcher validates the complete file, diffs changed paths, and builds a path-based reload plan.
  • Dynamic paths use the active in-memory snapshot; hot paths selectively rebuild cron, heartbeat, hooks, plugins, MCP runtimes, or channels.
  • Startup-bound, explicitly restart-required, and unknown paths set one coarse result: restartGateway: true.
  • Because the Gateway is the composition root, that fallback closes channels, agent harnesses, plugin services, cron, heartbeat, and control connections together.

Target: owner-scoped convergence

  • A Host-owned Config Coordinator validates and resolves the complete config once, then publishes a versioned desired snapshot.
  • The planner maps each changed path to its real owner: Host subsystem, Control Server, one or more connectors, or supervisor.
  • Only affected owners drain, hot-apply, rebuild a subsystem, or restart. Every unaffected owner continues on its active revision.
  • Health exposes desired and active revision per owner so failed or incomplete convergence is visible and retryable.
The product value: changing the Control Server port or TLS no longer interrupts a cron job or active agent turn. Rotating one Telegram token no longer requires restarting unrelated channels or rebuilding core runtime state. Decoupling turns configuration from a service-wide interruption risk into a bounded owner lifecycle operation.
Changed path | Current hybrid behavior | Proposed hybrid behavior | Unaffected work
messages.*, routing.* | Swap the Gateway process runtime snapshot | Swap the Host snapshot | Control Server and connectors continue
cron.* | Stop and rebuild Gateway-owned cron in-process | Restart the Host scheduler subsystem | Active turns, Control Server, and connectors continue
channels.telegram.* | Restart Telegram inside the monolithic Gateway | Restart only the affected Telegram connector or account | Host, Control Server, and other connectors continue
gateway.port, bind, TLS, HTTP | Restart the complete Gateway process | Restart only the Control Server | Agents, cron, heartbeat, tasks, and connectors continue
Host-owned plugin/runtime setting | Restart the complete Gateway process | Restart the Host or affected Host subsystem only | Control Server and detachable connectors continue or reconnect
Unknown or ambiguous path | Fail-safe full Gateway restart | Fail closed; use combined-service restart during migration, require owner metadata before detached default | No silent partial application

Current planner shape (coarse fallback)
type GatewayReloadPlan = {
  changedPaths: string[];
  restartGateway: boolean;
  restartCron: boolean;
  restartHeartbeat: boolean;
  restartChannels: Set<ChannelId>;
  reloadPlugins: boolean;
};

// Any restart-required path ultimately restarts
// the process that owns every runtime.

Target planner shape (owner scoped)
type ConfigApplyPlan = {
  desiredRevision: ConfigRevision;
  actions: Array<{
    owner: RuntimeOwner;
    paths: string[];
    mode: "dynamic" | "hot" |
      "restart-subsystem" | "restart-process";
  }>;
};

// Every owner reports its active revision.
// Unaffected owners do not restart.
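
A minimal sketch of how a planner could fill the target shape, assuming a prefix-based ownership table. The owner names, the `OWNER_RULES` entries, the path names used in the usage note, and `planConfigApply` are illustrative assumptions, not the shipped planner; the real ownership metadata is a Phase 0 deliverable.

```typescript
// Hypothetical owner names; the proposal only requires one owner per path.
type RuntimeOwner = "host" | "control-server" | "connector:telegram" | "supervisor";
type ApplyMode = "dynamic" | "hot" | "restart-subsystem" | "restart-process";
type ConfigRevision = number;

type ConfigApplyPlan = {
  desiredRevision: ConfigRevision;
  actions: Array<{ owner: RuntimeOwner; paths: string[]; mode: ApplyMode }>;
};

type OwnerRule = { prefix: string; owner: RuntimeOwner; mode: ApplyMode };

// Illustrative ownership metadata, loosely following the changed-path table.
const OWNER_RULES: OwnerRule[] = [
  { prefix: "messages.", owner: "host", mode: "dynamic" },
  { prefix: "routing.", owner: "host", mode: "dynamic" },
  { prefix: "cron.", owner: "host", mode: "restart-subsystem" },
  { prefix: "channels.telegram.", owner: "connector:telegram", mode: "restart-process" },
  { prefix: "gateway.port", owner: "control-server", mode: "restart-process" },
];

// Ordered from least to most disruptive.
const MODE_RANK: ApplyMode[] = ["dynamic", "hot", "restart-subsystem", "restart-process"];

function planConfigApply(changedPaths: string[], desiredRevision: ConfigRevision): ConfigApplyPlan {
  const actions: ConfigApplyPlan["actions"] = [];
  for (const path of changedPaths) {
    const rule = OWNER_RULES.find((r) => path.startsWith(r.prefix));
    // Assumed migration-time fallback: an unowned path fails closed into a
    // combined-service restart owned by the supervisor.
    const owner: RuntimeOwner = rule ? rule.owner : "supervisor";
    const mode: ApplyMode = rule ? rule.mode : "restart-process";
    const existing = actions.find((a) => a.owner === owner);
    if (!existing) {
      actions.push({ owner, paths: [path], mode });
    } else {
      existing.paths.push(path);
      // One action per owner: keep the most disruptive mode requested.
      if (MODE_RANK.indexOf(mode) > MODE_RANK.indexOf(existing.mode)) existing.mode = mode;
    }
  }
  return { desiredRevision, actions };
}
```

Under these assumed rules, `planConfigApply(["gateway.port"], 7)` yields a single `control-server` action and no Host action, which is the behavior the changed-path table asks for.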

Required config apply contract

  1. Validate the complete source config and resolve required secrets before applying any revision.
  2. Diff source-authored paths and map every changed path to an explicit runtime owner.
  3. Prepare affected owners and drain only work that those owner actions can interrupt.
  4. Persist one desired revision, apply owner actions, and expose desired versus active revision.
  5. Retry or roll back failed owners without silently activating a half-written config.
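
The contract above can be sketched as a small coordinator that tracks one desired revision and one active revision per owner. Everything here is an assumption for illustration: `ConfigCoordinator`, its method names, the in-memory maps standing in for persistence, and the choice to let an unaffected, already-converged owner adopt the new revision as a no-op. Validation is reduced to a boolean and rollback is not shown.

```typescript
type OwnerId = string;
type Revision = number;
type OwnerHealth = { owner: OwnerId; desired: Revision; active: Revision; error: string | null };
type ApplyOwner = (owner: OwnerId, revision: Revision) => void;

class ConfigCoordinator {
  private desired: Revision = 0;
  private readonly owners: OwnerId[];
  private readonly active = new Map<OwnerId, Revision>();
  private readonly errors = new Map<OwnerId, string>();

  constructor(owners: OwnerId[]) {
    this.owners = owners;
    for (const owner of owners) this.active.set(owner, 0);
  }

  // Returns false, activating nothing, when the complete config is invalid.
  apply(configIsValid: boolean, affected: OwnerId[], applyOwner: ApplyOwner): boolean {
    if (!configIsValid) return false;
    const previous = this.desired;
    this.desired = previous + 1; // one desired revision for all owners
    for (const owner of this.owners) {
      const converged = this.active.get(owner) === previous;
      if (converged && affected.indexOf(owner) === -1) {
        // Assumption: an unaffected, converged owner adopts the revision as a no-op.
        this.active.set(owner, this.desired);
      } else {
        this.converge(owner, applyOwner);
      }
    }
    return true;
  }

  // Re-applies the desired revision to every owner that has not reached it.
  retryFailed(applyOwner: ApplyOwner): void {
    for (const owner of this.owners) {
      if (this.active.get(owner) !== this.desired) this.converge(owner, applyOwner);
    }
  }

  // Desired versus active revision per owner, plus the last apply error.
  health(): OwnerHealth[] {
    return this.owners.map((owner) => ({
      owner,
      desired: this.desired,
      active: this.active.get(owner) ?? 0,
      error: this.errors.get(owner) ?? null,
    }));
  }

  private converge(owner: OwnerId, applyOwner: ApplyOwner): void {
    try {
      applyOwner(owner, this.desired);
      this.active.set(owner, this.desired);
      this.errors.delete(owner);
    } catch (err) {
      // The owner stays on its old active revision; the failure is visible in health().
      this.errors.set(owner, err instanceof Error ? err.message : String(err));
    }
  }
}
```

A failed owner keeps its previous active revision and shows up in `health()` with a mismatch and an error, so incomplete convergence is observable and can be retried without touching owners that already converged.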

Compatibility rules

  • gateway.reload.mode="hybrid" gains owner-scoped actions by default.
  • "hot", "off", and explicit "restart" retain their shipped operator semantics; explicit restart remains a full compatibility-supervised service restart.
  • Existing v1 plugin restartPrefixes retain their current safe co-located restart meaning.
  • Connector v2 and modern Host plugins can declare owner-scoped reload and restart capabilities additively.
  • Cross-process application is revisioned convergence, not an unprovable promise of instantaneous atomic cutover.
Compatibility strategy

Backward compatibility is a release gate, not a cleanup task

Users should not need to understand the new topology. Existing configuration, commands, URLs, ports, auth, state, and plugin packages continue to work. The compatibility layer belongs at the public edge and plugin boundary, never as duplicate core execution paths.

Operator surface unchanged

  • openclaw gateway remains the normal command.
  • The same port, auth, health surface, Control UI URL, and protocol remain available.
  • Users do not manage internal IPC, process roles, or new mandatory services.

Configuration zero-touch

  • Existing openclaw.json stays valid.
  • No required config key additions, renames, or environment variables.
  • A normal upgrade must work without openclaw doctor --fix.

State and behavior preserved

  • Sessions, schedules, delivery behavior, and routing semantics remain stable.
  • One canonical Host path prevents double execution and duplicate delivery.
  • Existing local and channel workflows keep the same user-visible behavior.
Phased roadmap

Ship user value before changing topology

A one-shot rewrite would combine lifecycle, protocol, plugin SDK, state ownership, channel migration, and upgrade risk in one release. The safer route is a sequence of vertical slices where each phase leaves one canonical path and produces independently measurable product value.

Logical split first: Establish ownership and typed boundaries in one process before introducing IPC or supervision.
Canonical path per slice: Move each subsystem fully to the Host. Avoid long-lived dual execution, dual writes, or runtime fallback stacks.
Compatibility at public edges: Keep current commands and plugin behavior through named adapters and facade surfaces, not by retaining two cores.
Proof-gated rollout: Advance only after host-only, restart, upgrade, and legacy-plugin scenarios prove the new boundary.

0
Contract and guardrails

Freeze product invariants and define owners

Document the target ownership model, public compatibility contract, lifecycle ordering, and failure semantics before moving runtime code.

  • Add an ADR and owner matrix for Host, Control Server, Channel Gateway, and supervisor.
  • Add architecture checks that prevent new core dependencies on Gateway request context and channel-native runtime helpers.
  • Define stable internal service interfaces, lifecycle states, health snapshots, stop reasons, and config-path ownership metadata.
  • Specify owner-scoped config apply plans, revision convergence, and the compatibility meaning of every existing reload mode.
  • Add baseline zero-touch upgrade, legacy plugin, and host-with-channels-disabled scenarios.
Exit gate

Every current Gateway-owned subsystem and restart-required config path has one named future owner. Existing public surfaces, reload modes, and plugin lanes have explicit compatibility treatment.

approve now
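
One way the Phase 0 architecture check could work is a plain import-boundary scan, sketched below. This is an assumption-heavy illustration: the rule name, the set of "core" directories, and `findBoundaryViolations` are hypothetical, and a regex scan over source text is a stand-in for whatever lint or architecture-test tooling the repo actually uses (it does not see dynamic imports or `require`).

```typescript
type SourceFile = { path: string; text: string };
type BoundaryViolation = { file: string; specifier: string; rule: string };
type BoundaryRule = { name: string; appliesTo: RegExp; forbiddenImport: RegExp };

// Hypothetical rule: core directories must not import the Gateway request context.
const BOUNDARY_RULES: BoundaryRule[] = [
  {
    name: "core-must-not-import-gateway-request-context",
    appliesTo: /^src\/(host|cron|agents)\//,
    forbiddenImport: /gateway\/server-request-context/,
  },
];

function findBoundaryViolations(files: SourceFile[], allowedFiles: string[] = []): BoundaryViolation[] {
  const violations: BoundaryViolation[] = [];
  for (const file of files) {
    // Intentional exceptions are enumerated explicitly rather than pattern-matched.
    if (allowedFiles.indexOf(file.path) !== -1) continue;
    // Captures the module specifier in `import ... from "x"` and `export ... from "x"`.
    const importRe = /\bfrom\s+["']([^"']+)["']/g;
    let match: RegExpExecArray | null;
    while ((match = importRe.exec(file.text)) !== null) {
      const specifier = match[1];
      for (const rule of BOUNDARY_RULES) {
        if (rule.appliesTo.test(file.path) && rule.forbiddenImport.test(specifier)) {
          violations.push({ file: file.path, specifier, rule: rule.name });
        }
      }
    }
  }
  return violations;
}
```

The explicit `allowedFiles` list mirrors the P1 requirement that intentional exceptions are enumerated: existing violations can be listed once, and any new one fails the check.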
1
Host kernel

Extract core lifecycle in-process

Create the Host as the only owner of core service startup and shutdown. Move the most separable services first: cron, then heartbeat after its delivery dependency is made explicit.

  • Introduce OpenClawHost as a narrow composition root, initially created by the current Gateway launcher.
  • Move config validation, diffing, and owner-scoped planning into a Host-owned Config Coordinator while execution remains combined in-process.
  • Move cron lifecycle and state access into the Host; prove it runs with channels disabled.
  • Replace heartbeat access to active channel plugins with a delivery port owned by the Host.
  • Remove Gateway shutdown ownership for migrated services in the same PR slices.
Exit gate

Cron and heartbeat continue through a simulated channel-manager restart. Config planning names affected owners without changing shipped reload behavior. No migrated service has two lifecycle owners.

approve now
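
The Host-owned delivery port from this phase can be sketched as a queue of delivery intents that drains when a connector attaches. The names `HostDeliveryPort`, `DeliveryIntent`, and `ChannelConnector` are hypothetical, the store is in memory rather than durable, and sends are synchronous to keep the example short.

```typescript
type DeliveryIntent = { dedupeKey: string; channel: string; payload: string };
type ChannelConnector = { send(intent: DeliveryIntent): void };
type SubmitResult = "sent" | "queued" | "duplicate";

class HostDeliveryPort {
  private pending: DeliveryIntent[] = [];
  private readonly seenKeys = new Set<string>();
  private readonly connectors = new Map<string, ChannelConnector>();

  // Called by Host services such as heartbeat; never touches a channel SDK directly.
  submit(intent: DeliveryIntent): SubmitResult {
    if (this.seenKeys.has(intent.dedupeKey)) return "duplicate";
    this.seenKeys.add(intent.dedupeKey);
    this.pending.push(intent);
    this.flush(intent.channel);
    return this.pending.indexOf(intent) === -1 ? "sent" : "queued";
  }

  // A connector that starts or restarts attaches and receives the queued intents.
  attach(channel: string, connector: ChannelConnector): number {
    this.connectors.set(channel, connector);
    return this.flush(channel);
  }

  detach(channel: string): void {
    this.connectors.delete(channel);
  }

  pendingCount(): number {
    return this.pending.length;
  }

  // Sends queued intents for one channel in order; stops at the first failure.
  private flush(channel: string): number {
    const connector = this.connectors.get(channel);
    if (!connector) return 0;
    let sent = 0;
    let blocked = false;
    const remaining: DeliveryIntent[] = [];
    for (const intent of this.pending) {
      if (intent.channel !== channel || blocked) {
        remaining.push(intent);
        continue;
      }
      try {
        connector.send(intent);
        sent += 1;
      } catch {
        blocked = true; // keep this intent and everything behind it queued
        remaining.push(intent);
      }
    }
    this.pending = remaining;
    return sent;
  }
}
```

With no connector attached, a heartbeat can still be scheduled and its delivery is recorded as queued intent instead of being lost, which is the behavior the exit gate asks for.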
2
Core service APIs

Make local conversations first-class Host clients

Replace internal Gateway RPC and the embedded Gateway stub with typed Host services for conversations, sessions, schedules, tasks, and delivery intent.

  • Define local service calls and event subscriptions independent of WebSocket transport.
  • Move the embedded TUI backend onto Host services and remove duplicated turn orchestration.
  • Migrate agent tools that currently call Gateway methods to injected Host capabilities.
  • Keep current CLI/TUI behavior and command names while eliminating Gateway as an internal API locator.
Exit gate

TUI conversations, session history, cron control, and agent tools work with the Host running and no Control Server or channels.

approve now
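
A sketch of what a transport-independent Host service might look like for conversations, under the assumption that local clients call it directly and subscribe to typed events. `ConversationService`, the event shapes, and the injected `respond` function (standing in for the agent runtime) are hypothetical; real turns would be asynchronous.

```typescript
type HostEvent =
  | { kind: "turn.started"; sessionId: string; turnId: number }
  | { kind: "turn.completed"; sessionId: string; turnId: number; reply: string };
type HostEventListener = (event: HostEvent) => void;

// The contract a TUI, CLI, or Control Server would depend on; no WebSocket types appear.
interface ConversationService {
  sendTurn(sessionId: string, text: string): number;
  sessionHistory(sessionId: string): string[];
  subscribe(listener: HostEventListener): () => void;
}

class InProcessConversationService implements ConversationService {
  private listeners: HostEventListener[] = [];
  private readonly sessions = new Map<string, string[]>();
  private nextTurnId = 1;
  private readonly respond: (text: string) => string;

  constructor(respond: (text: string) => string) {
    this.respond = respond;
  }

  sendTurn(sessionId: string, text: string): number {
    const turnId = this.nextTurnId++;
    const log = this.sessions.get(sessionId) ?? [];
    this.sessions.set(sessionId, log);
    this.emit({ kind: "turn.started", sessionId, turnId });
    const reply = this.respond(text);
    log.push(text, reply);
    this.emit({ kind: "turn.completed", sessionId, turnId, reply });
    return turnId;
  }

  sessionHistory(sessionId: string): string[] {
    return [...(this.sessions.get(sessionId) ?? [])];
  }

  // Returns an unsubscribe function so clients can detach without affecting the Host.
  subscribe(listener: HostEventListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  private emit(event: HostEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}
```

Because events are emitted in a fixed order per turn, a contract test can assert deterministic ordering without starting a Control Server or any channel.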
3
Control Server facade

Preserve the Gateway protocol while moving execution behind it

Split the mixed Gateway request context into transport/auth concerns and Host service calls. Keep existing methods, subscriptions, authentication, and URLs stable.

  • Extract a Control Server that serves the current Gateway protocol and delegates to Host services.
  • Preserve gateway.*, current WebSocket events, Control UI behavior, CLI access, nodes, and hooks.
  • Move Control UI dependency from "Gateway owns everything" to "Control Server exposes Host and connector status."
  • Assign port, bind, auth, TLS, HTTP, and Control UI config paths to the Control Server lifecycle.
  • Prove Control Server restart does not interrupt active Host work.
Exit gate

The current UI, TUI remote mode, CLI, and nodes pass unchanged protocol tests. A Control Server-owned config change causes no active-turn or scheduler interruption.

approve now
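
The facade idea in this phase can be sketched as a method table that checks transport-level auth and then delegates to Host services. The method names `cron.list` and `cron.add`, the `HostCronService` shape, and `createControlServer` are hypothetical placeholders, not the current Gateway protocol.

```typescript
type AuthContext = { authenticated: boolean };
type ControlRequest = { method: string; params: Record<string, unknown> };
type ControlResponse = { ok: true; result: unknown } | { ok: false; error: string };
type HostCronService = { listJobs(): string[]; addJob(name: string): void };
type MethodHandler = (params: Record<string, unknown>) => unknown;

function createControlServer(host: HostCronService) {
  // The facade holds no core state; every handler delegates to the Host.
  const handlers = new Map<string, MethodHandler>([
    ["cron.list", () => host.listJobs()],
    [
      "cron.add",
      (params) => {
        const jobName = params["name"];
        if (typeof jobName !== "string") throw new Error("name must be a string");
        host.addJob(jobName);
        return null;
      },
    ],
  ]);

  return {
    handle(auth: AuthContext, request: ControlRequest): ControlResponse {
      // Transport and auth concerns stay here, outside Host services.
      if (!auth.authenticated) return { ok: false, error: "unauthorized" };
      const handler = handlers.get(request.method);
      if (!handler) return { ok: false, error: `unknown method: ${request.method}` };
      try {
        return { ok: true, result: handler(request.params) };
      } catch (err) {
        return { ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    },
  };
}
```

Since the facade owns no state, discarding one instance and creating another over the same Host models a Control Server restart: jobs added through the first instance remain visible through the second.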
4
Connector v2

Introduce detachable channel contracts and migrate vertically

Add a serializable connector contract while preserving all v1 channel plugins through automatic co-located execution. Migrate bundled channels one at a time.

  • Define ingress, delivery, receipt, health, capability, and lifecycle envelopes.
  • Make the Host the durable owner of delivery intent, retry policy, dedupe keys, and routing decisions.
  • Start with a low-risk bundled channel, then migrate higher-volume channels using the same conformance suite.
  • Add diagnostics that identify v1 co-located plugins and v2 detachable connectors without requiring action.
Go/no-go gate

A migrated connector can restart during live traffic with zero lost or duplicate deliveries, while an unchanged external v1 fixture still works.

proof gated
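
A hedged sketch of what "serializable" could mean for a connector v2 ingress envelope: plain data that survives a JSON round trip, a protocol version field, and a simple version negotiation. The field names, the version number 2, and the helper names are assumptions for illustration; the actual schema is defined in P10.

```typescript
// Hypothetical envelope: plain data only, no functions or callbacks.
type IngressEnvelope = {
  protocol: 2;
  connectorId: string;
  dedupeKey: string;
  receivedAt: string; // ISO 8601 timestamp
  text: string;
};

function encodeEnvelope(envelope: IngressEnvelope): string {
  return JSON.stringify(envelope);
}

function readString(record: Record<string, unknown>, field: string): string {
  const raw = record[field];
  if (typeof raw !== "string") throw new Error(`missing string field: ${field}`);
  return raw;
}

// Validates untrusted wire data before it reaches Host services.
function decodeEnvelope(wire: string): IngressEnvelope {
  const value: unknown = JSON.parse(wire);
  if (typeof value !== "object" || value === null) throw new Error("envelope must be an object");
  const record = value as Record<string, unknown>;
  if (record["protocol"] !== 2) throw new Error("unsupported protocol version");
  return {
    protocol: 2,
    connectorId: readString(record, "connectorId"),
    dedupeKey: readString(record, "dedupeKey"),
    receivedAt: readString(record, "receivedAt"),
    text: readString(record, "text"),
  };
}

// Picks the highest protocol version both sides support, or null if none.
function negotiateVersion(hostVersions: number[], connectorVersions: number[]): number | null {
  const shared = hostVersions.filter((v) => connectorVersions.indexOf(v) !== -1);
  return shared.length > 0 ? Math.max(...shared) : null;
}
```

Decoding rebuilds the envelope field by field instead of casting the parsed value, so unexpected extra fields from a connector never reach the Host.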
5
Supervised process split

Enable physical fault isolation behind the same command

After logical ownership and connector contracts are stable, let the compatibility supervisor run the Host, Control Server, and detachable connectors as separate local processes.

  • Add private local IPC, leases, health aggregation, restart budgets, and clean shutdown ordering.
  • Execute owner-scoped config plans across processes and expose desired versus active config revision per owner.
  • Keep SQLite ownership single-writer and keep all internal endpoints private and auto-managed.
  • Roll out by release cohort. Fall back to the combined topology only through an explicit release rollback, never as steady-state runtime behavior.
  • Defer remote connectors and independent scaling until local product value is measured.
Go/no-go gate

Fault injection and restart-required config edits prove connector and Control Server restarts do not interrupt Host work. Startup cost and resource use remain inside agreed budgets.

proof gated
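
The restart budgets mentioned for the supervisor could take the form of a sliding-window limit, sketched below. `RestartBudget` and its parameters are hypothetical, and the caller passes in the clock value so the behavior is deterministic in tests.

```typescript
// Allows at most `maxRestarts` restarts of one owner within any `windowMs` span.
class RestartBudget {
  private timestamps: number[] = [];
  private readonly maxRestarts: number;
  private readonly windowMs: number;

  constructor(maxRestarts: number, windowMs: number) {
    this.maxRestarts = maxRestarts;
    this.windowMs = windowMs;
  }

  // Returns true and records the restart if the budget allows it.
  // Returns false when the owner is crash-looping and should be held down.
  tryRestart(nowMs: number): boolean {
    // Forget restarts that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => nowMs - t < this.windowMs);
    if (this.timestamps.length >= this.maxRestarts) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}
```

A supervisor would keep one budget per owner, so a crash-looping connector is held down and surfaced in health aggregation without affecting the Host or other connectors.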
Execution plan

PR-sized work packages

The sequence below keeps each change reviewable and reversible at the release level. Bundled callers migrate in the same change that introduces a modern API; compatibility remains narrowly scoped to shipped public contracts and external plugins.

ID | Work package | Primary surfaces | Required proof
P0 | Architecture ADR, public invariants, owner matrix, lifecycle state model | docs/, architecture checks | Owner review; no runtime change
P1 | Dependency guardrails: block new core use of Gateway request context and channel-native runtime | src/gateway/, src/agents/, lint/architecture tests | Current main remains green; intentional exceptions enumerated
P2 | Create in-process Host lifecycle shell and health snapshot | src/host/, current Gateway launcher | Start/stop ordering, repeated start rejection, clean shutdown
P2A | Extract Config Coordinator and owner-scoped apply-plan metadata while retaining current combined execution | src/gateway/config-reload*, runtime snapshot, Host lifecycle | Current reload behavior unchanged; every restart-required path resolves to an owner or explicit compatibility fallback
P3 | Move cron lifecycle and canonical state access to Host | src/cron/, src/gateway/server-cron.ts | Host-only cron execution; current Gateway cron RPC parity
P4 | Add Host delivery-intent port and move heartbeat lifecycle | src/infra/heartbeat-*, delivery boundary | Heartbeat schedules without channels; delivery queues until connector returns
P5 | Introduce conversation, session, schedule, and task Host services | src/agents/, session/state owners | Typed service contract tests and deterministic event ordering
P6 | Move embedded TUI to Host services | src/tui/embedded-backend.ts, src/tui/tui-backend.ts | Local TUI turn and history with Gateway disabled
P7 | Migrate agent tools away from internal Gateway RPC; delete embedded Gateway stub | src/agents/openclaw-tools.ts, src/agents/tools/ | Agent tool parity; no internal Gateway call path remains
P8 | Split request context and extract Control Server facade with owned config lifecycle | src/gateway/server-request-context.ts, methods, protocol, server config | Existing protocol/auth behavior unchanged; Control Server config restart does not interrupt Host work
P9 | Route Control UI, CLI, remote TUI, nodes, and hooks through Control Server | ui/, gateway clients, node APIs | Current user workflows pass without config edits
P10 | Define connector v2 protocol and conformance kit | src/plugin-sdk/, channel contracts, docs | Serializable schema round trips; version/capability negotiation
P11 | Implement legacy v1 co-location classifier and diagnostics | plugin loader, compat registry, doctor diagnostics | Unchanged external fixture loads; no user action required
P12 | Migrate first bundled connector as a complete vertical slice | one bundled channel, Host ingress/delivery services | Crash/restart, dedupe, order, receipts, multi-account proof
P13 | Migrate remaining bundled connectors incrementally | bundled channel plugins | Per-channel conformance plus real channel proof where feasible
P14 | Add supervisor, private IPC, leases, restart budgets, config revision convergence, and health aggregation | launcher, Host/Control/connector process entrypoints | Fault injection; SQLite single-writer; bounded restart loops; desired/active revision visibility
P15 | Add zero-touch published-upgrade lane and cohort rollout controls | upgrade tests, release checks, telemetry/diagnostics | Last stable to new release with no doctor and no config edit

Acceptance and rollout

Prove independence as user-visible behavior

Green unit tests are not enough for this change. The acceptance suite must deliberately remove, restart, and corrupt channel-side infrastructure while proving that core work continues and existing installations upgrade without intervention.

Scenario | Expected result | Phase gate
Host only, no channels configured | Local conversation, sessions, cron, heartbeat scheduling, and tasks work | P1-P2
All channels disabled | Host remains healthy; delivery intent is explicit rather than silently lost | P1-P2
Connector crashes during active turn | Turn completes; delivery resumes idempotently after connector restart | P4-P5
Control Server restarts | Active Host work and channel connections continue; clients reconnect | P3
Control Server-owned config change | Changing port, bind, TLS, HTTP, or Control UI settings restarts only Control Server; Host work continues | P3-P5
Connector-owned config change | Changing one channel account or credential restarts only the affected connector/account | P4-P5
Mixed-owner config change | Complete config validates once; every owner reports desired/active revision; failure is visible with no silent partial activation | P5
Existing explicit restart mode | gateway.reload.mode="restart" retains full compatibility-supervised service restart semantics | release
Legacy external v1 channel plugin | Loads unchanged in automatic co-located mode with clear diagnostics | P4-P5
Connector v2 plugin | Runs detached using only declared serializable contracts | P4-P5
Last stable published upgrade | Starts successfully with existing config/state/plugins, no doctor, no edits | release
Multi-account concurrent traffic | Routing, ordering, dedupe, and delivery receipts remain correct | P4-P5
Process fault injection | Leases prevent dual writers; restart budgets prevent crash loops | P5
Config reload and shutdown | Owner-scoped lifecycle ordering is deterministic; no orphaned work, duplicate execution, or hidden revision drift | P5

Rollout metrics

  • Active core tasks interrupted by connector restart: target 0.
  • Duplicate deliveries caused by restart/retry: target 0.
  • Unrelated core work interrupted by Control Server or connector config change: target 0.
  • Normal upgrades requiring config edits or doctor: target 0.
  • Legacy external plugin load success: target no regression.
  • Connector recovery time and queued delivery age: measured per channel.
  • Startup time, RSS, CPU, and file descriptor cost: bounded before default process split.

Release strategy

  • Phases 0-3 ship as internal ownership changes behind the current launcher.
  • Connector v2 ships additively with one bundled vertical slice and legacy compatibility.
  • Physical split begins opt-in/internal, then small release cohorts, then default only after fault proof.
  • Rollback changes topology at release startup; it does not add a permanent dual-runtime fallback.
  • Remote connectors remain out of scope until local resilience value is demonstrated.
Risk register

The hard parts are ownership and compatibility, not IPC

Physical separation is straightforward only after delivery, lifecycle, state, and plugin contracts are unambiguous. The roadmap treats every ambiguity as a rollout blocker because those are the places that create duplicate work, lost messages, or surprise plugin failures.

Risk | Severity | Mitigation | Verification
Dual execution or duplicate delivery | critical | Host is the single owner of intent, dedupe keys, retry policy, and completion state; leases prevent dual active owners. | Kill/restart fault injection under concurrent traffic; assert exactly-once product outcomes where supported.
External plugin incompatibility | critical | Automatic legacy co-location, additive connector v2, named compatibility record, fixtures from real external plugin shapes. | Published-plugin compatibility matrix and unchanged v1 fixture in every release lane.
Lifecycle globals and hidden coupling | high | Move services behind Host lifecycle; delete module-global ownership as each subsystem migrates; make state transitions closed and observable. | Repeated start/stop, partial startup failure, config reload, and shutdown ordering tests.
SQLite multi-writer corruption or lock contention | critical | Host remains the single writer for core state; connectors exchange envelopes and never open Host-owned stores. | Process concurrency test, lease loss test, and database integrity checks.
Protocol complexity and version drift | high | Version private connector protocol additively, keep schemas small, carry prepared facts, and use conformance tests. | Round-trip schema tests, old/new connector negotiation fixtures, deterministic payload snapshots.
Config revision split-brain | critical | Validate once, persist one desired revision, require explicit path ownership, expose active revision per owner, and fail closed on ambiguous or failed convergence. | Mixed-owner config fault injection, owner restart during apply, rollback/retry tests, and health assertions for desired versus active revision.
Authentication or trust boundary regression | high | Control Server preserves existing auth; internal IPC is private, authenticated, and never user-configurable in initial rollout. | Auth parity tests, local privilege boundary review, hostile envelope validation.
Operator complexity | high | One command, one health story, one log correlation ID, and supervisor-owned diagnostics; topology stays hidden by default. | Fresh install and upgrade smoke from an operator perspective.
Resource and startup regression | medium | Delay process split until measured; set explicit budgets; group low-volume connectors if isolation value does not justify cost. | Before/after startup, RSS, CPU, and file descriptor benchmarks.

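
The lease mitigation in the register can be illustrated with a single-writer lease that issues a monotonically increasing token, so a stale holder can be refused after its lease expires. `SingleWriterLease` and its methods are hypothetical, the state is in memory, and time is passed in explicitly; a real implementation would need a shared, durable lease record.

```typescript
type WriterLease = { holder: string; token: number; expiresAt: number };

class SingleWriterLease {
  private current: WriterLease | null = null;
  private nextToken = 1;
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Grants the lease only when nobody holds an unexpired one.
  acquire(holder: string, nowMs: number): WriterLease | null {
    if (this.current !== null && nowMs < this.current.expiresAt) return null;
    const lease: WriterLease = { holder, token: this.nextToken++, expiresAt: nowMs + this.ttlMs };
    this.current = lease;
    return { ...lease };
  }

  // Extends the current lease; fails for stale or expired leases.
  renew(lease: WriterLease, nowMs: number): boolean {
    if (!this.canWrite(lease, nowMs)) return false;
    this.current = { holder: lease.holder, token: lease.token, expiresAt: nowMs + this.ttlMs };
    return true;
  }

  // A write is allowed only with the current token and before expiry.
  canWrite(lease: WriterLease, nowMs: number): boolean {
    return this.current !== null && this.current.token === lease.token && nowMs < this.current.expiresAt;
  }
}
```

Checking the token on every write, rather than only at acquisition, is what rejects a previous holder that resumes after a pause and still believes it owns the store.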
Decisions and boundaries

What to decide now, and what to defer

The proposal intentionally leaves open any implementation choice that does not affect the first product outcome. Approve the ownership model and compatibility contract now; choose private transport and advanced deployment options only when the relevant phase is ready.

Decisions to approve now

  • The Host owns all core execution, durable state, and delivery intent.
  • A Host-owned Config Coordinator validates the complete config and produces owner-scoped apply plans with visible revision convergence.
  • openclaw gateway remains the compatibility supervisor and public operator surface.
  • Control Server preserves the current Gateway protocol and auth model.
  • Legacy v1 external channel plugins run automatically in co-located compatibility mode.
  • Connector v2 is additive, serializable, capability-declared, and detachable.
  • Existing reload-mode and v1 plugin restart semantics remain compatible while default hybrid reload gains smaller internal restart boundaries.
  • Normal upgrades require no config edits and no doctor migration.

Explicit non-goals

  • No public rename of Gateway, commands, config, protocol, or UI in this roadmap.
  • No user-managed IPC, extra ports, or manual process orchestration.
  • No forced external plugin migration or immediate v1 deprecation.
  • No remote Channel Gateway, multi-host deployment, or independent scaling in initial phases.
  • No all-channels-at-once migration.
  • No steady-state dual core, dual write, or silent runtime fallback path.
Should legacy co-located mode ever be removed?

Do not decide removal in this roadmap. Treat the shipped plugin SDK as a public contract. First ship connector v2, publish migration guidance, measure adoption, and collect concrete maintenance or security reasons. Any removal would require a separate owner-approved deprecation window and major-release decision.

Which bundled channel should migrate first?

Choose a channel with a complete test harness, low operational blast radius, and representative ingress/delivery behavior. The first migration should validate the contract, not prove that the hardest channel can be rewritten. Follow with a high-volume channel only after conformance and fault proof are stable.

What private IPC should the process split use?

Defer the choice until phase 5. The logical API must not be designed around a transport. Evaluate local Unix domain sockets or an equivalent cross-platform private transport against authentication, lifecycle, backpressure, observability, and Windows support. It must remain an internal implementation detail.

What do plugin gateway_start and gateway_stop hooks mean after the split?

Preserve their shipped behavior as lifecycle hooks for the overall OpenClaw service under the compatibility supervisor. Add narrower modern lifecycle contracts for Host services and detachable connectors rather than silently changing the meaning of existing hooks.

When should remote connectors become a product feature?

Only after local detachment demonstrates measurable reliability value and there is concrete operator demand. Remote connectivity introduces a public trust, networking, deployment, and support contract that should not be bundled into the core resilience refactor.

Current-main evidence

Why this plan fits the existing codebase

The recommendation is based on current upstream main at 8b0eac7927d5e7695d058d3503edcd3f8e278b67. The current code already contains separable service contracts and a local TUI path, but the Gateway still composes and shuts down the core runtime, and current channel plugin types are too broad and function-valued to cross a process boundary.

Finding | Current source evidence | Roadmap implication
Gateway is the monolithic composition root | src/gateway/server.impl.ts:548, src/gateway/server.impl.ts:953, src/gateway/server.impl.ts:1033, src/gateway/server.impl.ts:1567 | Create Host lifecycle and move service ownership before process work.
Current config reload already plans by path but has a process-wide restart fallback | src/gateway/config-reload.ts:117, src/gateway/config-reload-plan.ts:56, src/gateway/config-reload-plan.ts:324, src/gateway/server-reload-handlers.ts:543 | Preserve validation and path planning; replace the coarse restart result with owner-scoped actions.
A full Gateway close stops core and edge systems together | src/gateway/server-close.ts:797, src/gateway/server-close.ts:802, src/gateway/server-close.ts:832 | Config restart blast radius is direct evidence for separating runtime owners.
Cron already has a separable service contract | src/cron/service-contract.ts:22, src/cron/service.ts:14, src/gateway/server-cron.ts:130 | Use cron as the first Host-owned vertical slice.
Heartbeat has a lifecycle but reaches channel plugins | src/infra/heartbeat-runner.ts:150, src/infra/heartbeat-runner.ts:243, src/infra/heartbeat-runner.ts:2127 | Add a Host-owned delivery port before moving lifecycle.
Local TUI proves conversations can work without Gateway | src/tui/tui-backend.ts:130, src/tui/embedded-backend.ts:358, src/tui/embedded-backend.ts:1041 | Unify local conversation flow on Host services and delete duplicated orchestration.
Agent internals use Gateway as a capability API | src/agents/openclaw-tools.ts:385, src/agents/tools/embedded-gateway-stub.ts:211 | Replace internal Gateway RPC with injected Host capabilities.
Control UI is fully Gateway-bound | ui/src/ui/gateway.ts:473, ui/src/ui/app-gateway.ts:767, ui/src/ui/controllers/cron.ts:194 | Preserve the current protocol through a Control Server facade.
Gateway request context mixes domains | src/gateway/server-methods.ts:260, src/gateway/server-request-context.ts:15, src/gateway/server-methods/chat.ts | Split transport/auth context from Host service dependencies.
Current channel plugin runtime is broad and in-process | src/gateway/server-channels.ts:172, src/channels/plugins/types.adapters.ts:244, src/plugins/types.ts:2610 | Keep v1 co-located; add a narrow serializable connector v2.
Current turn types contain functions and callbacks | src/channels/turn/types.ts:258, src/channels/turn/types.ts:303, src/channels/turn/types.ts:371 | Do not attempt to send current assembled turns over IPC; define portable envelopes.
Repo already has a named plugin compatibility policy | docs/plugins/sdk-migration.md:204, docs/plugins/compatibility.md:10, src/plugins/compat/registry.ts | Use an explicit compatibility record and additive API before any deprecation.
Existing upgrade proof invokes doctor | docs/reference/test.md:58 | Add a new last-stable zero-touch upgrade lane that does not run doctor.
Product docs define Gateway as always-on control plane | docs/gateway/index.md:73, docs/web/tui.md:37, docs/web/control-ui.md:10 | Keep the public name while changing internal ownership; update docs only as behavior ships.

Visual provenance: This proposal uses the restrained dark operational style and color tokens from OpenClaw Control UI, especially ui/src/styles/base.css, ui/src/styles/components.css, ui/src/styles/layout.css, and ui/src/styles/activity.css. It is a standalone artifact with all CSS and interaction logic inline.