Branch deploys (proposal)
Status: proposal. Not yet architecture. Not yet implemented. Code should not be written against this document. The Shared agent resources document is the currently-binding answer for “agents need a database”; branch deploys is a different, larger idea that we may or may not choose to build.
This document captures the design work done during early scoping so the thinking is not lost.
The question
Given that agents produce application code, and given that the cluster x1agent runs in is a Kubernetes cluster, should the platform support deploying the apps agents build back into that same cluster?
Put differently: can a coding agent close the loop — write the app, ship the app, observe the app in production, fix the app — all within one cluster, without routing through an external CI/CD system?
What makes this a proposal and not architecture
Three reasons this is parked rather than adopted:
- Scope. Shipping this requires a build pipeline (already planned for the image catalog), an Ingress + DNS + TLS story (new), a workload reconciliation loop (new), a `deploy_branch` permission model (extends the existing permission-grants system), and UI surfaces. The surface is large enough that it deserves its own sequencing decision rather than being slipped into the shared-agent-resources scope.
- Use-case demand. The platform’s two initial verticals (coding automation, agent automation for external systems) do not require agent-shipped workloads to prove value. Shared agent resources are table-stakes for coding work; branch deploys are an amplifier that becomes valuable once coding automation is working well.
- Operator burden. A production deployment enabling branch deploys needs a wildcard DNS record, a wildcard TLS certificate, and an Ingress controller configured for per-branch hostnames. Making those mandatory would raise the floor for the default out-of-the-box install in a way that conflicts with the CLAUDE.md principle that the default path must work with zero operator setup.
If and when this proposal is promoted, it will either ship behind an explicit opt-in (`branchDeploys.enabled: true` in Helm, off by default) or be split into its own optional chart.
The idea
Every branch in an agent’s repo can become a live running Deployment in the workspace namespace, with its own URL, its own database, and its own lifecycle tied to the upstream branch.
```
workspace namespace ws-<id>
| postgres StatefulSet         <- shared agent resource
| redis StatefulSet            <- shared agent resource
| app "api" (branch main)      <- user-authored workload; always up
| app "api" (branch feat-x)    <- ephemeral preview; reaped on branch delete
| app "api" (branch feat-y)    <- ephemeral preview; reaped on branch delete
```

Each branch deploy gets:

- Its own `Deployment` + `Service`, labeled `x1.app=api, x1.branch=feat-x`.
- In-cluster DNS: `api-feat-x.ws-<id>.svc.cluster.local`.
- An external URL via a wildcard Ingress pattern: `feat-x.api.<workspace-slug>.<cluster-domain>` (requires the operator to configure wildcard DNS + TLS).
- The branch’s Postgres database from shared agent resources (the same `DATABASE_URL` the agent saw during development; no separate prod DB).
- The same workspace secrets as the development session.
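As a concrete sketch of how the naming scheme above composes, the following TypeScript helper derives the labels, in-cluster DNS name, and external URL for one branch deploy. The function and its parameter names (`wsId`, `slug`, `clusterDomain`) are illustrative assumptions, not part of any existing x1agent API.

```typescript
// Illustrative only: derives the addresses described above for one branch deploy.
function branchDeployAddresses(opts: {
  app: string;           // workspace-unique app name, e.g. "api"
  branch: string;        // upstream branch, e.g. "feat-x"
  wsId: string;          // workspace id, forms the ws-<id> namespace
  slug: string;          // workspace slug used in the external hostname
  clusterDomain: string; // operator-configured wildcard domain
}) {
  const service = `${opts.app}-${opts.branch}`;
  return {
    labels: { "x1.app": opts.app, "x1.branch": opts.branch },
    internal: `${service}.ws-${opts.wsId}.svc.cluster.local`,
    external: `https://${opts.branch}.${opts.app}.${opts.slug}.${opts.clusterDomain}`,
  };
}
```

Note that branch names containing `/` are not valid DNS labels; any real implementation would need a sanitization step this sketch glosses over.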
The agent-facing surface
A new MCP tool, permission-gated behind a deploy grant:

```
deploy_branch(
  app: string,           // workspace-unique app name
  branch: string,        // from the agent's working repo
  commit_sha: string,    // pinned for reproducibility
  dockerfile: string,    // repo-relative path
  ports: number[],
  env?: Record<string, string>,  // ${SECRET} refs allowed
  replicas?: number,             // default 1
  resources?: { cpu, memory }
) -> { url, version, image_ref } | { denied, reason }
```

On approve:

- A Kaniko Job builds the Dockerfile against the agent’s repo at `commit_sha` and pushes to the in-cluster registry as `ws/<id>/app-<name>:<commit_sha>`.
- The API creates or updates the Deployment for `(app, branch)` with the new image.
- The API creates a Service + Ingress if they do not yet exist.
- The tool result returned to the agent includes the external URL and the image reference.
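The image-tagging convention in the build step is mechanical enough to pin down now. A minimal sketch, assuming the `ws/<id>/app-<name>:<commit_sha>` layout described above; nothing here exists in the codebase yet.

```typescript
// Builds the registry reference the Kaniko step would push to.
// Purely illustrative; the real pipeline's naming code is not written.
function imageRef(wsId: string, app: string, commitSha: string): string {
  return `ws/${wsId}/app-${app}:${commitSha}`;
}
```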
The agent can hand the URL back to its parent (an orchestrator) or to the user. Reviewers click the URL to see the change before merging the PR.
Use cases this unlocks
- Preview environments per PR. Human reviewer clicks a link to see the agent’s change running live, not just a diff.
- Parallel hypotheses. An orchestrator spawns three worker agents, each on its own branch. Each produces a running URL. The orchestrator (or the user) visits each, picks the winner, merges. Other branches auto-reap.
- Continuous self-deployment. A dark-factory agent working on an internal tool pushes to main, triggers a deploy, watches logs, iterates. No human in the merge-to-deploy loop for agents with persistent `deploy` grants.
Where the hard parts live
- Dev/prod data collapse. There is exactly one Postgres database per branch, shared by dev sessions and running branch-deploy pods alike. This is intentional (agents see one coherent state) but worth naming: a deploy of branch `feat-x` runs against the same rows the last `feat-x` dev session wrote. Main branch is live; every feature branch is its own sandbox.
- Secrets in deployed pods. Same workspace secret store, injected via `valueFrom.secretKeyRef` in the Deployment. No new primitive.
- Rollback. Version rows per `(app, branch)`; flip `current_version_id` and re-apply. The built image is still in the registry; no rebuild.
- Reaping. Daily sweep against GitHub’s branch list. Branch gone upstream -> delete Deployment, Service, Ingress, version rows, artifacts.
- Resource budgets. Every branch deploy counts against the workspace budget. Preview proliferation is a real risk; mitigate with a per-workspace “max concurrent branch deploys” setting and a reaper that ages out preview Deployments with no HTTP traffic for N days.
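The reaping rule reduces to a set difference. A sketch, assuming branch names are compared verbatim against GitHub’s branch list:

```typescript
// Returns the deployed branches whose upstream branch no longer exists.
// Everything returned would have its Deployment/Service/Ingress and rows deleted.
function branchesToReap(deployed: string[], upstream: string[]): string[] {
  const live = new Set(upstream);
  return deployed.filter((branch) => !live.has(branch));
}
```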
Where the operator burden lives
- Ingress controller. The operator runs one (nginx, Traefik, Istio gateway). x1agent emits plain Ingress resources; whichever controller is installed serves them.
- Wildcard DNS. The operator configures `*.<workspace-slug>.<cluster-domain>` to point at the Ingress controller, or delegates that responsibility to `external-dns` with a ZonedRecordSet. Per-workspace subdomains cap the DNS sprawl.
- Wildcard TLS. `cert-manager` with a wildcard DNS-01 challenge, one certificate per workspace subdomain. An ACME account is needed.
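Read in reverse, the wildcard hostname decomposes back into routing keys. A sketch, assuming hostnames always follow the `<branch>.<app>.<workspace-slug>.<cluster-domain>` pattern above and that no component contains a dot; no such router exists yet and malformed hosts are not handled.

```typescript
// Splits an incoming Host header back into (branch, app, slug).
function parseHost(host: string, clusterDomain: string) {
  const prefix = host.slice(0, -(clusterDomain.length + 1)); // drop ".<cluster-domain>"
  const [branch, app, slug] = prefix.split(".");
  return { branch, app, slug };
}
```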
None of these are exotic; they are standard K8s infrastructure. But they are three services the operator must configure before the feature works, which is why this proposal bundles an explicit opt-in.
Data model (sketch)
```
workspace_apps (
  id, workspace_id, name, default_branch, created_by, created_at
);

workspace_app_branches (
  id, app_id, branch_name, current_version_id,
  deployment_ref, service_ref, ingress_ref,
  url, status, created_at, reaped_at
);

workspace_app_versions (
  id, branch_id, commit_sha, image_ref, status, built_at, deployed_at
);

workspace_app_deploy_grants (
  id, app_id, agent_id,
  grant_type = 'deploy',  -- same permission system as spawn/request_service
  branch_pattern TEXT,    -- e.g. 'feat/*' or '*'
  granted_by, granted_at, revoked_at
);
```

Companion requirements if promoted
- In-cluster registry — required for hosting built app images. Already planned for Phase 2 of the image catalog work but would be promoted to “ship this first” if branch deploys is prioritized.
- Kaniko build pipeline — required for building app images from agent repos. Same module needed by the image catalog.
- Permission-grants extension — add a `deploy` grant type, plus UI for granting/revoking per agent + branch pattern.
- Workload reconciler — a K8s watcher that maps branch-deploy rows to Deployment/Service/Ingress state, heals drift, and reaps on branch deletion.
- Session UI updates — the agent’s session page should surface branch-deploy URLs in the event stream.
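The `branch_pattern` column in the grants table suggests glob-style matching, but its semantics are not specified anywhere yet. One plausible reading, supporting only bare `*` and trailing `/*` patterns, as a hypothetical sketch:

```typescript
// Hypothetical matcher for the branch_pattern grant column ('feat/*', '*', ...).
// Real semantics TBD; this assumes only bare '*' and trailing '/*' globs.
function branchPatternMatches(pattern: string, branch: string): boolean {
  if (pattern === "*") return true;
  if (pattern.endsWith("/*")) return branch.startsWith(pattern.slice(0, -1));
  return pattern === branch;
}
```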
Open questions that would need resolution before promotion
- Multi-repo agents. One app per repo, or one app across many repos? Currently an agent can attach multiple repos; how does `deploy_branch(app='api', ...)` pick which repo’s branch?
- Deploy triggers. Agent-driven only (the agent calls `deploy_branch` after finishing work), or also webhook-driven (GitHub push event -> auto-redeploy)? The former is dark-factory-native; the latter is more conventional and requires webhook plumbing.
- Promotion between branches. Is `main` just “another branch”, or does promoting `feat-x` to main mean something special (DB schema migration plan, data copy)?
- Cross-workspace linking. Can an agent in workspace A hit a branch-deploy URL in workspace B (e.g. for inter-service testing)? Defaulting to “no” keeps workspace isolation tight; allowing it via an explicit workspace trust edge is a whole other design.
- Observability. Deployed apps emit logs and metrics. Do those route through the same NATS event stream sessions use, or through a separate metrics pipeline? Prometheus scraping the workspace namespace works but changes the platform’s monitoring story.
Adjacent proposals
- Build pipeline as a provider — whether Kaniko is the one right answer or whether the build step should be a port with multiple adapters (Kaniko, Buildkit-rootless, Buildah). Parked; adjacent to branch deploys but independently decidable.
- Workspace apps as long-running stateful services — a cousin of this proposal for when the “app” is something like a MinIO that should be always-on, not branch-scoped.
Summary
Branch deploys are the dark-factory flywheel: agents produce running applications, not just code. The design maps cleanly onto existing x1agent primitives (workspace namespace, permission grants, in-cluster registry, credential injection). The parked status reflects scope and operator-burden concerns, not technical doubt. If and when this is promoted, the doc moves to `architecture/branch-deploys.md` and the proposal language is stripped.