# Shared agent resources
Agents write code that talks to databases. Agents iterate on that code across many sessions. If every session starts with an empty Postgres, agents re-run migrations on every turn and real work is impossible.
Shared agent resources are long-running workspace-scoped services — Postgres, Redis, and (over time) any other stateful engine an admin opts into — that agents in the workspace connect to. They survive session teardown. They are isolated per branch so two agents working on two branches never step on each other. They are installed from a catalog surfaced in the workspace settings UI.
They are not the control-plane database. That distinction is the first section of this document, by design.
Companion docs:
- Sessions — how session pods are spawned and how env is injected.
- Permission grants — the approval model this flow reuses at install time.
- MCP servers — the workspace secret store (same `${NAME}` syntax as env refs here).
- Siblings — the in-pod ephemeral-service pattern, distinct from shared agent resources. See Siblings vs. shared agent resources below.
## Boundary with the control-plane database

The cluster has two classes of Postgres. Never conflate them.
| | Control-plane database | Shared agent resources |
|---|---|---|
| What it holds | `workspaces`, `agents`, `sessions`, `permission_grants`, `audit_events`, `workspace_secrets` metadata | Tables agents create for the apps they are building |
| Who operates it | Whoever operates x1agent | Workspace admin, opted in per workspace |
| Namespace | `x1agent` | `ws-<workspace-id>` |
| Who sees it in UI | Nobody — operator concern | Workspace admins, under Settings -> Shared agent resources |
| Who connects to it | API, job-watcher, audit subscriber | Agent session pods only |
| Lifecycle | Lives with the cluster | Installed when an admin turns it on; reaped when the workspace is deleted |
## The isolation rules

Two hard rules. Not principles. Not best practices. Rules.
Rule 1: Agent pods cannot reach the control-plane database.
Enforced by three layers, each of which would independently be sufficient:
- NetworkPolicy. Session pods in `ws-<id>` are denied egress to `postgres.x1agent.svc.cluster.local:5432`.
- Architecture. The agent container holds no API token and has no network path to `/api/internal/*`. The sidecar holds `X-Internal-Token` and calls only scoped internal routes; no internal route exposes control-plane SQL or row access.
- Credential separation. The control-plane Postgres role does not exist in any workspace Postgres and vice versa. Different instances, different role catalogs.
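A sketch of what the NetworkPolicy layer could look like, expressed as the object a TypeScript client would apply. The label names and the exact allow list are assumptions, not the shipped manifests; the point is that NetworkPolicy is allow-list based, so blocking the control plane falls out of a default-deny egress posture that never allows it:

```ts
import type { V1NetworkPolicy } from "@kubernetes/client-node";

// Illustrative only: session pods get default-deny egress plus a narrow
// allow list (same-namespace traffic and DNS). postgres.x1agent.svc is
// unreachable simply because it is never allowed.
const sessionEgress: V1NetworkPolicy = {
  apiVersion: "networking.k8s.io/v1",
  kind: "NetworkPolicy",
  metadata: { name: "session-egress", namespace: "ws-abc" },
  spec: {
    podSelector: { matchLabels: { "x1agent/role": "session" } }, // hypothetical label
    policyTypes: ["Egress"],
    egress: [
      // same-namespace pods: the shared agent resources live here
      { to: [{ podSelector: {} }] },
      // DNS resolution
      {
        to: [{ namespaceSelector: { matchLabels: { "kubernetes.io/metadata.name": "kube-system" } } }],
        ports: [{ protocol: "UDP", port: 53 }, { protocol: "TCP", port: 53 }],
      },
    ],
  },
};
```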
Rule 2: Inside its branch database, an agent is owner and the platform does not police what it does.
The scoped role granted to an agent has:

- `LOGIN`
- `OWNER` of exactly `<repo>_<branch>`, and nothing else
- No `CREATEDB`, no `CREATEROLE`, no `SUPERUSER`
Inside that one database the agent can create tables, drop them, install schema-level extensions, load fixtures, and corrupt its own state at will. The platform does not inspect queries, freeze migrations, or otherwise second-guess agent behavior. If an agent nukes its schema, the recovery path is `POST /api/workspaces/:slug/shared-agent-resources/postgres/branches/:name/reset`, which drops the branch database and re-creates it from the main template. Branch data is scratch data. Ephemeral in principle, durable in practice; the agent owns it.
## The catalog

The workspace admin opens Settings -> Shared agent resources. The first screen lists every resource kind the platform knows how to install, in a catalog format:
| Kind | Versions | Provider |
|---|---|---|
| Postgres | 16, 15 | statefulset |
| Redis | 7 | statefulset |
| (future) MinIO | … | … |
| (future) Kafka | … | … |
Each entry in the catalog is a kind with one or more versions and a selected provider (the adapter that implements the actual provisioning). v1 ships Postgres and Redis, each with a single `statefulset` provider that works out of the box on OrbStack and any CNI-backed cluster.
The catalog is code-embedded for v1 (`packages/domains/agent-resources/catalog.ts`). A future release may allow operators to register additional kinds via a ConfigMap; that change is additive.
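A plausible shape for that embedded catalog. The type and field names below are a sketch, not the actual contents of `catalog.ts`; only the kinds, versions, and provider ids come from the table above:

```ts
// Sketch of the code-embedded catalog. Field names are assumptions.
type CatalogEntry = {
  kind: "postgres" | "redis";
  versions: string[];   // shown in the install dropdown, newest first
  providers: string[];  // adapter ids; v1 has exactly one per kind
  defaultProvider: string;
};

export const catalog: CatalogEntry[] = [
  { kind: "postgres", versions: ["16", "15"], providers: ["statefulset"], defaultProvider: "statefulset" },
  { kind: "redis",    versions: ["7"],        providers: ["statefulset"], defaultProvider: "statefulset" },
];
```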
## Installing a resource

```mermaid
sequenceDiagram
    participant UI as Browser
    participant API
    participant K8s as K8s API
    participant CP as Control plane Postgres
    UI->>API: POST /api/workspaces/:slug/shared-agent-resources<br/>{ kind: "postgres", version: "16", config: {...} }
    API->>API: validate kind + version + config
    API->>CP: INSERT workspace_shared_resources<br/>(workspace_id, kind, version, config, status=provisioning)
    API->>K8s: apply StatefulSet + Service + Secret (admin creds)
    K8s-->>API: resources created
    API->>K8s: wait for StatefulSet ready
    K8s-->>API: ready
    API->>CP: UPDATE ... SET status='running'
    API-->>UI: 201 Created
```
At install time the API generates an admin credential (a random 32-byte password) and writes it into a Kubernetes Secret in `ws-<id>`. That Secret is mounted into the API's minter path, not into any session pod. The admin credential is used only for provisioning per-branch roles and databases, never exposed to agents.
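The generation step itself is small. A sketch, assuming Node's `crypto` and `@kubernetes/client-node`; the Secret and key names are illustrative, and the call shape follows the 1.x client (older releases take positional arguments instead of an object):

```ts
import { randomBytes } from "node:crypto";
import { KubeConfig, CoreV1Api } from "@kubernetes/client-node";

// Sketch: mint a random admin password and store it as a K8s Secret in
// the workspace namespace. Names below are assumptions for illustration.
async function createAdminCredential(workspaceId: string, kind: string) {
  const password = randomBytes(32).toString("base64url"); // random 32-byte password

  const kc = new KubeConfig();
  kc.loadFromDefault();
  const core = kc.makeApiClient(CoreV1Api);

  await core.createNamespacedSecret({
    namespace: `ws-${workspaceId}`,
    body: {
      metadata: { name: `${kind}-admin-credential` }, // hypothetical name
      stringData: { password },
    },
  });
}
```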
Reinstall of the same kind is rejected in v1; one instance per kind per workspace. If the admin wants a different version they uninstall first, which drops the StatefulSet and every branch database with it.
## Uninstalling a resource

Explicit and destructive. The UI shows a confirmation with the list of branch databases that will be dropped. On confirm, the API deletes the StatefulSet, deletes the PVC, deletes every `workspace_*_branch_*` row for that resource, and deletes the admin credential Secret. No soft-delete, no snapshot, no archive. Admins who want to preserve data should dump before uninstalling.
## Per-branch isolation

Every session knows its `(repo, branch)` from agent config. At pod-spec generation, for each shared resource attached to the agent, the job-watcher provisions a branch-scoped slice.
### Postgres: one database + one role per (repo, branch)

The minter runs three statements, idempotently:
```sql
-- if the branch database does not exist, clone from the main template
CREATE DATABASE <repo>_<branch_id> WITH TEMPLATE <repo>_main OWNER postgres;

-- if the role does not exist, create it; otherwise rotate its password
CREATE ROLE <repo>_<branch_id> LOGIN PASSWORD '<rotated>';
-- or: ALTER ROLE <repo>_<branch_id> WITH PASSWORD '<rotated>';

-- grant ownership of the branch database to the role, idempotent
ALTER DATABASE <repo>_<branch_id> OWNER TO <repo>_<branch_id>;
REVOKE ALL ON DATABASE <repo>_<branch_id> FROM PUBLIC;
```

`<branch_id>` is a sanitized-plus-hashed form of the branch name: `feat/new-api` becomes `feat_new_api_a1b2c3d4`, where the suffix is the first eight hex chars of a hash of the original branch name. This avoids Postgres's 63-byte identifier limit and guarantees uniqueness between `feat/x-y` and `feat-x_y`.
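A sketch of that sanitize-plus-hash step. The hash function (SHA-256 here) and the truncation length are assumptions; the doc only pins down "first eight hex chars of a hash":

```ts
import { createHash } from "node:crypto";

// Derive the branch_id used as both database name and role name.
function branchId(branchName: string): string {
  const sanitized = branchName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")  // feat/new-api -> feat_new_api
    .replace(/^_+|_+$/g, "");
  const suffix = createHash("sha256") // hash choice is an assumption
    .update(branchName)
    .digest("hex")
    .slice(0, 8);
  // Truncate so "<repo>_<branch_id>" stays under Postgres's 63-byte limit;
  // the exact budget (40 here) is illustrative.
  return `${sanitized.slice(0, 40)}_${suffix}`;
}

branchId("feat/new-api"); // e.g. "feat_new_api_a1b2c3d4"
```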
`<repo>_main` is the template database seeded at resource install time. It is owned by `postgres` and never written to directly by any branch role. Agents that need to change the baseline schema do so by opening a PR on a migration file; the PR merges to main; the next agent starting a session against main sees the change.
### Redis: one ACL user + one key prefix per (repo, branch)

```
ACL SETUSER <repo>_<branch_id> on >password
  ~<repo>_<branch_id>:*  # key pattern
  &<repo>_<branch_id>:*  # pub/sub channel pattern
  +@all                  # all commands by default
  -@dangerous            # strip FLUSHDB / FLUSHALL / SHUTDOWN / DEBUG / CONFIG
```

The `~` and `&` clauses are enforced at the server. An agent that tries `KEYS *` or `FLUSHDB` receives a `NOPERM` error. The shared instance cannot be accidentally or maliciously nuked by a branch-scoped agent.
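On the minter side, issuing that ACL could look like the following sketch, using node-redis's generic `sendCommand`. The function name and admin-connection plumbing are assumptions:

```ts
import { createClient } from "redis";

// Sketch: create or refresh the branch-scoped ACL user over the admin
// connection. Rule order mirrors the ACL block above.
async function mintRedisBranchUser(adminUrl: string, branchId: string, password: string) {
  const admin = createClient({ url: adminUrl });
  await admin.connect();
  try {
    await admin.sendCommand([
      "ACL", "SETUSER", branchId,
      "on", `>${password}`,
      `~${branchId}:*`, // key pattern
      `&${branchId}:*`, // pub/sub channel pattern
      "+@all",          // all commands by default
      "-@dangerous",    // strips FLUSHDB / FLUSHALL / SHUTDOWN / DEBUG / CONFIG
    ]);
  } finally {
    await admin.quit();
  }
}
```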
## Credential delivery

Credentials reach agent containers through the `valueFrom.secretKeyRef` path, not through sidecar injection. This keeps the rule "secrets reach containers only via Kubernetes-native Secret references" intact.
```mermaid
sequenceDiagram
    participant JW as Job watcher
    participant M as Credential minter (API)
    participant WS as Workspace Postgres
    participant K8s as K8s API
    participant Pod as Session pod
    JW->>M: mint session creds (session_id, resources)
    M->>WS: CREATE DATABASE / ALTER ROLE (idempotent)
    M->>WS: new per-session password
    M-->>JW: DSN strings
    JW->>K8s: create Secret session-creds-<id><br/>(labeled session_id=<id>)
    JW->>K8s: create Pod referencing that Secret<br/>via valueFrom.secretKeyRef
    K8s->>Pod: DATABASE_URL, REDIS_URL in env
    Note over Pod: agent starts with env populated
```
On session completion the session reaper deletes the per-session Secret. The branch role and database remain; only the password has rotated.
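Concretely, the env entries on the agent container might look like this in the pod spec. A sketch: the Secret and key names follow the diagram above, and no credential value ever appears in the spec itself:

```ts
import type { V1EnvVar } from "@kubernetes/client-node";

// Env entries referencing the per-session Secret by name.
const sessionEnv: V1EnvVar[] = [
  {
    name: "DATABASE_URL",
    valueFrom: { secretKeyRef: { name: "session-creds-<id>", key: "DATABASE_URL" } },
  },
  {
    name: "REDIS_URL",
    valueFrom: { secretKeyRef: { name: "session-creds-<id>", key: "REDIS_URL" } },
  },
];
```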
## What the agent sees

Inside the session pod, the agent sees plain environment variables:
```
DATABASE_URL=postgresql://feat_new_api_a1b2c3d4:PASS@postgres.ws-abc.svc.cluster.local:5432/feat_new_api_a1b2c3d4
REDIS_URL=redis://feat_new_api_a1b2c3d4:PASS@redis.ws-abc.svc.cluster.local:6379/0
```

Plus a two-paragraph block appended to the agent's system prompt:
> You have a Postgres database at `$DATABASE_URL`. It is scoped to your current branch. Migrations, schemas, fixtures, and any other state you create persist across sessions on this branch. On branch deletion the database is dropped; no other action on your part is needed.
>
> You have a Redis cache at `$REDIS_URL`. All keys are prefixed automatically by your branch scope; you read and write unprefixed keys and the server handles isolation.
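From the agent's side this is ordinary client code; for example, with `pg` and node-redis:

```ts
import { Pool } from "pg";
import { createClient } from "redis";

// Plain client usage inside the session pod: the injected env vars are
// all the agent needs; isolation is enforced server-side.
async function main() {
  const db = new Pool({ connectionString: process.env.DATABASE_URL });
  const cache = createClient({ url: process.env.REDIS_URL });
  await cache.connect();

  const { rows } = await db.query("SELECT now()");
  await cache.set("last_checked", rows[0].now.toISOString());

  await cache.quit();
  await db.end();
}

main();
```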
## Data model

Three tables in the control-plane database track state; none of them hold secrets.
```sql
-- one row per resource an admin has installed
CREATE TABLE workspace_shared_resources (
  id               UUID PRIMARY KEY,
  workspace_id     UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
  kind             TEXT NOT NULL,   -- 'postgres' | 'redis' | ...
  version          TEXT NOT NULL,   -- '16', '7', ...
  config           JSONB NOT NULL,  -- storage_size, resource limits, etc.
  provider         TEXT NOT NULL,   -- adapter id, e.g. 'statefulset'
  admin_secret_ref TEXT NOT NULL,   -- K8s Secret name, NOT the value
  status           TEXT NOT NULL,   -- 'provisioning' | 'running' | 'failed'
  installed_by     UUID NOT NULL REFERENCES users(id),
  installed_at     TIMESTAMPTZ NOT NULL,
  UNIQUE (workspace_id, kind)
);
```
```sql
-- per-branch postgres metadata
CREATE TABLE workspace_postgres_branches (
  id           UUID PRIMARY KEY,
  resource_id  UUID NOT NULL REFERENCES workspace_shared_resources(id) ON DELETE CASCADE,
  repo_id      UUID NOT NULL REFERENCES agent_repos(id) ON DELETE CASCADE,
  branch_name  TEXT NOT NULL,
  branch_id    TEXT NOT NULL,  -- sanitized+hashed; used as db and role name
  last_used_at TIMESTAMPTZ NOT NULL,
  reaped_at    TIMESTAMPTZ,
  UNIQUE (resource_id, repo_id, branch_name)
);
```
```sql
-- per-branch redis metadata
CREATE TABLE workspace_redis_branches (
  id           UUID PRIMARY KEY,
  resource_id  UUID NOT NULL REFERENCES workspace_shared_resources(id) ON DELETE CASCADE,
  repo_id      UUID NOT NULL REFERENCES agent_repos(id) ON DELETE CASCADE,
  branch_name  TEXT NOT NULL,
  branch_id    TEXT NOT NULL,
  last_used_at TIMESTAMPTZ NOT NULL,
  reaped_at    TIMESTAMPTZ,
  UNIQUE (resource_id, repo_id, branch_name)
);
```

Per-engine tables on purpose; a single generic `workspace_branch_resources` table would lose information every engine wants at query time.
## Branch reaper

A branch database or ACL user persists until the branch no longer exists in its repo. A daily reaper sweep, for each `workspace_postgres_branches` and `workspace_redis_branches` row where `reaped_at IS NULL`:

1. Lists branches in the repo via the workspace's GitHub credential proxy.
2. If the branch is not in the list, reaps:
   - Postgres: `DROP DATABASE <branch_id>; DROP ROLE <branch_id>;`
   - Redis: `ACL DELUSER <branch_id>;` then async `SCAN` + `UNLINK` on the prefix.
3. Sets `reaped_at`.
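A sketch of that sweep loop. `listRepoBranches`, `dropPostgresBranch`, `dropRedisUser`, and `markReaped` are hypothetical helpers standing in for the GitHub credential proxy and the engine adapters:

```ts
type BranchRow = {
  id: string;
  repoId: string;
  branchName: string;
  branchId: string; // sanitized+hashed; db/role name or ACL user
  engine: "postgres" | "redis";
};

declare function listRepoBranches(repoId: string): Promise<string[]>;
declare function dropPostgresBranch(branchId: string): Promise<void>; // DROP DATABASE; DROP ROLE
declare function dropRedisUser(branchId: string): Promise<void>;      // ACL DELUSER + SCAN/UNLINK
declare function markReaped(rowId: string): Promise<void>;            // set reaped_at

async function reapSweep(rows: BranchRow[]): Promise<void> {
  for (const row of rows) {                          // rows where reaped_at IS NULL
    const live = await listRepoBranches(row.repoId); // via GitHub credential proxy
    if (live.includes(row.branchName)) continue;     // branch still exists: leave it

    if (row.engine === "postgres") await dropPostgresBranch(row.branchId);
    else await dropRedisUser(row.branchId);

    await markReaped(row.id);                        // never reap twice
  }
}
```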
Webhooks are an optional faster path: the workspace admin can configure GitHub to POST to `/api/workspaces/:slug/webhooks/github`, and the API reacts to branch delete events as they arrive instead of waiting for the daily sweep. Webhooks are a Phase-2 convenience; the daily reaper is the authoritative cleanup.
Manual reset is also available: `POST /api/workspaces/:slug/shared-agent-resources/postgres/branches/:name/reset` drops the branch database and re-creates it from the main template. Useful when an agent wrecks its own state.
## Provider shape

Each engine lives in its own bounded context under `packages/domains/`:
```
packages/domains/agent-resources-postgres/
  src/
    domain/      identifiers, DSN value object, branch-id hashing
    ports/       AdminProvisioner (installer-side: install/uninstall, set up main DB)
                 BranchMinter (session-side: ensure branch DB + role, rotate password)
    adapters/
      statefulset/   concrete adapter for the in-cluster StatefulSet engine
    fakes/       in-memory fake for unit tests

packages/domains/agent-resources-redis/   (same shape)
```

The port shape is engine-specific because the primitives differ: Postgres provisions databases and roles; Redis provisions ACL users. Attempting to unify them at the port level loses too much information.
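Sketched in TypeScript, the two Postgres ports might look like this; the method names are assumptions, the responsibilities come from the directory listing above:

```ts
// Hedged sketch of the Postgres ports, not the actual interfaces.
export interface AdminProvisioner {
  // Installer-side: create StatefulSet/Service/admin Secret, seed <repo>_main.
  install(workspaceId: string, version: string, config: Record<string, unknown>): Promise<void>;
  uninstall(workspaceId: string): Promise<void>;
}

export interface BranchMinter {
  // Session-side: ensure branch DB + role exist, rotate the password,
  // and hand back a DSN for the per-session Secret.
  mintBranchCredential(repo: string, branchName: string): Promise<{ dsn: string }>;
}
```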
v1 ships one adapter per engine and therefore no contract-test suite yet. CLAUDE.md requires a contract suite once a port has multiple adapters; with one adapter, the adapter’s own integration tests are the contract. A second adapter (CloudNativePG for Postgres, Upstash for Redis, Neon for external-managed, etc.) triggers the move to a contract suite in the port package. Future docs will describe that migration.
Helm values select the adapter:

```yaml
sharedAgentResources:
  postgres:
    adapter: statefulset  # the only option in v1
  redis:
    adapter: statefulset
```

A fresh OrbStack install with default values has zero resources provisioned; nothing runs in `ws-<id>` until an admin clicks Install. Resource usage scales with actual demand.
## OrbStack dev

Everything in this document works under `mise run dev` on OrbStack with no operator install. The statefulset adapters use the default storage class (OrbStack provisions hostPath PVs on demand). The first install of Postgres 16 pulls `postgres:16` from Docker Hub; once the in-cluster registry is deployed, a local pull-through cache will cut this to seconds after the first use. Redis 7 is the same.
Integration tests against the real StatefulSet run in CI against an OrbStack cluster spun up by the devcontainer; no mocking at the engine level.
## Siblings vs. shared agent resources

Siblings are ephemeral service containers that live inside the session pod and die with it. Shared agent resources are long-running services that live in the workspace namespace and survive every session.
| Concern | Use siblings | Use shared agent resources |
|---|---|---|
| "Every session gets a clean Postgres, no state carry-over" | yes | no |
| "Agents iterate on the same schema across many sessions" | no | yes |
| "MailHog to capture outbound email during tests" | yes | no |
| "Fake S3 for test fixtures" | yes | no |
| "Headless chromium for scraping" | yes | no |
| "Shared cache that accumulates across runs" | no | yes |
When both patterns could work, prefer shared agent resources for stateful engines and siblings for ephemeral test fixtures. A workspace that runs serious application development installs Postgres and Redis as shared resources and uses siblings for MailHog, fake S3, and a headless browser.
## Audit events

Every install, uninstall, branch provision, branch reap, and branch reset emits an audit event. Event shape mirrors permission grants:
- `workspace.shared_resource.installed` — actor, workspace, kind, version.
- `workspace.shared_resource.uninstalled` — count of branch databases destroyed.
- `workspace.shared_resource.branch.provisioned` — resource_id, repo_id, branch.
- `workspace.shared_resource.branch.reset` — manual drop-and-re-template.
- `workspace.shared_resource.branch.reaped` — by clean branch delete or by the daily sweep (a reason field distinguishes the two).
No secret values appear in any event.
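A hypothetical payload shape for these events, mirroring the names above; the field names are illustrative:

```ts
// Sketch only: field names are assumptions. Note that no field carries
// credential material, per the rule above.
type SharedResourceAuditEvent =
  | { type: "workspace.shared_resource.installed";
      actor: string; workspace: string; kind: string; version: string }
  | { type: "workspace.shared_resource.uninstalled";
      actor: string; workspace: string; kind: string; branchDatabasesDestroyed: number }
  | { type: "workspace.shared_resource.branch.provisioned";
      resourceId: string; repoId: string; branch: string }
  | { type: "workspace.shared_resource.branch.reset";
      resourceId: string; repoId: string; branch: string }
  | { type: "workspace.shared_resource.branch.reaped";
      resourceId: string; repoId: string; branch: string;
      reason: "branch_deleted" | "daily_sweep" };
```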
## Failure modes

| Failure | Behavior |
|---|---|
| StatefulSet never becomes ready (PVC pending, node pressure) | Resource stays in `status='provisioning'` for up to 10 minutes, then flips to `failed` with a structured reason. Admin can retry or uninstall. |
| Branch minter SQL fails at session start | Session pod is not created. Session status becomes `failed` with a structured reason event on the session detail page. |
| Agent exhausts disk in its branch database | The instance-level PVC is shared across all branches and fills up; the admin sees a health alert on the resource row. Recovery: expand the PVC, or uninstall and reinstall with larger storage. |
| Reaper’s GitHub list call fails | Reaper logs and retries on next sweep. Branch data persists; no destructive action taken under uncertainty. |
| Webhook delete event received for a branch that still exists (GitHub race) | Idempotent; the reaper's next scheduled sweep is authoritative. A webhook-driven drop is only executed after a confirmation list call. |
## What this is not

- A deploy target for apps the agent builds. Shared agent resources are engines agents connect to, not workloads the agent ships. For the parked proposal on agent-shipped workloads and branch preview deploys, see `proposals/branch-deploys.md`.
- A secret store. Workspace secrets live in the MCP servers workspace secret store; admin DSN credentials for shared resources are stored in Kubernetes Secrets and referenced by name, never copied by value.
- A full-fidelity Postgres/Redis operator. The `statefulset` adapter is deliberately minimal (single replica, no HA, no backups). Operators who need production-grade stateful engines should, when we ship it, select the `cloudnative-pg` or another operator-based adapter for Postgres and the equivalent for Redis.
- A cross-workspace sharing boundary. Every resource is workspace-scoped. Agents in workspace A cannot reach workspace B's Postgres even if both have Postgres installed.
## Summary

- Two Postgres classes in the cluster, never conflated: the control-plane database and shared agent resources.
- Agent pods are denied network egress to the control-plane database by NetworkPolicy and hold no credentials for it.
- Admins install resources from a catalog (Postgres 16, Redis 7 in v1) under Settings -> Shared agent resources.
- Each installed resource is a long-running StatefulSet + Service in the workspace namespace.
- Sessions on a given `(repo, branch)` receive a scoped credential to a per-branch database (Postgres) or ACL user (Redis). State persists across sessions on the same branch and is reaped when the branch is deleted upstream.
- Agents receive credentials as `DATABASE_URL` / `REDIS_URL` env vars via `valueFrom.secretKeyRef`; the agent system prompt is augmented with a two-paragraph usage block.
- One bounded context per engine (`agent-resources-postgres`, `agent-resources-redis`); one adapter per engine in v1; a contract suite is added when a second adapter lands.
- Deploying apps agents build back into the cluster is out of scope for this architecture; see the parked proposal.