Deployment Guide¶
This guide covers running Cognition in production with PostgreSQL persistence, Docker sandbox execution, and the full observability stack.
Overview¶
The production topology includes 8 services:
| Service | Image | Purpose |
|---|---|---|
| cognition | cognition:latest | The API server |
| postgres | postgres:16 | Durable session and message storage |
| mlflow | ghcr.io/mlflow/mlflow:v3.10.0 | Experiment tracking |
| prometheus | prom/prometheus:latest | Metrics collection |
| grafana | grafana/grafana:latest | Dashboards |
| otel-collector | otel/opentelemetry-collector-contrib:latest | Trace collection and routing |
| loki | grafana/loki:latest | Log aggregation |
| promtail | grafana/promtail:latest | Log shipping from containers |
All services communicate on a cognition-network bridge network.
Prerequisites¶
- Docker Engine 24+ and Docker Compose v2
- At least 4 GB free RAM for the full stack (2 GB minimum for Cognition + Postgres only)
- An LLM provider API key
Step 1 — Build the Sandbox Image¶
The sandbox image is required when COGNITION_SANDBOX_BACKEND=docker. It defines the execution environment for agent code.
The sandbox image is minimal by design: a read-only root filesystem, no shell, no network tools, and only the packages needed to run Python code.
Step 2 — Build the Cognition Image¶
The Dockerfile is a multi-stage build. The final image contains only the application and its runtime dependencies.
Step 3 — Configure Environment¶
Copy .env.example to .env and fill in your values:
Minimum required settings:
# LLM provider (configured in .cognition/config.yaml or via the API)
# The llm: section of config.yaml is seeded into ConfigRegistry on first startup
OPENAI_API_KEY=sk-...
# Database (matches docker-compose.yml service)
COGNITION_PERSISTENCE_BACKEND=postgres
COGNITION_PERSISTENCE_URI=postgresql://cognition:cognition@postgres:5432/cognition
POSTGRES_USER=cognition
POSTGRES_PASSWORD=cognition
POSTGRES_DB=cognition
# Sandbox
COGNITION_SANDBOX_BACKEND=docker
# Observability (optional but recommended)
COGNITION_OTEL_ENABLED=true
COGNITION_OTEL_ENDPOINT=http://otel-collector:4317
COGNITION_MLFLOW_ENABLED=true
COGNITION_MLFLOW_TRACKING_URI=http://mlflow:5000
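A mismatch between COGNITION_PERSISTENCE_URI and the POSTGRES_* values is an easy mistake to make when editing .env by hand. The following is a minimal sketch of a pre-flight check, assuming .env contains plain KEY=VALUE lines. The helper names parse_env and check_env are hypothetical and are not part of Cognition. The LLM key is left out of the required list because its variable name depends on the provider.

```python
from urllib.parse import unquote, urlsplit

# Keys taken from the settings listed above (LLM key omitted: provider-specific).
REQUIRED_KEYS = (
    "COGNITION_PERSISTENCE_BACKEND",
    "COGNITION_PERSISTENCE_URI",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "COGNITION_SANDBOX_BACKEND",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_env(settings: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not settings.get(key)]
    uri = settings.get("COGNITION_PERSISTENCE_URI", "")
    if not uri:
        return problems
    parts = urlsplit(uri)
    if not parts.scheme.startswith("postgres"):
        problems.append(f"unexpected URI scheme: {parts.scheme!r}")
    # The URI should agree with the values the postgres container is created with.
    from_uri = {
        "POSTGRES_USER": unquote(parts.username or ""),
        "POSTGRES_PASSWORD": unquote(parts.password or ""),
        "POSTGRES_DB": unquote(parts.path.lstrip("/")),
    }
    for key, uri_value in from_uri.items():
        value = settings.get(key)
        if value and value != uri_value:
            problems.append(f"{key} differs from the value in COGNITION_PERSISTENCE_URI")
    return problems
```

To use it, read the file with `pathlib.Path(".env").read_text()`, pass the result through parse_env, and print whatever check_env returns before starting the stack.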
Step 4 — Start the Stack¶
Full Stack (all 8 services)¶
Minimal Stack (Cognition + Postgres only)¶
Verify Health¶
# Cognition API
curl -s http://localhost:8000/health | jq .
# PostgreSQL
docker-compose exec postgres pg_isready -U cognition
# MLflow
curl -s http://localhost:5000/health
# Prometheus
curl -s http://localhost:9090/-/ready
# Grafana
curl -s http://localhost:3000/api/health
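Services may take a while to come up after the stack starts, so a single curl can fail even when nothing is wrong. The sketch below polls the same HTTP endpoints until they all respond or a retry budget runs out. It assumes the default localhost ports used in the commands above. The function names, retry count, and delay are illustrative choices, not Cognition defaults.

```python
import http.client
import time
import urllib.request

# Endpoints copied from the curl commands above.
HEALTH_URLS = {
    "cognition": "http://localhost:8000/health",
    "mlflow": "http://localhost:5000/health",
    "prometheus": "http://localhost:9090/-/ready",
    "grafana": "http://localhost:3000/api/health",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses, so refused
        # connections, timeouts, and 4xx/5xx responses all land here.
        return False


def wait_until_healthy(urls, probe=http_ok, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll until every URL passes or attempts run out; return names still failing."""
    pending = dict(urls)
    for attempt in range(attempts):
        pending = {name: url for name, url in pending.items() if not probe(url)}
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay)
    return sorted(pending)
```

Calling `wait_until_healthy(HEALTH_URLS)` returns an empty list once every service has answered. Any names left in the list are the services worth inspecting with `docker-compose logs`. PostgreSQL is not covered here because it has no HTTP endpoint; the pg_isready command above remains the check for it.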
Step 5 — Database Migrations¶
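Cognition uses Alembic for schema management. The schema is also created automatically at startup: SqliteStorageBackend and PostgresStorageBackend both call metadata.create_all() during initialize(). Because create_all() only creates tables that are missing and does not alter existing ones, changes to an existing schema still go through the Alembic commands.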
For explicit migration management:
# Apply latest schema
docker-compose exec cognition cognition db upgrade
# Check current revision
docker-compose exec cognition cognition db current
# Create a new migration (after changing schema.py)
docker-compose exec cognition cognition db migrate "description"
Service Configuration Details¶
Cognition Server¶
The cognition service in docker-compose.yml mounts:
- /var/run/docker.sock — Required for the Docker sandbox backend to create containers
- ./workspace — Host workspace directory mapped into the container
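Mounting the host's Docker socket means sandbox containers are created by the host's Docker daemon rather than by a nested one. The host daemon must be accessible, and the cognition user must have permission to use the socket. Access to this socket is effectively root access on the host, so limit who can reach the cognition container.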
PostgreSQL¶
postgres:
  image: postgres:16
  environment:
    POSTGRES_USER: cognition
    POSTGRES_PASSWORD: cognition
    POSTGRES_DB: cognition
  volumes:
    - pgdata:/var/lib/postgresql/data
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U cognition"]
    interval: 10s
    retries: 5
Data is persisted in the named pgdata volume. Back up this volume before upgrades.
OTel Collector¶
The collector receives OTLP over gRPC on port 4317, processes the incoming telemetry, and exports it to:
- MLflow (via OTLP HTTP)
- Loki (logs via the loki exporter)
Configuration: docker/otel-collector-config.yml.
Grafana¶
Pre-built dashboards are provisioned automatically from docker/grafana/dashboards/. The Grafana admin UI is available at http://localhost:3000 (default credentials: admin/admin).
Production Hardening¶
Network Isolation¶
Sandbox containers run with --network none by default, preventing agents from accessing the internet or internal services:
If agents need internet access (e.g. for web search), create a dedicated restricted network instead of using bridge:
docker network create --driver bridge --opt com.docker.network.bridge.name=agent-net \
--subnet 172.20.0.0/24 agent-restricted
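Docker rejects a new network whose subnet overlaps an existing one, so it can be worth checking the chosen range first. This is a minimal sketch using Python's ipaddress module. The list of existing subnets is something you would collect yourself (for example from `docker network inspect`), and the function name is hypothetical.

```python
import ipaddress


def overlapping_subnets(candidate: str, existing: list[str]) -> list[str]:
    """Return the existing subnets that overlap the candidate CIDR range."""
    wanted = ipaddress.ip_network(candidate)
    return [cidr for cidr in existing if wanted.overlaps(ipaddress.ip_network(cidr))]
```

An empty result for `overlapping_subnets("172.20.0.0/24", existing)` suggests the subnet in the command above is free on that host. A non-empty result names the networks to avoid.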
Resource Limits¶
Prevent runaway agent workloads from starving other services:
Session Scoping¶
Enable multi-tenant isolation with builder-defined scope keys:
Scope keys are builder-defined — choose keys that match your tenancy model. Your upstream API gateway or reverse proxy must inject X-Cognition-Scope-{key} headers for each configured key based on your authentication layer.
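The header mapping can be sketched as a small function that a gateway or authentication middleware might call after verifying the caller. This is an illustration under assumptions: it takes the header name to be the X-Cognition-Scope- prefix followed by the key exactly as configured, and the example keys tenant and user are hypothetical.

```python
import re

HEADER_PREFIX = "X-Cognition-Scope-"
# Conservative subset of the characters allowed in HTTP header names.
_VALID_KEY = re.compile(r"^[A-Za-z0-9_-]+$")


def scope_headers(identity: dict[str, str], configured_keys: list[str]) -> dict[str, str]:
    """Build one X-Cognition-Scope-{key} header per configured scope key."""
    headers = {}
    for key in configured_keys:
        if not _VALID_KEY.match(key):
            raise ValueError(f"invalid scope key: {key!r}")
        value = identity.get(key)
        if not value:
            # Fail closed: a request without a full scope should not be forwarded.
            raise KeyError(f"no value for scope key {key!r}")
        if "\r" in value or "\n" in value:
            raise ValueError("scope values must not contain line breaks")
        headers[f"{HEADER_PREFIX}{key}"] = value
    return headers
```

For example, `scope_headers({"tenant": "acme", "user": "u-42"}, ["tenant", "user"])` produces the two headers to attach to the upstream request. Because these headers carry the tenancy decision, the proxy should also drop any X-Cognition-Scope-* headers sent by the client before adding its own.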
TLS / Reverse Proxy¶
Cognition does not terminate TLS. Run it behind a reverse proxy (Nginx, Caddy, AWS ALB) that handles TLS termination.
Nginx example:
location / {
    proxy_pass http://cognition:8000;
    proxy_http_version 1.1;
    proxy_set_header Connection "";   # Required for SSE
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_buffering off;              # Required for SSE
    proxy_cache off;                  # Required for SSE
    proxy_read_timeout 300s;          # Long timeout for SSE streams
}
SSE requires proxy_buffering off. Without this, tokens will not stream to clients.
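One way to confirm that the proxy is streaming rather than buffering is to read a response incrementally and record when each event arrives. The sketch below parses the standard server-sent events wire format (data: lines terminated by a blank line) from any iterable of byte lines. It makes no assumption about Cognition's event payloads, and the function names are hypothetical.

```python
import time
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format allows one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


def arrival_gaps(lines: Iterable[bytes], clock=time.monotonic) -> list[float]:
    """Return the seconds elapsed between consecutive events as they are read."""
    gaps: list[float] = []
    last = None
    for _ in iter_sse_data(lines):
        now = clock()
        if last is not None:
            gaps.append(now - last)
        last = now
    return gaps
```

The response object returned by `urllib.request.urlopen` can be iterated line by line, so it can be passed straight to arrival_gaps. If a response that takes several seconds to generate shows gaps that are all close to zero, the events probably arrived in one burst, which points to buffering somewhere between Cognition and the client.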
CORS¶
Set specific origins in production:
Secret Management¶
Never put API keys in YAML config files. Use:
- .env files (for Docker Compose; excluded from version control via .gitignore)
- Docker secrets for Swarm deployments
- AWS Secrets Manager / HashiCorp Vault for Kubernetes
Kubernetes¶
Deployment¶
The Cognition container is stateless (all state in PostgreSQL). Use a standard Deployment with horizontal scaling:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cognition
spec:
  replicas: 3
  selector:                  # Required by apps/v1; must match the pod template labels
    matchLabels:
      app: cognition         # Label is an assumption; use your own convention
  template:
    metadata:
      labels:
        app: cognition
    spec:
      containers:
        - name: cognition
          image: cognition:latest
          ports:
            - containerPort: 8000
            - containerPort: 9090  # Prometheus metrics
          env:
            - name: COGNITION_PERSISTENCE_BACKEND
              value: postgres
            - name: COGNITION_PERSISTENCE_URI
              valueFrom:
                secretKeyRef:
                  name: cognition-secrets
                  key: database-url
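The database-url value stored in cognition-secrets has to be a valid URI, and generated passwords often contain characters such as @, : or / that break URI parsing unless they are percent-encoded. The following is a minimal sketch for assembling the value. The function name is hypothetical, and it assumes the database driver decodes percent-encoded credentials, as libpq-style connection URIs do.

```python
from urllib.parse import quote


def postgres_uri(user: str, password: str, host: str, database: str, port: int = 5432) -> str:
    """Assemble a postgresql:// URI with percent-encoded credentials."""
    # safe="" makes quote() encode "/", ":" and "@" as well as other reserved characters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )
```

With the values from the Compose example, `postgres_uri("cognition", "cognition", "postgres", "cognition")` reproduces the URI shown in Step 3. A password such as `p@ss/w:rd` comes out as `p%40ss%2Fw%3Ard`.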
Kubernetes Sandbox Backend¶
When deploying Cognition on Kubernetes, the Docker sandbox backend does not work (the server pod runs with readOnlyRootFilesystem: true, capabilities.drop: ["ALL"], and runAsNonRoot: true). Use the kubernetes sandbox backend instead, which creates isolated sandbox pods via the agent-sandbox CRD and controller.
Prerequisites (install before deploying Cognition):
| Prerequisite | Install | Purpose |
|---|---|---|
| agent-sandbox controller (v0.3.10+) | kubectl apply -f .../v0.3.10/manifest.yaml | Reconciles Sandbox CRs into pods |
| agent-sandbox extensions | kubectl apply -f .../v0.3.10/extensions.yaml | SandboxTemplate, SandboxClaim CRDs |
| sandbox-router Deployment + Service | Deploy from agent-sandbox router | Proxies commands to sandbox pods |
These are cluster-scoped infrastructure and are not bundled in Cognition's Helm chart.
SandboxTemplate — create a CR defining the sandbox pod spec:
apiVersion: extensions.agents.x-k8s.io/v1alpha1
kind: SandboxTemplate
metadata:
  name: cognition-sandbox
  namespace: cognition
spec:
  networkPolicyManagement: Managed
  podTemplate:
    spec:
      containers:
        - name: python-runtime
          image: us-central1-docker.pkg.dev/k8s-staging-images/agent-sandbox/python-runtime-sandbox:latest-main
          ports:
            - containerPort: 8888
          securityContext:
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: workspace
              mountPath: /workspace
      volumes:
        - name: tmp
          emptyDir:
            sizeLimit: "128Mi"
        - name: workspace
          emptyDir:
            sizeLimit: "1Gi"
The /tmp and /workspace emptyDir mounts are required. The runtime image uses readOnlyRootFilesystem: true, so without writable mount points, file operations that write temporary data will fail.
Helm values — enable the K8s sandbox backend:
config:
  sandbox:
    backend: kubernetes
    k8s:
      template: cognition-sandbox
      namespace: cognition
      routerUrl: http://sandbox-router-svc.cognition.svc.cluster.local:8080
      ttl: 3600
      denyEgress: true  # Optional: deny all egress from sandbox pods
The Helm chart automatically creates the required RBAC (namespace-scoped Role for sandbox lifecycle + cluster-scoped ClusterRole for CRD reads) when backend=kubernetes.
Startup validation — Cognition checks at startup that the agent-sandbox CRDs exist and the router is reachable. If CRDs are missing, the server fails to start with a clear error message.
See Kubernetes Sandbox for architecture details, scoping labels, and the two-package design.
Health Probes¶
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /ready
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
Monitoring¶
Prometheus Scrape Config¶
scrape_configs:
  - job_name: cognition
    static_configs:
      - targets: ["cognition:9090"]
    scrape_interval: 15s
Key Metrics to Alert On¶
| Metric | Alert Condition | Description |
|---|---|---|
| cognition_requests_total{status=~"5.."} | Rate > 0 sustained | Server-side errors |
| cognition_llm_call_duration_seconds | p99 > 30s | LLM latency degradation |
| cognition_tool_calls_total{status="error"} | Rate spike | Tool execution failures |
| cognition_active_sessions | Near COGNITION_MAX_SESSIONS | Session limit approaching |
Upgrading¶
- Pull the new image: docker pull cognition:latest
- Run migrations: docker-compose exec cognition cognition db upgrade
- Rolling restart: docker-compose up -d --no-deps cognition
The StorageBackend.initialize() call at startup is idempotent — it is safe to run against an existing database.