Deployment Guide¶

This guide covers running Cognition in production with PostgreSQL persistence, Docker sandbox execution, and the full observability stack.

Overview¶

The production topology includes 8 services:

| Service | Image | Purpose |
|---|---|---|
| `cognition` | `cognition:latest` | The API server |
| `postgres` | `postgres:16` | Durable session and message storage |
| `mlflow` | `ghcr.io/mlflow/mlflow:v3.10.0` | Experiment tracking |
| `prometheus` | `prom/prometheus:latest` | Metrics collection |
| `grafana` | `grafana/grafana:latest` | Dashboards |
| `otel-collector` | `otel/opentelemetry-collector-contrib:latest` | Trace collection and routing |
| `loki` | `grafana/loki:latest` | Log aggregation |
| `promtail` | `grafana/promtail:latest` | Log shipping from containers |

All services communicate on a cognition-network bridge network.

Prerequisites¶

- Docker Engine 24+ and Docker Compose v2
- At least 4 GB free RAM for the full stack (2 GB minimum for Cognition + Postgres only)
- An LLM provider API key

Step 1 — Build the Sandbox Image¶

The sandbox image is required when COGNITION_SANDBOX_BACKEND=docker. It defines the execution environment for agent code.

docker build -f Dockerfile.sandbox -t cognition-sandbox:latest .

The sandbox image is minimal by design: a read-only root filesystem, no shell, no network tools, and only the packages needed to run Python code.

Step 2 — Build the Cognition Image¶

docker build -t cognition:latest .

The Dockerfile is a multi-stage build. The final image contains only the application and its runtime dependencies.

Step 3 — Configure Environment¶

Copy .env.example to .env and fill in your values:

cp .env.example .env

Minimum required settings:

# LLM provider key (the provider itself is set in .cognition/config.yaml or via the API).
# The llm: section of config.yaml is seeded into ConfigRegistry on first startup.
OPENAI_API_KEY=sk-...

# Database (matches docker-compose.yml service)
COGNITION_PERSISTENCE_BACKEND=postgres
COGNITION_PERSISTENCE_URI=postgresql://cognition:cognition@postgres:5432/cognition
POSTGRES_USER=cognition
POSTGRES_PASSWORD=cognition
POSTGRES_DB=cognition

# Sandbox
COGNITION_SANDBOX_BACKEND=docker

# Observability (optional but recommended)
COGNITION_OTEL_ENABLED=true
COGNITION_OTEL_ENDPOINT=http://otel-collector:4317
COGNITION_MLFLOW_ENABLED=true
COGNITION_MLFLOW_TRACKING_URI=http://mlflow:5000
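
Because the connection URI repeats the Postgres user, password, and database name, a typo in one place only shows up as a connection failure at startup. The following is a minimal pre-flight sketch, not part of Cognition: `parse_env` and `check_env` are hypothetical helpers, and the list of required keys is an assumption taken from the settings shown above.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Assumed minimum set, taken from the example .env above.
REQUIRED = (
    "COGNITION_PERSISTENCE_BACKEND",
    "COGNITION_PERSISTENCE_URI",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "COGNITION_SANDBOX_BACKEND",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def check_env(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing {key}" for key in REQUIRED if not env.get(key)]
    if problems:
        return problems
    uri = urlsplit(env["COGNITION_PERSISTENCE_URI"])
    if uri.username != env["POSTGRES_USER"]:
        problems.append("URI user does not match POSTGRES_USER")
    if uri.password != env["POSTGRES_PASSWORD"]:
        problems.append("URI password does not match POSTGRES_PASSWORD")
    if uri.path.lstrip("/") != env["POSTGRES_DB"]:
        problems.append("URI database does not match POSTGRES_DB")
    return problems
```

A typical use would be `check_env(parse_env(open(".env").read()))` before starting the stack, failing the deploy if the returned list is not empty.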

Step 4 — Start the Stack¶

Full Stack (all 8 services)¶

docker compose up -d

Minimal Stack (Cognition + Postgres only)¶

docker compose up -d cognition postgres

Verify Health¶

# Cognition API
curl -s http://localhost:8000/health | jq .

# PostgreSQL
docker compose exec postgres pg_isready -U cognition

# MLflow
curl -s http://localhost:5000/health

# Prometheus
curl -s http://localhost:9090/-/ready

# Grafana
curl -s http://localhost:3000/api/health
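
Services take a while to come up after the stack starts, so a single curl can fail even on a healthy deployment. The sketch below polls the HTTP endpoints listed above until they all answer with a 2xx status. It uses only the standard library; `wait_healthy` and its retry defaults are illustrative assumptions, not a Cognition utility.

```python
import http.client
import time
import urllib.error
import urllib.request

# URLs copied from the curl checks above.
ENDPOINTS = {
    "cognition": "http://localhost:8000/health",
    "mlflow": "http://localhost:5000/health",
    "prometheus": "http://localhost:9090/-/ready",
    "grafana": "http://localhost:3000/api/health",
}


def http_ok(url, timeout=3.0):
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_healthy(endpoints, probe=http_ok, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll until every endpoint is healthy or the attempts run out.

    Returns the set of service names that are still unhealthy
    (an empty set means everything came up).
    """
    pending = dict(endpoints)
    for attempt in range(attempts):
        # Keep only the services that still fail the probe.
        pending = {name: url for name, url in pending.items() if not probe(url)}
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay)
    return set(pending)
```

Calling `wait_healthy(ENDPOINTS)` from a deploy script and exiting non-zero when the result is not empty gives a simple gate before traffic is routed to the stack. The `probe` and `sleep` parameters exist so the loop can be exercised without a network.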

Step 5 — Database Migrations¶

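Cognition uses Alembic for schema management. In addition, both SqliteStorageBackend and PostgresStorageBackend call metadata.create_all() during initialize(), so missing tables are created automatically at startup. Note that create_all() only creates tables that do not yet exist; it does not alter existing tables, so changes to an existing schema need an explicit Alembic migration.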

For explicit migration management:

# Apply latest schema
docker compose exec cognition cognition db upgrade

# Check current revision
docker compose exec cognition cognition db current

# Create a new migration (after changing schema.py)
docker compose exec cognition cognition db migrate "description"
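
To illustrate why create-if-missing initialization and explicit migrations are both needed, here is a small self-contained sketch using the standard library's sqlite3 module. It does not use Cognition's storage backends; the `sessions` table and its columns are hypothetical. `CREATE TABLE IF NOT EXISTS` behaves like `metadata.create_all()` in the relevant respect: repeating it is harmless, but it never changes a table that already exists.

```python
import sqlite3


def initialize(conn, columns):
    """Create the (hypothetical) sessions table if it is missing."""
    cols = ", ".join(f"{name} TEXT" for name in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS sessions ({cols})")


def column_names(conn):
    """Return the column names of the sessions table."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute("PRAGMA table_info(sessions)")]


conn = sqlite3.connect(":memory:")
initialize(conn, ["id", "title"])
initialize(conn, ["id", "title"])            # second call is a no-op
initialize(conn, ["id", "title", "scope"])   # the new column is NOT added
print(column_names(conn))                    # prints ['id', 'title']
```

Adding the `scope` column to the existing table would take an `ALTER TABLE` statement, which is the kind of change a migration tool such as Alembic records and applies.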

Service Configuration Details¶

Cognition Server¶

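The cognition service in docker-compose.yml mounts:

- /var/run/docker.sock: required for the Docker sandbox backend to create containers
- ./workspace: host workspace directory mapped into the container

Because the host's Docker socket is mounted, sandbox containers are started as siblings on the host daemon rather than inside a nested Docker-in-Docker daemon. The host's Docker daemon must therefore be accessible, and the cognition user must have permission to use the socket. Access to the socket is effectively root access to the host, so grant it deliberately.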

PostgreSQL¶

postgres:
  image: postgres:16
  environment:
    POSTGRES_USER: cognition
    POSTGRES_PASSWORD: cognition
    POSTGRES_DB: cognition
  volumes:
    - pgdata:/var/lib/postgresql/data
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U cognition"]
    interval: 10s
    retries: 5

Data is persisted in the named pgdata volume. Back up this volume before upgrades.

OTel Collector¶

The collector receives OTLP gRPC on port 4317, processes the incoming telemetry, and exports it to:

- MLflow (traces, via OTLP HTTP)
- Loki (logs, via the loki exporter)

Configuration: docker/otel-collector-config.yml.

Grafana¶

Pre-built dashboards are provisioned automatically from docker/grafana/dashboards/. The Grafana admin UI is available at http://localhost:3000 (default credentials: admin/admin).

Production Hardening¶

Network Isolation¶

Sandbox containers run with --network none by default, preventing agents from accessing the internet or internal services:

COGNITION_DOCKER_NETWORK=none

If agents need internet access (e.g. for web search), create a dedicated restricted network instead of using bridge:

docker network create --driver bridge --opt com.docker.network.bridge.name=agent-net \
  --subnet 172.20.0.0/24 agent-restricted

COGNITION_DOCKER_NETWORK=agent-restricted

Resource Limits¶

Prevent runaway agent workloads from starving other services:

COGNITION_DOCKER_MEMORY_LIMIT=1g
COGNITION_DOCKER_CPU_LIMIT=2.0
COGNITION_DOCKER_TIMEOUT=300
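
A per-container limit only protects the host if the number of concurrent sandboxes is bounded as well: worst-case sandbox memory is the limit multiplied by the number of running containers. The sketch below does that arithmetic. `parse_memory` and `max_concurrent_sandboxes` are hypothetical helpers, and the suffix handling assumes Docker-style binary units (k, m, g as powers of 1024).

```python
# Docker-style size suffixes, interpreted as binary multiples (assumption).
_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_memory(limit):
    """Convert a limit such as '1g' or '512m' to bytes."""
    text = limit.strip().lower()
    if text and text[-1] in _UNITS:
        return int(float(text[:-1]) * _UNITS[text[-1]])
    return int(text)  # plain number of bytes


def max_concurrent_sandboxes(host_bytes, reserved_bytes, per_sandbox_limit):
    """How many sandboxes fit in the memory left after the reserved share."""
    available = host_bytes - reserved_bytes
    return max(available // parse_memory(per_sandbox_limit), 0)


GIB = 1024**3
# An 8 GiB host with 4 GiB reserved for the full stack and 1g per sandbox.
print(max_concurrent_sandboxes(8 * GIB, 4 * GIB, "1g"))  # prints 4
```

The 4 GiB reservation mirrors the full-stack figure from the prerequisites; substitute measured numbers for a real capacity estimate.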

Session Scoping¶

Enable multi-tenant isolation with builder-defined scope keys:

COGNITION_SCOPING_ENABLED=true
COGNITION_SCOPE_KEYS=["user", "project"]

Choose keys that match your tenancy model. Your upstream API gateway or reverse proxy must inject an X-Cognition-Scope-{key} header for each configured key, derived from your authentication layer. It should also drop any X-Cognition-Scope-* headers supplied by the client, so that callers cannot choose their own scope.
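
The header injection described above can be sketched as gateway-side logic. This is an illustration under stated assumptions, not Cognition code: `scope_headers` and `strip_client_scope` are hypothetical helpers, `identity` stands for whatever claims your authentication layer produces, and the header name simply appends the configured key to the `X-Cognition-Scope-` prefix.

```python
SCOPE_PREFIX = "X-Cognition-Scope-"


def strip_client_scope(headers):
    """Drop scope headers sent by the client so they cannot be spoofed."""
    prefix = SCOPE_PREFIX.lower()
    return {k: v for k, v in headers.items() if not k.lower().startswith(prefix)}


def scope_headers(scope_keys, identity):
    """Build one scope header per configured key from authenticated claims."""
    headers = {}
    for key in scope_keys:
        value = identity.get(key)
        if not value:
            # Refuse to forward a request that lacks a required scope value.
            raise ValueError(f"authenticated identity has no value for scope key {key!r}")
        headers[f"{SCOPE_PREFIX}{key}"] = str(value)
    return headers


def forwarded_headers(incoming, scope_keys, identity):
    """Headers to send upstream: client headers minus scope, plus trusted scope."""
    outgoing = strip_client_scope(incoming)
    outgoing.update(scope_headers(scope_keys, identity))
    return outgoing
```

With `COGNITION_SCOPE_KEYS=["user", "project"]`, a request authenticated as user `u1` in project `p9` would be forwarded with `X-Cognition-Scope-user: u1` and `X-Cognition-Scope-project: p9`, regardless of what the client sent.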

TLS / Reverse Proxy¶

Cognition does not terminate TLS. Run it behind a reverse proxy (Nginx, Caddy, AWS ALB) that handles TLS termination.

Nginx example:

location / {
    proxy_pass http://cognition:8000;
    proxy_http_version 1.1;
    proxy_set_header Connection "";     # Required for SSE
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_buffering off;                # Required for SSE
    proxy_cache off;                    # Required for SSE
    proxy_read_timeout 300s;            # Long timeout for SSE streams
}

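SSE requires proxy_buffering off. With buffering enabled, Nginx holds the upstream response until its buffers fill or the stream ends, so tokens reach clients in bursts or only after the response completes instead of as they are generated.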
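
To check end to end that events arrive incrementally through the proxy, a client has to parse the SSE wire format as lines arrive. The following is a minimal sketch of such a parser based on the standard `event:` / `data:` framing, where a blank line ends each event. The event names Cognition emits are not specified here, so the `token` event in the example is purely hypothetical.

```python
def iter_sse_events(lines):
    """Yield (event, data) tuples from an iterable of decoded text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


stream = ["event: token\n", "data: Hel\n", "\n", ": ping\n", "data: lo\n", "data: !\n", "\n"]
print(list(iter_sse_events(stream)))
# prints [('token', 'Hel'), ('message', 'lo\n!')]
```

Feeding the generator the line iterator of a streaming HTTP response and logging a timestamp per event makes buffering visible: with the proxy configured correctly the timestamps are spread out, while a buffering proxy delivers them all at once.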

CORS¶

Set specific origins in production:

COGNITION_CORS_ORIGINS=["https://app.example.com"]
COGNITION_CORS_ALLOW_CREDENTIALS=true

Secret Management¶

Never put API keys in YAML config files. Use:

- .env files (for Docker Compose; excluded from version control via .gitignore)
- Docker secrets for Swarm deployments
- AWS Secrets Manager / HashiCorp Vault for Kubernetes
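
Docker secrets are exposed to a container as files under /run/secrets, while the other options usually end up as environment variables. A small wrapper can support both. The sketch below is a hypothetical helper, not part of Cognition, and the convention of naming the secret file after the lower-cased variable name is an assumption.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", environ=os.environ):
    """Return a secret from a mounted file, falling back to the environment."""
    # Assumed convention: OPENAI_API_KEY is stored as /run/secrets/openai_api_key.
    path = Path(secrets_dir) / name.lower()
    if path.is_file():
        return path.read_text(encoding="utf-8").strip()
    value = environ.get(name)
    if value:
        return value
    raise RuntimeError(f"secret {name!r} not found in {secrets_dir} or the environment")
```

An entrypoint script could call `read_secret("OPENAI_API_KEY")` and export the result before launching the server, so the same image works under Compose, Swarm, and Kubernetes.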

Kubernetes¶

Deployment¶

The Cognition container is stateless (all state in PostgreSQL). Use a standard Deployment with horizontal scaling:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cognition
spec:
  replicas: 3
  selector:                         # Required by apps/v1; must match the pod labels
    matchLabels:
      app: cognition                # Illustrative label; use your own convention
  template:
    metadata:
      labels:
        app: cognition
    spec:
      containers:
        - name: cognition
          image: cognition:latest
          ports:
            - containerPort: 8000
            - containerPort: 9090   # Prometheus metrics
          env:
            - name: COGNITION_PERSISTENCE_BACKEND
              value: postgres
            - name: COGNITION_PERSISTENCE_URI
              valueFrom:
                secretKeyRef:
                  name: cognition-secrets
                  key: database-url

Kubernetes Sandbox Backend¶

When deploying Cognition on Kubernetes, the Docker sandbox backend does not work (the server pod runs with readOnlyRootFilesystem: true, capabilities.drop: ["ALL"], and runAsNonRoot: true). Use the kubernetes sandbox backend instead, which creates isolated sandbox pods via the agent-sandbox CRD and controller.

Prerequisites (install before deploying Cognition):

| Prerequisite | Install | Purpose |
|---|---|---|
| agent-sandbox controller (v0.3.10+) | `kubectl apply -f .../v0.3.10/manifest.yaml` | Reconciles Sandbox CRs into pods |
| agent-sandbox extensions | `kubectl apply -f .../v0.3.10/extensions.yaml` | SandboxTemplate, SandboxClaim CRDs |
| sandbox-router Deployment + Service | Deploy from agent-sandbox router | Proxies commands to sandbox pods |

These are cluster-scoped infrastructure and are not bundled in Cognition's Helm chart.

SandboxTemplate — create a CR defining the sandbox pod spec:

apiVersion: extensions.agents.x-k8s.io/v1alpha1
kind: SandboxTemplate
metadata:
  name: cognition-sandbox
  namespace: cognition
spec:
  networkPolicyManagement: Managed
  podTemplate:
    spec:
      containers:
      - name: python-runtime
        image: us-central1-docker.pkg.dev/k8s-staging-images/agent-sandbox/python-runtime-sandbox:latest-main
        ports:
        - containerPort: 8888
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          capabilities:
            drop: ["ALL"]
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: workspace
          mountPath: /workspace
      volumes:
      - name: tmp
        emptyDir:
          sizeLimit: "128Mi"
      - name: workspace
        emptyDir:
          sizeLimit: "1Gi"

The /tmp and /workspace emptyDir mounts are required. The container runs with readOnlyRootFilesystem: true, so without these writable mount points any operation that writes files, including temporary data, will fail.

Helm values — enable the K8s sandbox backend:

config:
  sandbox:
    backend: kubernetes
    k8s:
      template: cognition-sandbox
      namespace: cognition
      routerUrl: http://sandbox-router-svc.cognition.svc.cluster.local:8080
      ttl: 3600
      denyEgress: true    # Optional: deny all egress from sandbox pods

The Helm chart automatically creates the required RBAC (namespace-scoped Role for sandbox lifecycle + cluster-scoped ClusterRole for CRD reads) when backend=kubernetes.

Startup validation — Cognition checks at startup that the agent-sandbox CRDs exist and the router is reachable. If CRDs are missing, the server fails to start with a clear error message.

See Kubernetes Sandbox for architecture details, scoping labels, and the two-package design.

Health Probes¶

livenessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 30

readinessProbe:
  httpGet:
    path: /ready
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10

Monitoring¶

Prometheus Scrape Config¶

scrape_configs:
  - job_name: cognition
    static_configs:
      - targets: ["cognition:9090"]
    scrape_interval: 15s

Key Metrics to Alert On¶

| Metric | Alert Condition | Description |
|---|---|---|
| `cognition_requests_total{status=~"5.."}` | Rate > 0 sustained | Server-side errors |
| `cognition_llm_call_duration_seconds` | p99 > 30s | LLM latency degradation |
| `cognition_tool_calls_total{status="error"}` | Rate spike | Tool execution failures |
| `cognition_active_sessions` | Near `COGNITION_MAX_SESSIONS` | Session limit approaching |
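
The first two conditions in the table can be expressed as PromQL and evaluated against the Prometheus instant-query endpoint (`/api/v1/query`). The sketch below builds the query URL and checks a response payload against a threshold. The 5-minute rate window, the assumption that the latency metric is a histogram with a `_bucket` series, and the helper names are illustrative choices rather than documented Cognition behaviour.

```python
import urllib.parse

# name -> (PromQL expression, threshold); window and histogram shape are assumptions.
ALERT_QUERIES = {
    "server_errors": (
        'sum(rate(cognition_requests_total{status=~"5.."}[5m]))',
        0.0,
    ),
    "llm_p99_latency": (
        "histogram_quantile(0.99, sum by (le) "
        "(rate(cognition_llm_call_duration_seconds_bucket[5m])))",
        30.0,
    ),
}


def query_url(base_url, promql):
    """URL for a Prometheus instant query."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def max_sample(payload):
    """Largest sample in an instant-vector response, or None if it is empty."""
    # Each result entry carries "value": [timestamp, "<number as string>"].
    values = [float(sample["value"][1]) for sample in payload["data"]["result"]]
    return max(values) if values else None


def is_firing(payload, threshold):
    """True when any returned series exceeds the threshold."""
    value = max_sample(payload)
    return value is not None and value > threshold
```

In a long-running deployment the same expressions are better placed in Prometheus alerting rules with a `for:` duration, so that "sustained" is enforced by Prometheus itself; the Python form is mainly useful for ad hoc checks and smoke tests.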

Upgrading¶

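1. Pull the new image: docker pull cognition:latest (or rebuild it with docker build -t cognition:latest . if you build the image locally as in Step 2)
2. Run migrations: docker compose exec cognition cognition db upgrade
3. Rolling restart: docker compose up -d --no-deps cognition

Note that exec runs inside the container that is currently running, so step 2 applies the migrations shipped with that container's image. If the new image adds migrations, run the upgrade command again after step 3.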

The StorageBackend.initialize() call at startup is idempotent — it is safe to run against an existing database.