Storage & Execution¶

Cognition decouples where state is stored from where code runs through two independent protocol abstractions: StorageBackend (Layer 2) and the execution backends (Layer 3). Both are pluggable — swap implementations via configuration with no code changes in any layer above.

StorageBackend Protocol¶

Defined in server/app/storage/backend.py. The protocol is composed of three sub-protocols, each responsible for a distinct concern:

SessionStore¶

class SessionStore(Protocol):
    async def create_session(
        self,
        thread_id: str,
        config: SessionConfig,
        title: str | None = None,
        scopes: dict[str, str] | None = None,
        agent_name: str = "default",
        metadata: dict[str, str] | None = None,
    ) -> Session: ...

    async def get_session(self, session_id: str) -> Session | None: ...

    async def list_sessions(
        self,
        filter_scopes: dict[str, str] | None = None,
        metadata_filters: dict[str, str] | None = None,
    ) -> list[Session]: ...

    async def update_session(
        self,
        session_id: str,
        title: str | None = None,
        status: str | None = None,
        config: SessionConfig | None = None,
        agent_name: str | None = None,
        metadata: dict[str, str] | None = None,
    ) -> Session | None: ...

    async def update_message_count(self, session_id: str, delta: int) -> None: ...

    async def delete_session(self, session_id: str) -> bool: ...

MessageStore¶

class MessageStore(Protocol):
    async def create_message(self, message: Message) -> Message: ...

    async def get_message(self, message_id: str) -> Message | None: ...

    async def get_messages_by_session(
        self,
        session_id: str,
        limit: int = 50,
        offset: int = 0,
    ) -> tuple[list[Message], int]: ...  # (messages, total_count)

    async def list_messages_for_session(self, session_id: str) -> list[Message]: ...

    async def delete_messages_for_session(self, session_id: str) -> int: ...

CheckpointerStore¶

class CheckpointerStore(Protocol):
    async def get_checkpointer(self) -> BaseCheckpointSaver: ...
    async def close_checkpointer(self) -> None: ...

The checkpointer is passed to LangGraph and stores agent state at every step — enabling resumable workflows that survive server restarts.

StoreBackend (Cross-Thread Memory)¶

class StorageBackend(Protocol):
    async def get_store(self) -> BaseStore | None: ...

Each storage backend also exposes get_store(), which returns a LangGraph BaseStore instance for cross-thread persistent memory. This is separate from the checkpointer: the checkpointer stores agent graph state (messages, tool call history) per thread; the Store holds long-lived structured data that spans threads and sessions.

Implementation	Store Backend
`MemoryStorageBackend`	`InMemoryStore` (ephemeral — suitable for tests and development)
`SqliteStorageBackend`	`AsyncSqliteStore` (persisted to same database file as checkpointer)
`PostgresStorageBackend`	`AsyncPostgresStore` (separate psycopg connection to same Postgres instance)

The Store is passed to create_deep_agent() and available inside agent nodes and middleware via runtime.store. Namespace scoping (via CognitionContext.effective_scope) ensures one tenant cannot read another's stored data. See CognitionContext and Cross-Thread Memory for details.

Unified StorageBackend¶

StorageBackend combines session, message, checkpoint, and Store operations plus lifecycle methods:

class StorageBackend(SessionStore, MessageStore, CheckpointerStore, Protocol):
    async def initialize(self) -> None: ...   # Create tables, pools, migrations
    async def close(self) -> None: ...        # Drain connections, release resources
    async def health_check(self) -> dict[str, Any]: ...

ArtifactStore¶

Artifacts are durable, scope-aware files managed separately from session/message data. The ArtifactStore provides CRUD and versioning for artifacts with six types: scratch, artifact, contract, eval, memory, policy.

Key properties: - Scope-aware — artifacts are filtered by effective_scope on every read - Versioned — content changes automatically increment the version number - Type-safe — artifact types control lifecycle semantics - Visibility-controlled — private, run, or public visibility levels

Artifacts are accessible via GET/POST/PUT/DELETE /artifacts and are exposed to agents through the tool system.

Message Projection Recovery¶

The LangGraph checkpoint is the authoritative record of runtime conversation state. The messages table is a read-optimized projection used by the API. Storage backends therefore support rebuilding the message projection from checkpoint messages when API-visible message rows drift or must be recovered after an interrupted write path.

This lets Cognition repair user-visible session history without treating the messages table as the source of truth for runtime continuity.

Storage Implementations¶

server/app/storage/factory.py — create_storage_backend(settings) creates the backend:

match settings.persistence_backend:
    case "sqlite":   return SqliteStorageBackend(settings)
    case "postgres": return PostgresStorageBackend(settings)
    case "memory":   return MemoryStorageBackend(settings)
    case _:          raise StorageBackendError(f"Unknown backend: ...")

No silent fallback. An unrecognised COGNITION_PERSISTENCE_BACKEND value raises immediately at startup.

SQLite (`server/app/storage/sqlite.py`)¶

Default development backend.

Async I/O via aiosqlite
LangGraph checkpoints via AsyncSqliteSaver
Database path resolved relative to workspace if not absolute
Parent directories created automatically
Suitable for single-node deployments; not safe for concurrent multi-process access

Configuration:

COGNITION_PERSISTENCE_BACKEND=sqlite
COGNITION_PERSISTENCE_URI=.cognition/state.db

PostgreSQL (`server/app/storage/postgres.py`)¶

Production backend for multi-node or high-availability deployments.

Async I/O via asyncpg connection pool (default: 1–10 connections)
LangGraph checkpoints via AsyncPostgresSaver
Schema managed by Alembic migrations
DSN normalisation: postgresql+asyncpg:// → postgresql:// for asyncpg compatibility

Configuration:

COGNITION_PERSISTENCE_BACKEND=postgres
COGNITION_PERSISTENCE_URI=postgresql://user:pass@host:5432/cognition

Memory (`server/app/storage/memory.py`)¶

In-process dict-backed store used in unit tests.

Zero dependencies
State lost on process exit
Fastest possible; no I/O overhead

Configuration:

COGNITION_PERSISTENCE_BACKEND=memory

ExecutionBackend¶

Code execution is isolated from the server process. Cognition uses two backend types, both ultimately relying on DockerExecutionBackend for hard isolation.

DockerExecutionBackend (`server/app/execution/backend.py`)¶

Runs commands in a Docker container with full kernel-level isolation:

Security Control	Value
Linux capabilities	All dropped (`cap_drop: ALL`)
Privilege escalation	Blocked (`no-new-privileges: true`)
Root filesystem	Read-only (`read_only: true`)
Writable paths	`/tmp` and `/home` via `tmpfs` mounts
Network	Configurable; `none` by default
Memory limit	Configurable (default: `512m`)
CPU limit	Configurable (default: `1.0` core)

Container lifecycle: the backend checks for an existing running container for the session before creating a new one. Containers are reused within a session for performance. Command output is truncated at 100 KB.

Sandbox Backends (`server/app/agent/sandbox_backend.py`)¶

The two sandbox backends are Cognition's concrete wrappers around the execution abstraction:

CognitionLocalSandboxBackend¶

Commands execute in the local process under the server's user.

Command parsing with shlex.split() + subprocess with shell=False — no shell injection possible
Protected paths list (.cognition/ by default): write operations that target protected paths are blocked before execution
File operations operate directly on the host filesystem
No process isolation from the Cognition server process

Best for: local development, trusted codebases, CI pipelines.

COGNITION_SANDBOX_BACKEND=local

CognitionDockerSandboxBackend¶

File operations run directly on the host filesystem (for performance); command execution is routed through DockerExecutionBackend (for isolation).

Each session gets its own container (lazy creation on first command)
Container is reused for the session lifetime
Requires Docker daemon and cognition-sandbox:latest image
host_workspace setting maps the workspace path into the container

Best for: production, multi-tenant deployments, any untrusted code.

COGNITION_SANDBOX_BACKEND=docker
COGNITION_DOCKER_IMAGE=cognition-sandbox:latest
COGNITION_DOCKER_NETWORK=none
COGNITION_DOCKER_MEMORY_LIMIT=512m
COGNITION_DOCKER_CPU_LIMIT=1.0
COGNITION_DOCKER_TIMEOUT=300

Factory¶

from server.app.agent.sandbox_backend import create_sandbox_backend

backend = create_sandbox_backend(settings)
# Returns CognitionLocalSandboxBackend or CognitionDockerSandboxBackend

How Storage and Execution Compose¶

A session involves both layers simultaneously:

Client sends message
        │
        ▼
Layer 6: API persists user message in StorageBackend
        │
        ▼
Layer 4: AgentRuntime streams events
        │
        ├── Tool call: bash("ls -la")
        │       └── Layer 3: SandboxBackend.execute("ls -la")
        │               └── Returns ExecutionResult(output, exit_code)
        │
        └── Stream complete (done event)
                └── Layer 6: API persists assistant message in StorageBackend

The storage and execution backends never call each other. Composition happens only at Layer 4 and Layer 6 — the correct level in the dependency hierarchy.

Built-in Tools¶

Beyond the sandbox backends, the agent has three built-in tools provided by server/app/agent/tools.py:

Tool	Class	Description
`browser`	`BrowserTool`	Fetch web pages as text, markdown, or HTML via `httpx`
`search`	`SearchTool`	DuckDuckGo web search, returns titles, links, and snippets
`inspect_package`	`InspectPackageTool`	Inspect Python packages: list submodules, classes, and functions

These run in the local process (not inside the Docker sandbox) and are always available regardless of sandbox_backend setting.

Circuit Breaker (`server/app/execution/circuit_breaker.py`)¶

The circuit breaker protects downstream services from cascading failures. It is used by the execution layer for Docker container management.

States:

CLOSED ──[failures ≥ threshold]──► OPEN
  ▲                                   │
  │                          [timeout expires]
  │                                   ▼
  └──[successes ≥ threshold]── HALF_OPEN

Default configuration: - failure_threshold: 5 consecutive failures to open - success_threshold: 3 consecutive successes to close from half-open - timeout_seconds: 60 s in open state before transitioning to half-open - half_open_max_calls: 3 test calls allowed in half-open state

Circuit breaker status is reported in /health:

{
  "circuit_breakers": {
    "openai": {
      "state": "closed",
      "total_calls": 142,
      "failed_calls": 1,
      "consecutive_failures": 0
    }
  }
}

Storage & Execution¶

StorageBackend Protocol¶

SessionStore¶

MessageStore¶

CheckpointerStore¶

StoreBackend (Cross-Thread Memory)¶

Unified StorageBackend¶

ArtifactStore¶

Message Projection Recovery¶

Storage Implementations¶

SQLite (server/app/storage/sqlite.py)¶

PostgreSQL (server/app/storage/postgres.py)¶

Memory (server/app/storage/memory.py)¶

ExecutionBackend¶

DockerExecutionBackend (server/app/execution/backend.py)¶

Sandbox Backends (server/app/agent/sandbox_backend.py)¶

CognitionLocalSandboxBackend¶

CognitionDockerSandboxBackend¶

Factory¶

How Storage and Execution Compose¶

Built-in Tools¶

Circuit Breaker (server/app/execution/circuit_breaker.py)¶

SQLite (`server/app/storage/sqlite.py`)¶

PostgreSQL (`server/app/storage/postgres.py`)¶

Memory (`server/app/storage/memory.py`)¶

DockerExecutionBackend (`server/app/execution/backend.py`)¶

Sandbox Backends (`server/app/agent/sandbox_backend.py`)¶

Circuit Breaker (`server/app/execution/circuit_breaker.py`)¶