Extending Agents
Cognition uses a convention-over-configuration model. Most extensions require zero code changes — drop a file in the right directory and the server picks it up automatically (via the file watcher). More powerful extensions require Python.
| Level | Mechanism | Code Required | Hot-Reload |
|---|---|---|---|
| Memory | AGENTS.md | No | Yes |
| Skills | .cognition/skills/ SKILL.md files | No | Yes |
| Agents | .cognition/agents/ YAML or Markdown | No | Yes |
| Tools | Python functions | Yes | Yes |
| Middleware | Python classes | Yes | No |
| MCP servers | Remote HTTP/SSE endpoints | No | Yes |
| A2A exposure | a2a_exposed: true on agent definition | No | Yes |
| Custom LLM providers | Python factories | Yes | No |
1. Memory (AGENTS.md)
Place an AGENTS.md file in your project root. It is automatically injected into the agent's system prompt for every session in that project.
Use memory for:
- Project-specific rules and conventions
- Architecture decisions
- Code style guidelines
- Workflow instructions
# My Project
This is a Django REST API. All models live in `myapp/models/`.
Use Python 3.11 type hints everywhere. Tests run with pytest.
The database is PostgreSQL — never use SQLite in tests.
## Conventions
- Prefer `select_related` over multiple queries
- All API views inherit from `BaseAPIView`
- Migrations must be reviewed before merging
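As a rough illustration of the injection described above, the sketch below appends a project's AGENTS.md to a base system prompt. The function name `build_system_prompt` and the "Project memory" separator are hypothetical; Cognition's actual prompt assembly is not shown in this document and may differ.

```python
from pathlib import Path


def build_system_prompt(base_prompt: str, project_root: Path) -> str:
    """Append the project's AGENTS.md (if present) to a base system prompt."""
    memory_file = project_root / "AGENTS.md"
    if not memory_file.is_file():
        # No memory file: the base prompt is used unchanged.
        return base_prompt
    memory = memory_file.read_text(encoding="utf-8").strip()
    # Hypothetical layout: base prompt first, then the project memory.
    return f"{base_prompt}\n\n# Project memory (AGENTS.md)\n\n{memory}"
```

Because the file is read each time the prompt is built, edits to AGENTS.md would show up in the next session without a restart under this sketch.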
Configure which memory files to load in .cognition/config.yaml:
2. Skills (SKILL.md)
Skills are modular instruction sets for domain-specific tasks. The agent sees a skill's name and description and loads the full content only when it is relevant to the current task (progressive disclosure).
Directory Structure
.cognition/skills/
  deploy-app/
    SKILL.md          # instructions for deploying the application
    references/       # optional supporting files
      checklist.md
  run-migrations/
    SKILL.md
SKILL.md Format
# Deploy App
Use this skill when the user asks to deploy the application or push changes to production.
## Prerequisites
- Docker must be running
- AWS credentials must be configured
## Steps
1. Run the test suite: `uv run pytest`
2. Build the Docker image: `docker build -t myapp:latest .`
3. Push to ECR: `docker push <account>.dkr.ecr.us-east-1.amazonaws.com/myapp:latest`
4. Update the ECS service: `aws ecs update-service --cluster prod --service myapp --force-new-deployment`
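Progressive disclosure can be pictured as a two-step lookup: a cheap summary that is always visible, and a full body read only on demand. The sketch below assumes the skill name comes from the first `# ` heading and the description from the first non-heading line, as in the example above; `SkillSummary`, `summarize_skill`, and `load_skill_body` are hypothetical names, not Cognition's API.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillSummary:
    name: str
    description: str
    path: Path


def summarize_skill(skill_file: Path) -> SkillSummary:
    """Extract a short name/description pair without keeping the full body."""
    lines = [ln.strip() for ln in skill_file.read_text(encoding="utf-8").splitlines()]
    # Assumption: the first "# " heading is the skill name; fall back to the directory name.
    name = next((ln[2:].strip() for ln in lines if ln.startswith("# ")), skill_file.parent.name)
    # Assumption: the first non-empty, non-heading line is the description.
    description = next((ln for ln in lines if ln and not ln.startswith("#")), "")
    return SkillSummary(name, description, skill_file)


def load_skill_body(summary: SkillSummary) -> str:
    """Read the full instructions only when the skill is actually needed."""
    return summary.path.read_text(encoding="utf-8")
```

Only the summaries would need to sit in the agent's context; the full steps are fetched once a task matches the description.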
Attach skills by registry name (seeded from skill_sources directories):
3. Custom Agents
Place agent definitions in .cognition/agents/ as Markdown or YAML files. The file watcher reloads them automatically on change.
Markdown Format
The filename (without extension) becomes the agent name. The YAML frontmatter provides fields; the Markdown body becomes the system prompt.
---
# .cognition/agents/security-auditor.md
mode: subagent
description: Audits code for security vulnerabilities and reports findings with severity ratings
tools:
  - "run_semgrep"
config:
  model: gpt-4o
  temperature: 0.1
---
You are a security expert specialising in Python web applications.
When asked to audit code:
1. Check for SQL injection, XSS, CSRF, and path traversal vulnerabilities
2. Review dependency versions for known CVEs
3. Report findings with severity (Critical/High/Medium/Low) and remediation steps
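The split between frontmatter and body can be sketched with the standard library alone. The helper below is illustrative (the name `split_agent_markdown` is hypothetical) and returns the frontmatter as raw text; a real loader would presumably hand that text to a YAML parser.

```python
from pathlib import Path


def split_agent_markdown(path: Path) -> tuple[str, str, str]:
    """Return (agent_name, raw_frontmatter, system_prompt) for a Markdown agent file."""
    lines = path.read_text(encoding="utf-8").splitlines()
    frontmatter, body = [], lines
    if lines and lines[0].strip() == "---":
        # Find the closing '---' delimiter after the opening one.
        end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
        if end is not None:
            frontmatter, body = lines[1:end], lines[end + 1:]
    # The filename without its extension is the agent name.
    return path.stem, "\n".join(frontmatter), "\n".join(body).strip()
```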
YAML Format
# .cognition/agents/data-analyst.yaml
name: data-analyst
mode: primary
description: Analyses datasets and generates statistical reports
system_prompt: |
  You are a data analyst. Use pandas and matplotlib for analysis.
  Always validate data quality before drawing conclusions.
tools:
  - "load_csv"
  - "plot_chart"
config:
  model: gpt-4o
  temperature: 0.2
Agent Modes
| Mode | Can own a session | Can be delegated to |
|---|---|---|
| primary | Yes | No |
| subagent | No | Yes |
| all | Yes | Yes |
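The mode table can be restated as two predicates. This is only a sketch of the rules in the table; the `AgentMode` enum and function names are hypothetical and not taken from Cognition's code.

```python
from enum import Enum


class AgentMode(str, Enum):
    PRIMARY = "primary"
    SUBAGENT = "subagent"
    ALL = "all"


def can_own_session(mode: AgentMode) -> bool:
    """True for modes that may be named in a session's agent_name."""
    return mode in (AgentMode.PRIMARY, AgentMode.ALL)


def can_be_delegated_to(mode: AgentMode) -> bool:
    """True for modes that another agent may delegate work to."""
    return mode in (AgentMode.SUBAGENT, AgentMode.ALL)
```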
Sessions are created with agent_name:
curl -X POST http://localhost:8000/sessions \
-H "Content-Type: application/json" \
-d '{"agent_name": "data-analyst"}'
Primary agents can delegate to subagents via the task tool. The delegation appears as a delegation SSE event.
4. Custom Tools
Tools are Python callables that the agent can invoke. Cognition converts them to LangChain tools automatically.
Simple Function Tool
# myapp/tools/analysis.py
import subprocess


def run_linter(file_path: str) -> str:
    """Run ruff linter on a Python file and return the findings.

    Args:
        file_path: Path to the Python file to lint.

    Returns:
        Linter output as a string.
    """
    result = subprocess.run(
        ["ruff", "check", file_path],
        capture_output=True,
        text=True,
    )
    return result.stdout or "No issues found."
The docstring becomes the tool description shown to the agent. Type annotations become the argument schema.
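To see what information a function carries for this conversion, the sketch below pulls the description and argument types out with the standard library. It is a stand-in for illustration only: `describe_tool` is a hypothetical helper, and the real conversion is done by LangChain's tooling, which produces a richer schema.

```python
import inspect
from typing import get_type_hints


def describe_tool(func) -> dict:
    """Build a rough tool description from a function's docstring and annotations."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not an argument
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # Use the first docstring line as the short description.
        "description": doc.splitlines()[0] if doc else "",
        "args": {name: getattr(tp, "__name__", str(tp)) for name, tp in hints.items()},
    }


def run_linter(file_path: str) -> str:
    """Run ruff linter on a Python file and return the findings."""
    return "No issues found."


print(describe_tool(run_linter))
```

A function without a docstring or annotations yields an empty description and no arguments here, which is a reasonable hint that the agent would have little to go on when deciding how to call it.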
Register via Config
# .cognition/config.yaml
skill_sources:
  - .cognition/skills/
tool_sources:
  - .cognition/tools/
agent:
  tools:
    - "run_linter"
    - "query_database"
Auto-Discovery
Drop Python files into .cognition/tools/ and they are discovered automatically. Each public function in the file becomes a tool. The file watcher reloads them on change.
# .cognition/tools/my_tools.py
def fetch_ticket(ticket_id: str) -> str:
    """Fetch a Jira ticket by ID and return its summary and status."""
    ...


def post_comment(ticket_id: str, comment: str) -> str:
    """Post a comment to a Jira ticket."""
    ...
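Discovery of "each public function in the file" can be approximated with `importlib` and `inspect`. The sketch below is an assumption about how such a scan might work, not Cognition's implementation; `discover_tools` is a hypothetical name, and "public" is taken to mean a module-level function whose name does not start with an underscore.

```python
import importlib.util
import inspect
from pathlib import Path


def discover_tools(py_file: Path) -> dict:
    """Load a Python file and return its public, module-level functions by name."""
    spec = importlib.util.spec_from_file_location(py_file.stem, py_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the file's top-level code
    return {
        name: obj
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        # Skip private helpers and functions merely imported from other modules.
        if not name.startswith("_") and obj.__module__ == module.__name__
    }
```

Under this reading, prefixing a helper with an underscore would keep it out of the agent's tool list.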
Register via API (Source-in-DB)
When Cognition runs in a separate container from your builder application, you cannot write files into .cognition/tools/. Use the REST API to register tools with inline Python source code instead:
curl -X POST http://localhost:8000/tools \
-H "Content-Type: application/json" \
-d '{
"name": "search-jira",
"code": "from langchain_core.tools import tool\nimport httpx\n\n@tool\ndef search_jira(query: str) -> str:\n \"\"\"Search Jira issues by query string.\"\"\"\n resp = httpx.get(f\"https://jira.example.com/search?q={query}\")\n return resp.text"
}'
The tool is stored in the ConfigRegistry (Postgres or SQLite) and loaded on every agent invocation — no restart required.
# The code field contains a complete Python module as a string.
# @tool-decorated functions and BaseTool subclasses are extracted automatically.
code = """
from langchain_core.tools import tool

@tool
def search_jira(query: str) -> str:
    \"\"\"Search Jira issues by query string.\"\"\"
    import httpx
    resp = httpx.get(f"https://jira.example.com/search?q={query}")
    return resp.text
"""

import httpx

httpx.post("http://localhost:8000/tools", json={"name": "search-jira", "code": code})
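Since the source is shipped as a plain string, a client may want to check it before posting. The sketch below is an optional client-side pre-check, not part of Cognition: it parses the string with `ast` (without executing or importing anything) and lists top-level functions decorated with `@tool`. The helper name `find_tool_functions` is hypothetical, and `BaseTool` subclasses are not covered.

```python
import ast


def find_tool_functions(code: str) -> list[str]:
    """Return names of top-level functions decorated with `@tool` in a source string."""
    tree = ast.parse(code)  # raises SyntaxError if the module does not parse
    names = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for dec in node.decorator_list:
                # Handle both `@tool` and `@tool(...)`.
                target = dec.func if isinstance(dec, ast.Call) else dec
                if isinstance(target, ast.Name) and target.id == "tool":
                    names.append(node.name)
                    break
    return names
```

An empty result or a `SyntaxError` is a cheap signal that the payload would not yield a usable tool.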
Security: Tool code executes with full Python privileges inside the sandbox backend. Restrict POST /tools to authorized administrators at your Gateway/proxy layer.
Alternatively, register by module path if the module is already importable in the server's Python environment:
curl -X POST http://localhost:8000/tools \
-H "Content-Type: application/json" \
-d '{"name": "jira-tools", "path": "mycompany.cognition_tools.jira"}'
To see all registered tools (both file-discovered and API-registered):
Response includes a source_type field: "file" for auto-discovered tools, "api_code" for source-in-DB tools, and "api_path" for module-path tools.
Async Tools
Async functions are supported natively:
import httpx


async def call_api(endpoint: str, payload: dict) -> str:
    """Call an internal API endpoint with a JSON payload."""
    async with httpx.AsyncClient() as client:
        response = await client.post(endpoint, json=payload)
        return response.text
Programmatic Registration
from server.app.agent.cognition_agent import create_cognition_agent
from server.app.agent.definition import AgentDefinition

definition = AgentDefinition(
    name="my-agent",
    system_prompt="You are a helpful assistant.",
    tools=["run_linter"],
)
# Run this inside an async function; `settings` is the server settings object, not defined in this snippet.
agent = await create_cognition_agent(definition, settings)
Testing Tools
# tests/unit/test_my_tools.py
from unittest.mock import MagicMock, patch

from myapp.tools.analysis import run_linter


def test_run_linter_clean_file():
    with patch("subprocess.run") as mock_run:
        mock_run.return_value = MagicMock(stdout="", returncode=0)
        result = run_linter("clean.py")
    assert result == "No issues found."


def test_run_linter_with_issues():
    with patch("subprocess.run") as mock_run:
        mock_run.return_value = MagicMock(
            stdout="clean.py:1:1: E501 Line too long", returncode=1
        )
        result = run_linter("messy.py")
    assert "E501" in result
5. Middleware
Middleware intercepts the agent's processing loop. Use middleware for cross-cutting concerns: approval gates, custom telemetry, PII detection, retry logic.
Upstream Middleware (No Code)
Four upstream middleware components are available by name in agent.middleware:
agent:
  middleware:
    # Retry failed tool calls with exponential backoff
    - name: tool_retry
      max_retries: 3
      backoff_factor: 2.0
    # Hard cap on total tool invocations
    - name: tool_call_limit
      run_limit: 50
      per_tool_limits:
        execute_bash: 10
    # Detect and redact PII before sending to the LLM
    - name: pii
      pii_types:
        - email
        - phone
        - credit_card
        - ip
        - ssn
      strategy: redact  # or "mask"
    # Require human approval before specific tools execute
    - name: human_in_the_loop
      approve_tools:
        - execute_bash
        - file_write
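To make the `tool_retry` settings concrete, the sketch below computes a retry schedule under one common reading of exponential backoff: each wait is the previous one multiplied by `backoff_factor`. The starting delay (`initial_delay`) and the exact formula are assumptions; the upstream middleware may use a different base delay, a cap, or jitter.

```python
def backoff_delays(max_retries: int, backoff_factor: float, initial_delay: float = 1.0) -> list[float]:
    """Delay in seconds before each retry, growing geometrically.

    `initial_delay` is a hypothetical parameter, not a documented config key.
    """
    return [initial_delay * backoff_factor ** attempt for attempt in range(max_retries)]


# With the values from the config above: waits of 1s, 2s, then 4s.
print(backoff_delays(max_retries=3, backoff_factor=2.0))
```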
Custom Middleware
Implement deepagents.middleware.AgentMiddleware and register it in .cognition/config.yaml as a dotted import path.
# myapp/middleware/audit.py
from deepagents.middleware import AgentMiddleware
from myapp.audit_log import write_audit_event


class AuditMiddleware(AgentMiddleware):
    """Writes every tool call to an immutable audit log."""

    async def awrap_tool_call(self, tool_call, handler):
        # Called before the tool executes
        write_audit_event(
            event_type="tool_call",
            tool=tool_call.name,
            args=tool_call.args,
        )
        result = await handler(tool_call)
        # Called after the tool executes
        write_audit_event(
            event_type="tool_result",
            tool=tool_call.name,
            exit_code=result.exit_code,
        )
        return result
Register in config:
String entries are imported directly; dict entries with a name key are treated as upstream middleware.
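The string-versus-dict rule can be sketched as a small dispatcher. This is an illustration only: `resolve_middleware_entry` is a hypothetical name, the `module.path.ClassName` form of the dotted path is an assumption, and the upstream branch simply returns the name and options rather than constructing a real middleware object.

```python
import importlib
from typing import Any, Union


def resolve_middleware_entry(entry: Union[str, dict]) -> tuple[str, Any]:
    """Classify one agent.middleware entry as custom (import path) or upstream (by name)."""
    if isinstance(entry, str):
        # Assumed form: "package.module.ClassName".
        module_path, _, attr = entry.rpartition(".")
        if not module_path:
            raise ValueError(f"expected a dotted import path, got {entry!r}")
        return "custom", getattr(importlib.import_module(module_path), attr)
    if isinstance(entry, dict) and "name" in entry:
        # Everything except "name" is passed along as options.
        options = {k: v for k, v in entry.items() if k != "name"}
        return "upstream", (entry["name"], options)
    raise ValueError(f"unsupported middleware entry: {entry!r}")
```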
6. MCP Tool Servers
Connect to any remote Model Context Protocol (MCP) server. MCP servers expose tools over HTTP.
# .cognition/config.yaml
mcp:
  servers:
    - name: github-tools
      url: https://mcp.github.example.com/sse
      transport: sse  # or "streamable_http"
    - name: internal-db
      url: http://db-tools.internal:8080/sse
All tools exposed by the MCP server become available to the agent under the server name as a namespace prefix (e.g. github-tools/create_pr). The tool_name_prefix=True setting on MultiServerMCPClient prevents tool name collisions between servers.
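The effect of the namespace prefix is easiest to see with two servers that expose a tool of the same name. The helper below is a toy illustration (`namespace_tools` is hypothetical); it follows the `server/tool` form used in the example above, and the actual prefixing is done by the MCP client.

```python
def namespace_tools(servers: dict[str, list[str]]) -> list[str]:
    """Prefix every tool name with the name of the server that exposes it."""
    return [f"{server}/{tool}" for server, tools in servers.items() for tool in tools]


# Both servers expose "search", but the prefixed names stay distinct.
print(namespace_tools({"github-tools": ["create_pr", "search"], "internal-db": ["search"]}))
```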
How It Works
Cognition uses langchain-mcp-adapters to connect to MCP servers. The adapter:
- Connects to each configured remote server using SSE or Streamable HTTP transport
- Converts MCP tools into LangChain BaseTool instances
- Applies a scope injection interceptor that adds X-Cognition-Scope-* headers to every MCP request
- Registers progress and logging callbacks for observability
- Returns tools that participate in the full Deep Agents middleware stack (tool safety, HITL, permissions)
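The scope injection step amounts to turning the session's scope into request headers. The sketch below assumes a flat dict of single-word scope keys and the `X-Cognition-Scope-User` naming seen elsewhere in this document; `scope_headers` is a hypothetical helper and the real interceptor's key handling may differ.

```python
def scope_headers(scope: dict[str, str]) -> dict[str, str]:
    """Map scope keys to X-Cognition-Scope-* headers, e.g. user -> X-Cognition-Scope-User."""
    return {f"X-Cognition-Scope-{key.capitalize()}": value for key, value in scope.items()}


print(scope_headers({"user": "alice"}))
```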
Transport Options
| Transport | Description |
|---|---|
| sse | Server-Sent Events (default). Best for long-lived connections. |
| streamable_http | HTTP with streaming. Best for serverless or short-lived connections. |
Managing MCP Servers via API
MCP servers can also be managed at runtime via the REST API:
# List registered servers
curl http://localhost:8000/mcp-servers
# Register a new server
curl -X POST http://localhost:8000/mcp-servers \
-H "Content-Type: application/json" \
-d '{"name": "my-tools", "url": "https://tools.example.com/sse", "transport": "sse"}'
# Update a server
curl -X PATCH http://localhost:8000/mcp-servers/my-tools \
-H "Content-Type: application/json" \
-d '{"enabled": false}'
# Delete a server
curl -X DELETE http://localhost:8000/mcp-servers/my-tools
Note: MCP server headers (containing credentials) are redacted in API responses: GET /mcp-servers returns an empty headers dict to prevent credential leakage. File-managed servers (from .cognition/config.yaml) are read-only via the API; mutation attempts return 409 Conflict.
Only HTTP/HTTPS URLs are accepted — stdio-based MCP servers are not supported for security reasons.
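A client can apply the same HTTP/HTTPS rule before registering a server. The check below is a client-side sketch using `urllib.parse`; `is_allowed_mcp_url` is a hypothetical helper, and the server's own validation may be stricter.

```python
from urllib.parse import urlparse


def is_allowed_mcp_url(url: str) -> bool:
    """Accept only http(s) URLs with a host; reject command strings and other schemes."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```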
7. Exposing Agents via A2A
Cognition can expose agents via the Agent-to-Agent (A2A) protocol, allowing external systems to discover and invoke your agents.
Opting In
Set a2a_exposed: true on any agent definition:
# .cognition/agents/deploy-agent.yaml
name: deploy-agent
mode: primary
a2a_exposed: true
description: Handles deployment workflows
system_prompt: |
  You are a deployment agent. Deploy applications safely and report results.
Or via the API:
curl -X POST http://localhost:8000/agents \
-H "Content-Type: application/json" \
-d '{"name": "deploy-agent", "system_prompt": "...", "a2a_exposed": true}'
How It Works
- Agent card discovery: GET /a2a/{agent_name}/.well-known/agent-card.json returns the A2A AgentCard for that agent. GET /.well-known/agent-card.json also lists exposed agents visible to the caller's scope.
- JSON-RPC endpoint: Each agent gets a dedicated endpoint at POST /a2a/{agent_name}. The agent is resolved at request time, so agents created after server startup are immediately available.
- Scope-aware: A2A requests must include X-Cognition-Scope-* headers. Only agents visible in the caller's scope are discoverable and invocable.
- Bridging: The CognitionA2AExecutor bridges A2A requests to Cognition's service.stream_response(), reusing the full agent runtime, tools, middleware, and persistence.
A2A Client Example
import httpx

# Discover a specific agent card from the agent's canonical A2A URL
resp = httpx.get(
    "http://localhost:8000/a2a/deploy-agent/.well-known/agent-card.json",
    headers={"X-Cognition-Scope-User": "alice"},
)
card = resp.json()

# Send a message to an agent
resp = httpx.post(
    "http://localhost:8000/a2a/deploy-agent",
    json={
        "jsonrpc": "2.0",
        "id": "1",
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": "Deploy staging"}],
            }
        },
    },
    headers={
        "X-Cognition-Scope-User": "alice",
    },
)
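When sending more than one message, it can help to build the JSON-RPC envelope in one place. The helper below mirrors the payload shape from the example above and nothing more; `build_message_send` is a hypothetical name, and generating the request `id` with `uuid` is a convenience choice rather than a requirement stated here.

```python
import uuid
from typing import Optional


def build_message_send(text: str, request_id: Optional[str] = None) -> dict:
    """Build a JSON-RPC 2.0 `message/send` request carrying a single text part."""
    return {
        "jsonrpc": "2.0",
        # Each request gets its own id so responses can be matched to requests.
        "id": request_id or str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            }
        },
    }
```

The result can be passed directly as the `json=` argument of an `httpx.post` call to the agent's endpoint.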
Constraints
- Built-in agents (default, readonly) have a2a_exposed=False by default
- Only primary and all mode agents can be exposed via A2A
- If A2A-Version is omitted, Cognition treats the request as the current supported A2A version
- Dynamically registered agents are scope-bound. Use the same X-Cognition-Scope-* headers when creating, discovering, and invoking an agent.
- A2A does not add any additional services; its endpoints are part of the main Cognition server
For full A2A protocol details, see the A2A SDK documentation.
8. Custom LLM Providers
Cognition uses LangChain's init_chat_model() under the hood, which supports any provider that has a LangChain integration. The built-in provider types are:
| Type | LangChain Package | Credentials |
|---|---|---|
| openai | langchain-openai | OPENAI_API_KEY |
| anthropic | langchain-anthropic | ANTHROPIC_API_KEY |
| bedrock | langchain-aws | AWS IAM credentials |
| google_genai | langchain-google-genai | GOOGLE_API_KEY |
| google_vertexai | langchain-google-vertexai | Google ADC |
| openai_compatible | langchain-openai + custom base_url | COGNITION_OPENAI_COMPATIBLE_API_KEY |
To add a provider, create a ProviderConfig entry via the REST API:
curl -X POST http://localhost:8000/models/providers \
-H "Content-Type: application/json" \
-d '{
"id": "my-provider",
"provider": "openai_compatible",
"model": "my-model",
"base_url": "https://my-provider.example.com/v1",
"api_key_env": "MY_PROVIDER_API_KEY",
"enabled": true,
"priority": 0
}'
Or define it in .cognition/config.yaml (bootstrapped on first startup):
llm:
  provider: openai_compatible
  model: my-model
  base_url: https://my-provider.example.com/v1
  api_key_env: MY_PROVIDER_API_KEY
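The `api_key_env` field names an environment variable rather than holding the key itself. A minimal sketch of that indirection, assuming the provider entry is available as a plain dict (`resolve_api_key` is a hypothetical helper, not Cognition's API):

```python
import os


def resolve_api_key(provider_config: dict) -> str:
    """Look up the API key in the environment variable named by `api_key_env`."""
    env_name = provider_config["api_key_env"]
    key = os.environ.get(env_name)
    if not key:
        # Fail early with the variable name so misconfiguration is easy to spot.
        raise RuntimeError(f"environment variable {env_name} is not set")
    return key
```

Keeping only the variable name in config means the secret never lands in `.cognition/config.yaml` or in the stored provider entry.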
Test connectivity:
For providers not supported by init_chat_model, wrap them in a LangChain BaseChatModel and use openai_compatible with a local proxy, or contribute a LangChain integration upstream.
Hot-Reload
The file watcher (server/app/file_watcher.py) monitors .cognition/tools/, .cognition/middleware/, and .cognition/agents/ using watchdog. When any file in these directories changes:
- Tool registry is reloaded (new tools available, removed tools gone)
- Agent definition registry is reloaded (new/updated agents loaded)
- Agent cache is invalidated so the next session uses the updated definition
No server restart required. Changes typically take effect within 1 second.
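The reload trigger boils down to noticing that something under a watched directory changed. The sketch below is a polling stand-in built on file modification times, not the event-based `watchdog` mechanism the file watcher uses; `snapshot` and `changed_paths` are hypothetical helpers for illustration.

```python
from pathlib import Path


def snapshot(directory: Path) -> dict[str, float]:
    """Map each file under `directory` to its last-modified time."""
    return {str(p): p.stat().st_mtime for p in directory.rglob("*") if p.is_file()}


def changed_paths(before: dict[str, float], after: dict[str, float]) -> set[str]:
    """Paths that were added, removed, or modified between two snapshots."""
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}
```

A non-empty result from `changed_paths` corresponds to the point where the registries would be reloaded and the agent cache invalidated.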