Shared context for AI agents
Push large payloads to the edge. Agents on different LLMs pull them by URL. Pub/sub channels coordinate them in ~20 ms.
Open-source (MIT) or managed cloud — same SDK either way.
Two ways to run it
Same SDK, same protocol, same code. Pick the deployment that fits.
Self-host (open source)
Clone the repo, deploy the Worker to your own Cloudflare account. You own the data, the keys, the operational surface — and pay only for Cloudflare usage. Cloudflare's free tier covers 100K requests/day.
- ✓ MIT license, no vendor lock-in
- ✓ Runs on your Cloudflare account
- ✓ No API keys, no quota
- ✓ You handle ops
Managed cloud
We run the Worker for you on the global edge. You get an API key, a dashboard, metered usage, quotas, and zero ops. Same SDK as the open-source path — just point it at the managed URL (see the sketch after this list).
- ✓ Free tier: 1K pushes / 10K pulls per month
- ✓ Dashboard for keys and usage
- ✓ Quota enforcement and billing
- ✓ We handle ops
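Whichever path you pick, the client code is identical; only the endpoint changes. A minimal sketch — the `base_url` parameter name here is an assumption for illustration, not a confirmed SDK signature:

```python
from contextrelay import ContextRelay

# Managed cloud: authenticate with the API key from your dashboard.
relay = ContextRelay(api_key="cr_live_...")

# Self-host: point the same client at your own Worker instead.
# ("base_url" is a hypothetical parameter name for this sketch;
#  the self-hosted path needs no API key.)
relay = ContextRelay(base_url="https://contextrelay.your-team.workers.dev")

relay.push("payload", channel="handoff")  # identical calls either way
```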
Why now
Multi-agent stacks are exiting the demo phase. Teams ship production fleets on CrewAI, LangGraph, and AutoGen, and wire Claude, GPT, Mistral, and Gemini into the same pipeline. The bottleneck used to be reasoning — now it's moving bytes between agents.
Every framework reinvents this layer with custom Redis schemas, scratch files, or 50 KB-per-prompt token taxes. ContextRelay is the layer underneath: ephemeral storage, pub/sub channels, optional E2EE — provider-agnostic and open-source.
How it works
Three lines replace a token tax. Works across any agent, any LLM provider, any runtime.
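Concretely, here is a minimal sketch of those three lines, using the push / pull / subscribe calls shown elsewhere on this page (that `push` returns the shareable URL is inferred from the subscriber example below):

```python
from contextrelay import ContextRelay

relay = ContextRelay(api_key="cr_...")  # self-hosters: same client, your Worker URL
payload = "…imagine 50 KB of research notes…"

url = relay.push(payload, channel="research")  # 1. park the payload at the edge
notes = relay.pull(url)                        # 2. any agent, any provider pulls by URL
relay.subscribe("research", lambda u: print(relay.pull(u)))  # 3. or react when it lands (blocks)
```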
Five things you can build with it
Five different shapes of multi-agent work. Pick the one that looks like yours.
Claude plans. Mistral builds. Channels coordinate them.
Two agents, one channel namespace, no human in the loop. Claude pushes an architecture to `task-assigned`; Mistral's subscriber fires in ~20 ms, implements the codebase, and publishes back to `task-done`.
```python
# mistral_engineer.py — run this first in a terminal; it blocks on the
# "task-assigned" channel and waits for Claude.
import os

from contextrelay import ContextRelay
from mistralai import Mistral

relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
mistral = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

def on_task(url: str):
    # Pull the architecture document Claude pushed to the channel.
    architecture = relay.pull(url)
    code = mistral.chat.complete(
        model="mistral-large-latest",
        messages=[{"role": "user", "content":
            "Implement this architecture as runnable code:\n\n" + architecture}],
    ).choices[0].message.content
    # Publish the implementation back for whoever watches "task-done".
    relay.push(code, channel="task-done", metadata={"role": "implementation"})

relay.subscribe("task-assigned", on_task)  # blocks, waits for Claude
```

Then paste this prompt into Claude — the moment it pushes to `task-assigned`, Mistral fires automatically:

```text
You are a senior software architect in an autonomous pipeline. Your engineer
(Mistral) is subscribed to channel "task-assigned". The moment you push your
architecture there, Mistral implements it.

Task: Design a production-ready FastAPI task management API. Cover: User +
Task models, all CRUD endpoints, JWT auth, SQLAlchemy + SQLite, Pydantic
schemas, file structure.

Steps:
1. Write the complete architecture document
2. Use push_context — set channel="task-assigned"
3. Subscribe / poll "task-done" to receive the implementation back

Start now. The pipeline is fully autonomous.
```
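If you'd rather drive the architect side from code than from a pasted prompt, the same handoff looks like this — a sketch assuming the Anthropic Python SDK; the model name is illustrative:

```python
# claude_architect.py — the other half of the pipeline, driven from code.
import os

from anthropic import Anthropic
from contextrelay import ContextRelay

relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
claude = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

architecture = claude.messages.create(
    model="claude-sonnet-4-5",  # illustrative; any current Claude model works
    max_tokens=4096,
    messages=[{"role": "user", "content":
        "Design a production-ready FastAPI task management API: User + Task "
        "models, all CRUD endpoints, JWT auth, SQLAlchemy + SQLite, "
        "Pydantic schemas, file structure."}],
).content[0].text

# Hand off to Mistral, then wait for the implementation to come back.
relay.push(architecture, channel="task-assigned", metadata={"role": "architecture"})
relay.subscribe("task-done", lambda url: print(relay.pull(url)))  # wait for Mistral
```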
Drop-in for your agent framework
First-class integrations for the three biggest multi-agent frameworks. Same SDK underneath — pick the binding that matches your stack.
LangChain

```python
from contextrelay.integrations.langchain import (
    ContextRelayRetriever,
)

retriever = ContextRelayRetriever(
    api_key=API_KEY, channel="research"
)
docs = retriever.get_relevant_documents("...")
```

CrewAI

```python
from crewai import Agent

from contextrelay.integrations.crewai import (
    ContextRelayPushTool, ContextRelayPullTool,
)

agent = Agent(
    role="Researcher",
    tools=[ContextRelayPushTool(api_key=API_KEY)],
)
```

AutoGen

```python
from contextrelay.integrations.autogen import (
    get_autogen_tools,
)

tools = get_autogen_tools(api_key=API_KEY)
# pass to your AssistantAgent / FunctionTool list
```

What you get
How it compares
Honest comparison: ContextRelay is the cross-provider option. For single-LLM use cases, the platform-native memory is fine; once you span providers or need pub/sub fan-out, it isn't enough.
| | ContextRelay | Anthropic Memory | OpenAI Assistants | Mem0 / Letta | Redis pub/sub |
|---|---|---|---|---|---|
| Best for | Cross-provider handoff & coordination | Single-user Claude conversations | OpenAI Assistants only | Semantic recall over user history | Real-time messaging in your stack |
| Cross-LLM | Yes | No (Claude-only) | No (OpenAI-only) | Yes | Yes |
| Multi-agent pub/sub | Yes (~20 ms fan-out) | No | No | No | Yes |
| Self-host | Yes (one wrangler deploy) | No | No | Letta: yes / Mem0: cloud | Yes |
| Client-side E2EE | Yes (fragment-key) | No | No | No | TLS only |
| Default lifetime | Ephemeral (24 h TTL) | Persistent | Persistent | Persistent | In-memory |
| Hosted price | Free / $29 / $99 | Per-token | Per-token | Mem0 / Letta cloud plans | Varies by vendor |
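The "fragment-key" entry is worth unpacking: the relay stores only ciphertext, and the symmetric key rides in the URL fragment, which HTTP clients never transmit to servers. A minimal hand-rolled sketch of the idea — `push_encrypted` and `pull_encrypted` are hypothetical helpers built on the documented push / pull calls, not SDK API:

```python
# e2ee_sketch.py — fragment-key end-to-end encryption, illustrated by hand.
import os

from cryptography.fernet import Fernet
from contextrelay import ContextRelay

relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])

def push_encrypted(plaintext: str, channel: str) -> str:
    key = Fernet.generate_key()                    # generated client-side, never sent
    ciphertext = Fernet(key).encrypt(plaintext.encode()).decode()
    url = relay.push(ciphertext, channel=channel)  # relay sees ciphertext only
    return f"{url}#{key.decode()}"                 # key travels in the URL fragment

def pull_encrypted(url_with_key: str) -> str:
    url, _, key = url_with_key.partition("#")      # fragment never hits the server
    return Fernet(key.encode()).decrypt(relay.pull(url).encode()).decode()
```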
Simple pricing
Free to start. No credit card. Or self-host for free, forever.
Pick your path
Self-host on your Cloudflare account, or let us run it for you. Same SDK either way.