Open source (MIT) · Managed cloud · Cloudflare edge

Shared context for AI agents

Push large payloads to the edge. Agents on different LLMs pull them by URL. Pub/sub channels coordinate them in ~20 ms.

Open-source (MIT) or managed cloud — same SDK either way.

// without ContextRelay
Agent A → [paste 50 KB into prompt] → Agent B   ≈ 12,500 tokens burned
// with ContextRelay
Agent A → push() → "https://.../pull/uuid" → Agent B → pull(url)
                   ~80 chars · ~0 tokens · 75 ms

Two ways to run it

Same SDK, same protocol, same code. Pick the deployment that fits.

Self-host (open source)

Clone the repo, deploy the Worker to your own Cloudflare account. You own the data, the keys, the operational surface — and pay only for Cloudflare usage. The free tier covers 100K requests/day.

  • MIT license, no vendor lock-in
  • Runs on your Cloudflare account
  • No API keys, no quota
  • You handle ops
$ git clone https://github.com/cmhashim/ContextRelay
$ cd ContextRelay/api && wrangler deploy
Self-host guide on GitHub →

Managed cloud

We run the Worker for you on the global edge. You get an API key, a dashboard, metered usage, quotas, and zero ops. Same SDK as the open-source path — just point it at the managed URL.

  • Free tier: 1K pushes / 10K pulls per month
  • Dashboard for keys and usage
  • Quota enforcement and billing
  • We handle ops
$ pip install contextrelay
$ export CONTEXTRELAY_API_KEY=cr_live_...
Get a free API key →

Why now

Multi-agent stacks are exiting the demo phase. Frameworks like CrewAI, LangGraph, and AutoGen ship production fleets. Devs wire Claude, GPT, Mistral, and Gemini into the same pipeline. The bottleneck used to be reasoning — it's now moving bytes between agents.

Every framework reinvents this layer with custom Redis schemas, scratch files, or 50 KB-per-prompt token taxes. ContextRelay is the layer underneath: ephemeral storage, pub/sub channels, optional E2EE — provider-agnostic and open-source.

How it works

Three lines replace a token tax. Works across any agent, any LLM provider, any runtime.

01
Push
Agent A uploads the payload to the Cloudflare edge. Gets back a short URL pointer. Takes ~250 ms for 125 KB.
02
Relay the URL
The orchestrator passes the URL to Agent B — not the data. The URL is ~80 characters; the payload stays at the edge.
03
Pull on demand
Agent B fetches the full payload directly from the URL. ~75 ms for 125 KB. Works from any runtime — Python, Node, curl.
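
In SDK terms, the three steps above are a handful of lines. A minimal sketch using the Python client from the live demo further down; that push() returns the pointer URL follows from step 01, while pushing without a channel is an assumption (the demo always sets channel=):

# minimal round-trip sketch -- push() returning the pointer URL is per step 01;
# a channel-less push is an assumption
import os
from contextrelay import ContextRelay

relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])

# Agent A: upload once, keep only the ~80-char pointer
url = relay.push("...50 KB of research notes...")

# relay the URL string to Agent B -- not the payload

# Agent B: fetch the full payload on demand, from any runtime
notes = relay.pull(url)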

Live demo

Claude plans. Mistral builds. Channels coordinate them.

Two agents, one channel namespace, no human in the loop. Claude pushes an architecture to task-assigned; Mistral's subscriber fires in ~20 ms, implements the codebase, and publishes back to task-done.

shared API key · same channels · autonomous
Claude Opus design → push(arch, channel="task-assigned")
↓ subscriber fires in ~20 ms
Mistral Large pull(url) → implement → push(code, channel="task-done")
↓ architect notified
Claude Opus receives implementation · reviews · done
1
Terminal 1 — Mistral engineer
Subscribes to task-assigned · waits for Claude
mistral_engineer.py
# mistral_engineer.py — run this first in a terminal
import os
from mistralai import Mistral
from contextrelay import ContextRelay

relay   = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
mistral = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

def on_task(url: str):
    architecture = relay.pull(url)
    code = mistral.chat.complete(
        model="mistral-large-latest",
        messages=[{"role": "user", "content":
            "Implement this architecture as runnable code:\n\n" + architecture}],
    ).choices[0].message.content
    relay.push(code, channel="task-done", metadata={"role": "implementation"})

relay.subscribe("task-assigned", on_task)  # blocks, waits for Claude
2
Claude Code — paste this prompt
Pushes to task-assigned · Mistral fires automatically
Claude Code prompt
You are a senior software architect in an autonomous pipeline.
Your engineer (Mistral) is subscribed to channel "task-assigned".
The moment you push your architecture there, Mistral implements it.

Task: Design a production-ready FastAPI task management API.
Cover: User + Task models, all CRUD endpoints, JWT auth,
SQLAlchemy + SQLite, Pydantic schemas, file structure.

Steps:
1. Write the complete architecture document
2. Use push_context — set channel="task-assigned"
3. Subscribe / poll "task-done" to receive the implementation back

Start now. The pipeline is fully autonomous.
[architect] Architecture pushed → task-assigned
[engineer] Task received — implementing...
[engineer] Implementation published → task-done
[architect] Implementation received ✓

Drop-in for your agent framework

First-class integrations for the three biggest multi-agent frameworks. Same SDK underneath — pick the binding that matches your stack.

LangChain
pip install "contextrelay[langchain]"
from contextrelay.integrations.langchain import (
    ContextRelayRetriever,
)

retriever = ContextRelayRetriever(
    api_key=API_KEY, channel="research"
)
docs = retriever.get_relevant_documents("...")
CrewAI
pip install "contextrelay[crewai]"
from crewai import Agent
from contextrelay.integrations.crewai import (
    ContextRelayPushTool, ContextRelayPullTool,
)

agent = Agent(
    role="Researcher",
    tools=[ContextRelayPushTool(api_key=API_KEY)],
)
AutoGen
pip install "contextrelay[autogen]"
from contextrelay.integrations.autogen import (
    get_autogen_tools,
)

tools = get_autogen_tools(api_key=API_KEY)
# pass to your AssistantAgent / FunctionTool list

What you get

⚡
Sub-100 ms pulls
Cloudflare Workers + KV across 300+ PoPs. Your context lives milliseconds from wherever your agents run.
📡
WebSocket pub/sub
Subscribe a channel, receive pointer URLs in ~20 ms when any agent pushes. No polling, no callbacks to manage.
🔍
Peek before you pull
Read metadata — size, tags, TTL — without downloading the payload. Route or skip without touching your token budget (see the sketch after this grid).
🔗
Any LLM, any provider
Claude, GPT-4, Mistral, Gemini — the URL is just a string. Pass it anywhere without touching the payload.
🛠️
MCP native tools
Register as an MCP server. Claude Desktop and Claude Code get push_context, peek_context, pull_context natively.
📖
Open source
MIT-licensed. Clone the repo, deploy to your Cloudflare account, run it forever. No vendor lock-in, no telemetry.
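
The peek flow in SDK terms, as a hedged sketch: relay.peek() and the metadata field names below mirror the peek_context MCP tool named above, but both are assumptions; check the SDK reference for exact names.

# hypothetical peek(): read metadata only, leave the payload at the edge
import os
from contextrelay import ContextRelay

relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
url = "https://.../pull/uuid"  # pointer received from a channel or another agent

meta = relay.peek(url)  # assumed to return size, tags, and TTL
if meta["size_bytes"] < 200_000 and "research" in meta.get("tags", []):
    payload = relay.pull(url)  # download only when it is worth the bytes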

How it compares

Honest comparison. ContextRelay is the cross-provider option. For single-LLM use cases, the platform-native memory is fine — when you span providers or need pub/sub, you need something different.

|                     | ContextRelay                           | Anthropic Memory                 | OpenAI Assistants       | Mem0 / Letta                      | Redis pub/sub                    |
|---------------------|----------------------------------------|----------------------------------|-------------------------|-----------------------------------|----------------------------------|
| Best for            | Cross-provider handoff & coordination  | Single-user Claude conversations | OpenAI Assistants only  | Semantic recall over user history | Real-time messaging in your stack |
| Cross-LLM           | Yes                                    | No (Claude-only)                 | No (OpenAI-only)        | Yes                               | Yes                              |
| Multi-agent pub/sub | Yes (~20 ms fan-out)                   | No                               | No                      | No                                | Yes                              |
| Self-host           | Yes (one wrangler deploy)              | No                               | No                      | Letta yes / Mem0 cloud            | Yes                              |
| Client-side E2EE    | Yes (fragment-key)                     | No                               | No                      | No                                | TLS only                         |
| Default lifetime    | Ephemeral (24 h TTL)                   | Persistent                       | Persistent              | Persistent                        | In-memory                        |
| Hosted price        | Free / $29 / $99                       | Per-token                        | Per-token               | Mem0 / Letta cloud                | Many vendors                     |
Use the right tool. If you need persistent semantic recall, use Mem0 or Letta. If you only call one LLM, the platform's built-in memory is fine. ContextRelay's sweet spot is the moment you have agents on different providers, a 50 KB payload, and want it to cost ~20 tokens — with optional encryption and channel coordination on top.
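
On the fragment-key E2EE row: the decryption key rides in the URL fragment, which clients never transmit to the server, so the edge only ever stores ciphertext. A conceptual sketch of that pattern (illustration only, not the SDK's built-in encryption API; push() returning the URL is per step 01):

# fragment-key E2EE, conceptually -- not the SDK's real encryption API
import os
from cryptography.fernet import Fernet
from contextrelay import ContextRelay

relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])

# sender: encrypt locally, push ciphertext, append the key as a URL fragment
key = Fernet.generate_key()
token = Fernet(key).encrypt(b"secret architecture doc").decode()
shared_url = relay.push(token) + "#" + key.decode()

# receiver: split off the fragment, pull ciphertext, decrypt locally
pointer, _, frag = shared_url.partition("#")
plaintext = Fernet(frag.encode()).decrypt(relay.pull(pointer).encode())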

Simple pricing

Free to start. No credit card. Or self-host for free, forever.

Without ContextRelay: 50 KB context × 1,000 handoffs/day ≈ $15/day in token tax (~$450/month).
With ContextRelay Pro: $29/month covers 100,000 handoffs.
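
The arithmetic behind that figure, as a sketch; the ~4 bytes/token and the ~$1.20 per million input tokens it implies are rough assumptions:

# back-of-envelope for the token-tax callout above (rough assumptions)
bytes_per_handoff = 50_000
tokens_per_handoff = bytes_per_handoff / 4  # ~4 bytes/token, ≈ 12,500 tokens
handoffs_per_day = 1_000
usd_per_million_tokens = 1.20  # implied blended input price (assumption)
daily_tax = tokens_per_handoff * handoffs_per_day / 1e6 * usd_per_million_tokens
print(f"${daily_tax:.0f}/day, ~${daily_tax * 30:.0f}/month")  # $15/day, ~$450/month
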
Free
$0/mo
  • 1,000 pushes / mo
  • 10,000 pulls / mo
  • 2 API keys
  • Fits a 2-agent prototype
Start free
Pro
$29/mo
  • 100,000 pushes / mo
  • 1M pulls / mo
  • 10 API keys
  • Fits a CrewAI fleet 8 hrs/day
Get Pro
Team
$99/mo
  • 1M pushes / mo
  • 10M pulls / mo
  • 100 API keys
  • Fits a multi-tenant agent platform
Get Team

Pick your path

Self-host on your Cloudflare account, or let us run it for you. Same SDK either way.