How to wire AI agents together across providers, frameworks, and runtimes — without burning tokens on data transit.
Every agent in your pipeline has a token budget. When Agent A produces a large output — a code review, a research report, a database dump — and Agent B needs it, you have two options: paste the full text into B's prompt (expensive) or truncate it (lossy).
ContextRelay gives you a third option. Agent A pushes the payload to the edge and gets back an 80-character URL. Agent B receives the URL, then pulls the full payload directly — it never appears in A's conversation again. For sensitive payloads, add `encrypted=True` and the data is encrypted in your process before it leaves your machine — Cloudflare only ever sees ciphertext.
```bash
pip install contextrelay
```

Sign up and create an API key — copy the `cr_live_...` value and store it in an env var:
```bash
export CONTEXTRELAY_API_KEY="cr_live_..."
```

```python
import os
from contextrelay import ContextRelay
# No base_url needed — defaults to the managed cloud
relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
url = relay.push("any large text, JSON, or Markdown — up to 25 MB")
data = relay.pull(url)  # retrieve from any agent, any machine
```

Self-hosting? Point `base_url` at your Worker — same SDK, same protocol.
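That switch is a one-line change — a minimal sketch, assuming your Worker is deployed at the placeholder host `relay.example.com`:

```python
# Same SDK, same protocol — only base_url changes
relay = ContextRelay(
    api_key=os.environ["CONTEXTRELAY_API_KEY"],
    base_url="https://relay.example.com",  # placeholder for your Worker URL
)
```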
Swap models mid-pipeline without losing context. Claude does a thorough code review; the review is too large to fit alongside the follow-up instructions in Mistral's context. ContextRelay bridges them.
```python
import os, anthropic
from mistralai import Mistral
from contextrelay import ContextRelay
relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
claude = anthropic.Anthropic()
mistral = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
# ── Agent A: Claude reviews the PR ──────────────────────────────
diff = open("pr_diff.txt").read() # ~20 KB of git diff
review = claude.messages.create(
    model="claude-opus-4-5",
    max_tokens=4096,
    messages=[{"role": "user", "content": f"Code review:\n\n{diff}"}],
).content[0].text
# Store the review — hand off only the URL
review_url = relay.push(review, metadata={"pr": "PR-441", "type": "review"})
# ── Agent B: Mistral turns the review into Jira tickets ─────────
full_review = relay.pull(review_url) # fetched directly by Mistral
tickets = mistral.chat.complete(
    model="mistral-large-latest",
    messages=[{
        "role": "user",
        "content": f"Convert this review into Jira tickets:\n\n{full_review}",
    }],
).choices[0].message.content
```

A specialist crew — researcher, writer, fact-checker, editor — shares one channel namespace. No custom Redis schema, no scratch files. The first-class integrations handle the wire format; you write the agents.
```python
from crewai import Agent, Task, Crew
from contextrelay.integrations.crewai import (
    ContextRelayPushTool, ContextRelayPullTool,
)
push = ContextRelayPushTool(api_key=API_KEY, channel="research-pipeline")
pull = ContextRelayPullTool(api_key=API_KEY)
researcher = Agent(role="Researcher",
                   goal="Find sources on quantum computing", tools=[push])
writer = Agent(role="Writer",
               goal="Draft article from research", tools=[pull, push])
checker = Agent(role="Fact-checker",
                goal="Verify each claim", tools=[pull])
crew = Crew(agents=[researcher, writer, checker],
            tasks=[...])
crew.kickoff()
```

The same push/pull pattern drops into a LangGraph graph as plain nodes:

```python
from contextrelay import ContextRelay
relay = ContextRelay(api_key=API_KEY)
def researcher_node(state):
    findings = run_llm("research the topic")
    url = relay.push(findings, channel="research")
    return {"research_url": url}

def writer_node(state):
    findings = relay.pull(state["research_url"])
    return {"draft": run_llm(f"Write from: {findings}")}
# Add as nodes in your StateGraph — payloads stay at the edge
```
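Wiring those nodes into a graph is standard LangGraph boilerplate — a sketch, assuming a dict-shaped state and the two nodes above:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class PipelineState(TypedDict, total=False):
    research_url: str   # set by researcher_node — a URL, not the payload
    draft: str          # set by writer_node

builder = StateGraph(PipelineState)
builder.add_node("researcher", researcher_node)
builder.add_node("writer", writer_node)
builder.add_edge(START, "researcher")
builder.add_edge("researcher", "writer")
builder.add_edge("writer", END)

graph = builder.compile()
result = graph.invoke({})  # state carries URLs between nodes, never full payloads
```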
For AutoGen, the integration exposes push and pull as ready-made tools:

```python
from autogen_agentchat.agents import AssistantAgent
from contextrelay.integrations.autogen import get_autogen_tools
tools = get_autogen_tools(api_key=API_KEY)
researcher = AssistantAgent(
    name="researcher",
    tools=tools,
    system_message="Push findings to ContextRelay, share the URL.",
)
```

Fire-and-forget agent tasks. The orchestrator pushes a task to a channel and moves on. A worker subscribed to the channel wakes up, processes it, and pushes the result to a done-channel. Replaces polling, queues, and callback hell.
```python
import os
from contextrelay import ContextRelay
relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
# Fire a long-running research task
task_url = relay.push(
    "Analyse the last 30 days of customer support tickets for sentiment trends",
    channel="bg-tasks",
    metadata={"task_id": "T-1042", "priority": "low"},
)
# Orchestrator returns immediately — does NOT block on the worker
```

The worker side — subscribe to the task channel, process, and push the result to the done-channel:

```python
import os, threading
from contextrelay import ContextRelay
relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
def on_task(url):
    task = relay.pull(url)
    meta = relay.peek(url)  # cheap — no payload download
    result = run_long_analysis(task)  # could take minutes
    relay.push(
        result,
        channel="bg-done",
        metadata={"task_id": meta["task_id"], "duration_s": 142},
    )
threading.Thread(
    target=relay.subscribe,
    args=("bg-tasks", on_task),
    daemon=True,
).start()
```

A consumer — often the orchestrator itself — listens on the done-channel:

```python
def on_done(url):
    meta = relay.peek(url)
    print(f"Task {meta['task_id']} done in {meta['duration_s']}s")
    result = relay.pull(url)
    notify_user(meta["task_id"], result)
relay.subscribe("bg-done", on_done) # blocking — run in a threadA Claude Code or Cursor instance running on your laptop talks to production agents running in a Cloudflare Worker. Same SDK, same channels, no bridge code. Use the local agent for exploratory work; the cloud agents pick up from where you left off.
```python
# Inside Claude Code, with the contextrelay MCP server registered:
# (~/.claude/mcp.json)
# Claude can call push_context as a native tool:
push_context(
    data="<full stack trace + repro steps>",
    channel="prod-issues",
    metadata={"severity": "high", "service": "api-gateway"},
)
# Returns: https://api.contextrelay.dev/pull/<uuid>
```

```js
// Cloudflare Worker — listens on prod-issues
import { ContextRelay } from "contextrelay-js";
export default {
  async fetch(req, env) {
    const relay = new ContextRelay({ apiKey: env.CR_KEY });
    return relay.subscribe("prod-issues", async (url) => {
      const meta = await relay.peek(url);
      if (meta.severity === "high") {
        const trace = await relay.pull(url);
        await pageOnCall(meta.service, trace);
      }
    });
  },
};
```

With `encrypted=True`, the payload is encrypted in your process before it leaves your machine. Cloudflare — and ContextRelay — only ever store ciphertext. The decryption key is embedded in the URL fragment (`#key=…`) and is never sent to any server — per RFC 3986, fragments stay client-side.

Encryption is opt-in per push. Use it for any payload containing PII, credentials, or proprietary data. Metadata stays plaintext, so your audit logs can record who pushed what without decrypting payloads.
```python
relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
# Encrypt on push — a fresh AES key is generated locally
url = relay.push(
    customer_pii_record,
    encrypted=True,
    metadata={"customer_id": "C-9842", "purpose": "compliance-review"},
)
# url → "https://.../pull/<uuid>#key=<base64-fernet-key>"
# Anyone with the full URL can decrypt; anyone without #key= cannot
result = relay.pull(url)  # decrypted locally, never on the server
```

| Field | Encrypted? | Notes |
|---|---|---|
| `data` | Yes | Fernet (AES-128-CBC + HMAC-SHA256) |
| `metadata` | No | Always plaintext — don't put secrets here |
| `key` | Never leaves the client | URL fragment — never transmitted to the server |
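Under the hood this is plain client-side Fernet over the fragment key. A minimal sketch of the idea — not the SDK's actual internals — assuming the ciphertext has already been fetched:

```python
from urllib.parse import urlsplit
from cryptography.fernet import Fernet

def decrypt_from_url(url: str, ciphertext: bytes) -> str:
    # The #key=... fragment is parsed locally — it never reached the server
    key = urlsplit(url).fragment.removeprefix("key=").encode()
    # Fernet verifies the HMAC before decrypting, so tampering fails loudly
    return Fernet(key).decrypt(ciphertext).decode()
```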
```bash
pip install "contextrelay[crypto]"   # or: pip install cryptography
```

Claude has the `push_context` MCP tool available. It will design the API and push the architecture to ContextRelay automatically.
```text
You are a senior software architect. Design a production-ready FastAPI task management API.
Your design must cover:
- Data models: User, Task (with status, priority, due_date)
- All REST endpoints: auth (register/login/me), tasks (CRUD + filter by status)
- JWT authentication flow
- SQLite + SQLAlchemy ORM setup
- Pydantic schemas for request/response validation
- File structure and key implementation decisions
Write the complete architecture document with precise details so an engineer
can implement without asking questions.
When done, use the push_context tool to save the full document to ContextRelay.
Print the returned URL clearly — your engineer (Mistral) will build the entire codebase from it.
```

Replace PASTE_URL_FROM_CLAUDE_HERE and YOUR_API_KEY below. Mistral fetches the architecture and implements the full codebase.
```text
You are a senior Python engineer. Your architect (Claude Opus) has designed a FastAPI API.
Step 1 — fetch the architecture from ContextRelay:
import requests
plan = requests.get(
    "PASTE_URL_FROM_CLAUDE_HERE",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
).text
print(plan)
Step 2 — read the architecture and implement the complete codebase:
- Every Python file described in the design
- requirements.txt
- README.md with setup and run instructions
Rules:
- Match the architect's design exactly — do not improvise
- Output complete, runnable code only
- No placeholders, no TODOs
```

Or run the whole handoff programmatically:

```python
import os, anthropic
from mistralai import Mistral
from contextrelay import ContextRelay
relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
claude = anthropic.Anthropic()
mistral = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
# ── Claude Opus: architect ───────────────────────────────────────
arch_response = claude.messages.create(
    model="claude-opus-4-5",
    max_tokens=8192,
    messages=[{
        "role": "user",
        "content": (
            "Design a production FastAPI task management API. Include data models, "
            "all endpoints, JWT auth, SQLAlchemy setup, and file structure. "
            "Be complete — an engineer will implement directly from this document."
        ),
    }],
).content[0].text
# Push architecture — hand off just the URL
arch_url = relay.push(arch_response, metadata={"role": "architecture", "project": "task-api"})
# ── Mistral Large: engineer ──────────────────────────────────────
architecture = relay.pull(arch_url) # Mistral fetches directly — 0 tokens in Claude
code_response = mistral.chat.complete(
    model="mistral-large-latest",
    messages=[{
        "role": "user",
        "content": (
            "You are a senior Python engineer. Implement this architecture as a complete, "
            f"runnable codebase. Every file. No placeholders.\n\nArchitecture:\n{architecture}"
        ),
    }],
).choices[0].message.content
# Push implementation — share the URL with your team or CI
impl_url = relay.push(code_response, metadata={"role": "implementation", "project": "task-api"})
```

`push_and_wait` blocks an orchestrator script until a Claude Code instance running in a tmux window finishes a task — no polling, no SSH, no manual copy-paste.
```bash
# terminal in your tmux session named "vibe", window 0
pip install contextrelay
contextrelay-bridge start --tmux vibe --task-channel vibe-tasks --done-channel vibe-done
```

```python
import os
from contextrelay import ContextRelay, AgentBridge
relay = ContextRelay(api_key=os.environ["CONTEXTRELAY_API_KEY"])
bridge = AgentBridge(relay, task_channel="vibe-tasks", done_channel="vibe-done")
result = bridge.push_and_wait(
    "Refactor the auth module to use Firebase. "
    "Run the type checker. Return a summary of all changed files."
)
print(result)  # full Claude Code output, stripped of UI chrome
```

| Method | What it does |
|---|---|
| `push(data, ...)` | Upload payload (str, up to 25 MB), returns URL. Options: `channel`, `encrypted`, `metadata`. |
| `pull(url)` | Download payload. Auto-decrypts if the URL contains `#key=`. |
| `peek(url)` | Fetch metadata only — no payload download. |
| `subscribe(ch, fn)` | Subscribe to a channel. Calls `fn(url)` on each push. Blocking — run in a thread. |
| `publish(ch, msg)` | Publish a message to a channel without a payload push. |
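`publish` is the one method not demonstrated elsewhere on this page — a minimal sketch, where the channel name and message are illustrative and the message semantics are whatever your subscribers agree on:

```python
# Signal subscribers without storing a payload — e.g. a control message
relay.publish("bg-tasks", "drain")
```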
Every SDK method maps to a Worker endpoint. Authenticate with `Authorization: Bearer cr_live_...`.
| Method | Endpoint | Description |
|---|---|---|
| POST | `/push` | Upload payload → `{ url, id }` |
| GET | `/pull/:id` | Download payload by ID |
| GET | `/peek/:id` | Metadata only, no payload |
| GET | `/ws/:channel` | WebSocket upgrade — pub/sub |
```bash
curl -X POST https://api.contextrelay.dev/push \
-H "Authorization: Bearer $CONTEXTRELAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"data": "hello from curl"}'Register ContextRelay as a native MCP server so Claude can push and pull context without leaving the conversation.
Register ContextRelay as a native MCP server so Claude can push and pull context without leaving the conversation.

```jsonc
// ~/.claude/mcp.json or .mcp.json in project root
{
  "mcpServers": {
    "contextrelay": {
      "command": "contextrelay-mcp",
      "env": {
        "CONTEXTRELAY_URL": "https://api.contextrelay.dev",
        "CONTEXTRELAY_API_KEY": "cr_live_..."
      }
    }
  }
}
```

Available tools: `push_context`, `peek_context`, `pull_context`.
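In a conversation the tools read like native calls — a sketch in the style of the earlier `push_context` example; the `url` parameter name and the URL itself are illustrative:

```python
# Any Claude session with the MCP server registered can retrieve a payload
pull_context(url="https://api.contextrelay.dev/pull/<uuid>")
```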
| Plan | Pushes / mo | Pulls / mo |
|---|---|---|
| Free | 1 000 | 10 000 |
| Pro | 100 000 | 1 000 000 |
| Team | 1 000 000 | 10 000 000 |
Max payload: 25 MB · TTL: 24 hours.
Ready to stop paying token tax?