Your AI finally
remembers.
Recallr is a versioned memory graph for conversational agents. No more amnesia. No more stale facts. Just persistent, self-consistent, evolving knowledge — at any scale.
98.9% accuracy.
Best in Class.
Evaluated on LongMemEval across 500 questions spanning six memory task types. Compared against Mem0, Zep, and Supermemory on identical configurations.
Average accuracy
vs Mem0 (best) improvement
Low-Latency accuracy
Knowledge-Update (Agentic)
99.2% vs 48.1% (Mem0) vs 44.4% (Zep)
+51.1pp improvement over the nearest competitor. Recallr knows the difference between when something happened and when you mentioned it. So 'last summer' means last summer, not last Tuesday.
100.0% Agentic · 96.2% Balanced
When facts change, old versions are archived, not erased. Your agent always has the latest truth, and the full history of how it got there.
Fast when you need it. Deep when it matters.
| Strategy | Min | P25 | P50 | P75 | P95 | Max |
|---|---|---|---|---|---|---|
| Recallr (Low-Latency) | 0.234s | 0.265s | 0.298s | 0.338s | 0.396s | 0.750s |
| Recallr (Balanced) | 1.032s | 1.134s | 1.197s | 1.286s | 1.543s | 3.548s |
| Recallr (Agentic) | 5.125s | 6.182s | 6.987s | 7.765s | 8.432s | 20.095s |
| Mem0 (Graph) | 0.697s | 0.734s | 0.945s | 1.987s | 2.578s | 10.458s |
| Mem0 (Non-Graph) | 0.489s | 0.489s | 0.789s | 0.967s | 1.765s | 6.187s |
| Zep | 0.489s | 0.892s | 1.345s | 1.987s | 3.416s | 4.225s |
| Supermemory | 0.392s | 0.845s | 1.298s | 1.876s | 3.214s | 4.242s |
Low-Latency Recall
Real-time voice and chat — answer before the user finishes thinking.
Balanced Recall
General production workloads — thorough retrieval without perceptible delay.
Agentic Recall
Deep reasoning over months of memory — when completeness matters more than speed.
Two loops.
Infinite memory.
Asynchronous Memory Curation
Every conversation leaves a richer graph than it found.
Conversations become structured knowledge — not raw text logs.
Contradictions get flagged, not silently overwritten.
Facts evolve over time — every version preserved, none lost.
You: "I moved to Delhi last month."
→ Previous memory updated.
"Location: Bengaluru" archived.
"Location: Delhi" marked current.
Version chain preserved.
Synchronous Context Retrieval
The right memory, at the right depth, before the LLM responds.
Auto-Recall routes every query to the right strategy.
Retrieval depth adapts to query complexity — automatically.
Graph traversal surfaces connected memories, not just direct matches.
Session summaries provide episodic context when facts aren’t enough.
You: "What restaurants did I like when I lived in Bengaluru?"
→ Location history traversed.
Past preferences recalled across 3 sessions.
Response grounded in memory — not hallucination.
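The routing step above can be sketched as a toy heuristic. This is illustrative only: Recallr's actual Auto-Recall classifier is not public, and the cue lists and word-count threshold below are invented for the example.

```python
from enum import Enum

class RecallStrategy(Enum):
    LOW_LATENCY = "low_latency"
    BALANCED = "balanced"
    AGENTIC = "agentic"

def route_query(query: str) -> RecallStrategy:
    """Toy router: pick a recall strategy from surface cues in the query.

    Hypothetical heuristic, not Recallr's real classifier.
    """
    q = query.lower()
    # Multi-hop phrasing suggests deep graph traversal is needed.
    if any(cue in q for cue in ("when i lived", "history", "over time", "across")):
        return RecallStrategy.AGENTIC
    # Short, direct lookups can take the fast path.
    if len(q.split()) <= 6:
        return RecallStrategy.LOW_LATENCY
    return RecallStrategy.BALANCED

route_query("Where do I live?")
# → RecallStrategy.LOW_LATENCY
route_query("What restaurants did I like when I lived in Bengaluru?")
# → RecallStrategy.AGENTIC
```

The real router would weigh embedding-level signals rather than keywords, but the shape is the same: cheap classification up front, so only genuinely complex queries pay the deep-retrieval cost.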
Memory that works like you think.
```python
# Initialize Recallr memory agent
from recallr import MemoryGraph, RecallStrategy

agent = MemoryGraph(user_id="maya_01")

# Session 1 — March 2024
agent.ingest(conversation=[
    {"role": "user", "content": "I live in Bengaluru"},
])
# → Memory stored: "user lives in Bengaluru" [v1]

# Session 2 — August 2024
agent.ingest(conversation=[
    {"role": "user", "content": "I moved to Delhi last week"},
])
# → Version chain updated: v1 → v2 [TEMPORAL_CONFLICT resolved]
# → "user now lives in Delhi" [CURRENT]

# Query — any time
context = agent.recall(
    query="Where does the user live?",
    strategy=RecallStrategy.AUTO,
)
# → Retrieved: "user lives in Delhi" [98.9% accuracy]
```

Two lines of code.
Permanent memory.
A dead-simple, drop-in replacement for your existing agents. Route your OpenAI, Anthropic, or Gemini calls through Recallr — no SDK changes, no architecture rewrites, no new abstractions.
Zero refactoring
Your existing code works unchanged.
Any model
OpenAI, Anthropic, Gemini. One proxy.
Instant memory
Every user gets a persistent profile.
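The zero-refactor claim comes down to wrapping, not rewriting. A minimal sketch of the pattern, with stand-in stubs (`stub_chat`, `stub_recall` are hypothetical placeholders for your LLM call and Recallr's recall endpoint):

```python
def make_memory_wrapped(chat_fn, recall_fn, user_id):
    """Wrap any chat-completion callable so recalled memories are
    prepended as a system message. The wrapped function keeps the
    original signature, so calling code is unchanged."""
    def wrapped(messages, **kwargs):
        memories = recall_fn(user_id, messages[-1]["content"])
        if memories:
            context = "Known about this user: " + "; ".join(memories)
            messages = [{"role": "system", "content": context}] + messages
        return chat_fn(messages, **kwargs)
    return wrapped

# Stubs standing in for a real LLM client and the Recallr recall API
def stub_chat(messages, **kwargs):
    return {"n_messages": len(messages), "first_role": messages[0]["role"]}

def stub_recall(user_id, query):
    return ["lives in Delhi"] if "live" in query else []

chat = make_memory_wrapped(stub_chat, stub_recall, "maya_01")
out = chat([{"role": "user", "content": "Where do I live?"}])
# → {"n_messages": 2, "first_role": "system"}
```

In proxy mode the same injection happens server-side, which is why switching providers under the proxy never touches the memory layer.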
Memory for every long-running agent.
Clinical Context Persistence
Patient preferences, medication history, appointment tracking across months of interactions — with conflict detection when contradictory symptoms emerge.
healthcare · longitudinal
Adaptive Learning Memory
Track concepts learned, misconceptions corrected, preferred explanation styles — versioned to show learning progression over time.
education · personalization
Codebase & Preference Recall
Remember architectural decisions, preferred patterns, project context across sessions. Never explain your stack twice.
developer tools · productivity
CRM-Grade Memory
Customer preferences, past issues, escalation history — retrieved in <300ms even at millions-of-users scale.
enterprise · CRM
See exactly what you save.
We give $5 to free accounts every month.
Session
Recallr Pipeline
LLM Pricing
Naive approach: every session re-sends the entire raw conversation history as context, so cumulative cost grows quadratically with session count. Recallr: fixed pipeline overhead per session regardless of history size, so cost grows linearly.
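The quadratic-versus-linear gap is easy to check with back-of-envelope arithmetic. The figures below (1,000 tokens per session, 300 tokens of pipeline overhead) are illustrative assumptions, not Recallr's actual numbers:

```python
def naive_cumulative_tokens(sessions, tokens_per_session=1_000):
    # Session k re-sends all k-1 prior sessions plus its own turn,
    # so total spend is 1 + 2 + ... + n sessions' worth of tokens.
    return sum(k * tokens_per_session for k in range(1, sessions + 1))

def recallr_cumulative_tokens(sessions, tokens_per_session=1_000, overhead=300):
    # Fixed pipeline overhead per session, independent of history length.
    return sessions * (tokens_per_session + overhead)

naive_cumulative_tokens(100)    # → 5,050,000 tokens
recallr_cumulative_tokens(100)  # → 130,000 tokens
```

At 100 sessions the naive approach spends roughly 39× more tokens, and the ratio keeps widening because one curve is O(n²) and the other O(n).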
Common questions.
How is Recallr different from a RAG system?
RAG retrieves static document chunks. Recallr maintains a versioned knowledge graph that evolves over time, with conflict resolution and temporal provenance. It knows what changed, when, and why — RAG just knows what was stored.
What happens when conflicting information is detected?
Recallr classifies the conflict type (temporal update, correction, preference change, or contradiction) and applies the appropriate resolution strategy. For ambiguous conflicts, it can prompt user clarification rather than silently overwriting.
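A toy version of that classification step might look like the following. The enum mirrors the four conflict types named above; the keyword cues are invented for illustration and are far cruder than a production classifier:

```python
from enum import Enum

class ConflictType(Enum):
    TEMPORAL_UPDATE = "temporal_update"      # fact changed over time
    CORRECTION = "correction"                # earlier statement was wrong
    PREFERENCE_CHANGE = "preference_change"  # taste shifted
    CONTRADICTION = "contradiction"          # ambiguous, needs clarification

def classify_conflict(new_statement: str) -> ConflictType:
    """Toy rule-based classifier; cue lists are illustrative only."""
    s = new_statement.lower()
    if any(cue in s for cue in ("moved", "no longer", "anymore", "changed to")):
        return ConflictType.TEMPORAL_UPDATE
    if any(cue in s for cue in ("actually", "i meant", "correction")):
        return ConflictType.CORRECTION
    if any(cue in s for cue in ("prefer", "rather", "into")):
        return ConflictType.PREFERENCE_CHANGE
    return ConflictType.CONTRADICTION

classify_conflict("I moved to Delhi last week")
# → ConflictType.TEMPORAL_UPDATE
```

The resolution strategy then follows from the type: temporal updates archive the old version, corrections can demote it, and true contradictions fall through to user clarification instead of a silent overwrite.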
Can Recallr work with any LLM?
Yes. The memory layer is model-agnostic. While the default stack uses Claude 3.5 Sonnet for memory generation, any LLM can query the graph via the recall API. Swap models without losing memory.
How does versioning work in practice?
Each memory entity maintains a linked-list version chain. New information creates a new version node with a "supersedes" edge to the previous version, along with dual timestamps (event time and ingestion time) for temporal provenance.
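That linked-list chain can be sketched in a few lines. The class and field names below are illustrative, not Recallr's internal schema; only the shape (a "supersedes" link plus dual timestamps) comes from the description above:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class MemoryVersion:
    value: str
    event_time: date              # when the fact became true in the world
    ingestion_time: date          # when the system learned it
    supersedes: Optional["MemoryVersion"] = None  # link to prior version

class VersionChain:
    def __init__(self):
        self.head = None  # current version

    def update(self, value, event_time, ingestion_time):
        # New information creates a new node pointing at the old head;
        # previous versions are archived, never erased.
        self.head = MemoryVersion(value, event_time, ingestion_time,
                                  supersedes=self.head)

    def history(self):
        node, out = self.head, []
        while node:
            out.append(node.value)
            node = node.supersedes
        return out  # newest first

loc = VersionChain()
loc.update("Bengaluru", date(2023, 1, 10), date(2024, 3, 5))
loc.update("Delhi", date(2024, 7, 28), date(2024, 8, 4))
loc.history()
# → ["Delhi", "Bengaluru"]
```

Keeping event time separate from ingestion time is what lets "last summer" resolve correctly even when the user only mentions it months later.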
What's the latency overhead?
Zero for ingestion — curation runs asynchronously. For recall: <300ms (Low-Latency), 900-1500ms (Balanced), or 5-8s (Agentic). Choose the strategy that fits your use case.
Is the memory graph queryable historically?
Yes. You can query the graph at any point in time using temporal filters. Retrieve what the system knew at a specific date, trace the evolution of a fact, or audit the full version history of any entity.
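An as-of query over such dual-timestamped versions reduces to a filter plus a max. A minimal sketch, with invented sample data (the tuple layout and `known_as_of` helper are hypothetical, not the recall API):

```python
from datetime import date

# Each record: (value, event_time, ingestion_time)
versions = [
    ("Bengaluru", date(2023, 1, 10), date(2024, 3, 5)),
    ("Delhi",     date(2024, 7, 28), date(2024, 8, 4)),
]

def known_as_of(versions, as_of):
    """What the system *believed* on `as_of`: the latest version whose
    ingestion_time falls on or before that date."""
    seen = [v for v in versions if v[2] <= as_of]
    return max(seen, key=lambda v: v[2])[0] if seen else None

known_as_of(versions, date(2024, 6, 1))
# → "Bengaluru"
known_as_of(versions, date(2024, 9, 1))
# → "Delhi"
```

Filtering on event time instead answers the complementary question: where the user actually was on a given date, regardless of when the system found out.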
Stop building agents
that forget.
Join the waitlist for early API access.
Early access to API docs · No spam