Mem0 vs Zep vs Letta: Agent Memory Compared (2026)

Mem0 vs Zep (Graphiti) comes down to one rule: choose Mem0 for fast vector-first personalization you bolt onto any agent stack, and choose Zep when facts change over time and "what was true in Q1?" matters, since Graphiti stores validity windows on every fact. Letta is the third option: a full agent runtime where the model pages its own memory. Vendor recall scores disagree, so verify recall on your own workload before committing.

Key takeaways

Mem0 leads on ecosystem size, AWS Agent SDK integration, and April 2026 self-reported LongMemEval scores (94.4% at ~6,787 tokens per query).
Zep / Graphiti is the strongest temporal engine: validity windows on facts, credit-based Zep Cloud, and Graphiti OSS at 27,244 GitHub stars.
Letta (formerly MemGPT) trades a drop-in library for an OS-inspired tiered model: core, recall, and archival memory the agent manages with tools.
Benchmarks now standardize on LoCoMo, LongMemEval, and BEAM, but vendor-reported scores still disagree; run evals on your own workload.
Pricing starts free on all three; Mem0 Pro graph features land at $249/mo, Zep Flex at $125/mo for 50,000 credits.

Why agent memory became infrastructure in 2026

Three years ago, "memory" meant stuffing chat history into a context window. In 2026, it is a dedicated layer with its own benchmarks, funding rounds, and cloud partnerships. Mem0 raised $24 million in October 2025 ($3.9 million seed plus a $20 million Series A led by Basis Set Ventures) and reports serving as the exclusive memory provider for AWS's Agent SDK, with 186 million API calls processed in Q3 2025 alone. (Source: TechCrunch)

The category split is no longer "RAG or not." Updated industry comparisons published in May and June 2026 describe three production patterns: vector-first extraction and retrieval (Mem0), graph-native temporal knowledge (Zep / Graphiti), and OS-inspired tiered context (Letta). Stateless agents still work for one-shot tasks. They do not scale when users expect continuity, when institutional knowledge compounds across runs, or when token bills climb from re-injecting the same context every call.

"Memory is becoming one of their key moats now that LLMs are getting commoditized." - Taranjeet Singh, Mem0 co-founder (Source: TechCrunch)

Mem0's own State of AI Agent Memory 2026 report (April 1, 2026) argues the field now measures memory with reproducible suites rather than ad hoc demos. That shift is what makes a three-way comparison actionable for buyers, not just a feature checklist.

Three architectures teams actually compare

AI agent memory is a persistent storage and retrieval layer that lets agents retain facts, preferences, and procedural knowledge across sessions, separate from the model's finite context window. Each leading platform optimizes a different retrieval shape.

Platform	Memory class	Architecture	Runtime coupling	Self-host path
Mem0	Personalization + some institutional	Vector store + multi-signal retrieval (semantic, BM25, entity linking)	Drop-in library; framework-agnostic	OSS + managed cloud
Zep / Graphiti	Institutional + temporal context	Temporal knowledge graph with validity windows on facts	Memory server (Zep Cloud) or embed Graphiti	Graphiti OSS; Zep Cloud managed
Letta	Both; agent-managed	Three tiers: core (in-context), recall (searchable history), archival (long-term store)	Full agent runtime; tight integration	OSS server + emerging managed cloud

Vector-first systems optimize for fast fact extraction and semantic recall. Graph-native systems optimize for entity relationships and how facts evolve. OS-tiered systems hand memory paging decisions to the agent through explicit tools, which adds control at the cost of adopting a runtime. (Source: Vectorize)

Mem0: vector-first with multi-signal retrieval

Mem0 is a model-agnostic memory layer: developers call add() on conversations, Mem0 extracts atomic facts, and search() returns scoped memories before each LLM turn. Its April 2026 token-efficient algorithm reports 92.5 on LoCoMo and 94.4 on LongMemEval, averaging 6,787 tokens per retrieval query on LongMemEval, with the largest gains on temporal queries (+29.6 points) and multi-hop reasoning (+23.1 points) versus its prior algorithm. (Source: Mem0 State of AI Agent Memory 2026)

The design centers on multi-scope memory: writes carry user_id, agent_id, run_id, and app_id scopes composed at retrieval. Mem0 documents 21 framework integrations and 20 vector backends. Its April 2026 OSS path replaced external graph stores with built-in entity linking in the retrieval score. (Source: Mem0 State of AI Agent Memory 2026)
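
The scoping idea can be sketched without the SDK. The toy store below is an illustration only, not Mem0's implementation: ScopedMemoryStore, its add and search methods, and the word-overlap scoring are hypothetical stand-ins for Mem0's extraction and multi-signal retrieval. It shows only how scope fields written alongside a memory might be composed as filters at read time.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    text: str
    scopes: dict  # e.g. {"user_id": "alice", "agent_id": "concierge"}


@dataclass
class ScopedMemoryStore:
    """Toy in-memory store; hypothetical, not the Mem0 SDK."""
    records: list = field(default_factory=list)

    def add(self, text, **scopes):
        # Keep only the scope fields that were actually provided.
        kept = {k: v for k, v in scopes.items() if v is not None}
        self.records.append(MemoryRecord(text, kept))

    def search(self, query, limit=3, **scopes):
        # A record matches when every requested scope equals the stored value.
        wanted = {k: v for k, v in scopes.items() if v is not None}
        query_words = set(query.lower().split())
        scored = []
        for rec in self.records:
            if any(rec.scopes.get(k) != v for k, v in wanted.items()):
                continue
            # Word overlap stands in for real semantic or BM25 scoring.
            overlap = len(query_words & set(rec.text.lower().split()))
            if overlap:
                scored.append((overlap, rec.text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


store = ScopedMemoryStore()
store.add("prefers vegetarian food", user_id="alice", agent_id="concierge")
store.add("prefers window seats", user_id="alice", agent_id="travel")
store.add("prefers spicy food", user_id="bob", agent_id="concierge")

# Scope to one user: only Alice's memories are candidates.
alice_food = store.search("what food does the user like", user_id="alice")
# Scope to one agent: memories from every user of that agent are candidates.
concierge_food = store.search("food", agent_id="concierge")
```

Narrowing by user_id returns only Alice's food preference, while narrowing by agent_id returns both users' concierge memories. The same write can serve either read depending on which scopes the caller composes.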

Strengths: broadest integration surface, AWS Agent SDK exclusivity, and minutes-to-first-memory developer experience.

Tradeoffs: graph-style features on the managed product sit behind higher tiers. Independent evaluations cited in the June 2026 Vectorize comparison still list older Mem0 LongMemEval scores near 49.0% (arXiv 2603.04814), well below Mem0's April 2026 self-report. Treat headline benchmarks as hypotheses until you run the open evaluation framework at github.com/mem0ai/memory-benchmarks on your data. (Sources: Vectorize, Mem0)

Operator note (first-hand): A local smoke test (pip install mem0ai, then from mem0 import Memory; memory = Memory()) completed in under 30 seconds on a clean Python 3.11 venv with only OPENAI_API_KEY set. The scope fields (user_id, agent_id, run_id) are visible in the SDK types without extra wiring, which matches Mem0's positioning as a drop-in layer rather than a new agent framework.

Zep and Graphiti: temporal knowledge graphs

Zep is a context-engineering platform built on Graphiti, an open-source temporal knowledge graph engine. Graphiti decomposes ingested Episodes (chat messages, JSON payloads, or text blocks) into entities and edges, each carrying validity windows so the system can answer "who led the project in January?" differently from "who leads it now?" Zep Cloud markets sub-200ms retrieval and reports 63.8% on LongMemEval with GPT-4o, with particular strength in temporal reasoning and preference categories. (Source: Vectorize)
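
A validity window is easiest to see in a small model. The sketch below is a hedged illustration, not Graphiti's schema or API: TemporalFact, holds_on, query, and the sample project data are all hypothetical names invented for this example. It assumes half-open windows, where a fact stops holding on the day its successor begins.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalFact:
    subject: str
    relation: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still true"

    def holds_on(self, day: date) -> bool:
        # Half-open window: valid_from <= day < valid_to.
        if day < self.valid_from:
            return False
        return self.valid_to is None or day < self.valid_to


def query(facts, subject, relation, as_of):
    """Return the objects for (subject, relation) that were valid on as_of."""
    return [
        f.obj
        for f in facts
        if f.subject == subject and f.relation == relation and f.holds_on(as_of)
    ]


# Hypothetical data: leadership changes hands on 2026-03-01.
facts = [
    TemporalFact("Project Atlas", "led_by", "Dana", date(2025, 9, 1), date(2026, 3, 1)),
    TemporalFact("Project Atlas", "led_by", "Ravi", date(2026, 3, 1)),
]

led_in_january = query(facts, "Project Atlas", "led_by", date(2026, 1, 15))
leads_now = query(facts, "Project Atlas", "led_by", date(2026, 6, 1))
```

The older fact is closed rather than overwritten, so the January question and the June question return different leaders from the same store. A flat vector store that keeps only the latest value cannot answer the first question.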

The open-source Graphiti repository shows 27,244 GitHub stars as of June 2026. Teams can self-host Graphiti against Neo4j, FalkorDB, or Kuzu. Zep's self-hosted Community Edition has been deprecated per the Vectorize update, so production teams either use Zep Cloud or run a custom Graphiti deployment without Zep's higher-level API.

Strengths: best-in-class fact evolution modeling; retrieval and storage unmetered on Zep Cloud; SOC 2 Type II on Enterprise. (Source: Zep Pricing)

Tradeoffs: credit-based pricing punishes large Episodes, and the mental model is steeper than flat vector memory. Zep fits CRM copilots, compliance agents, and support bots where "when was this true?" matters as much as "what was true?"

Letta: OS-inspired tiered memory as a runtime

Letta (the project formerly known as MemGPT) treats the LLM like an operating system managing its own memory hierarchy:

Core memory stays in the active context window (like RAM).
Recall memory holds searchable recent conversation history.
Archival memory is long-term storage the agent queries on demand.

Agents use tools such as core_memory_append, archival_memory_insert, and archival_memory_search to decide what to promote, archive, or retrieve. That is fundamentally different from Mem0 and Zep, where the platform extracts and ranks facts on the agent's behalf.
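
The tiering can be illustrated with a toy class. This is a sketch under stated assumptions, not Letta's runtime: only the three tool names come from the text above, while the TieredMemory class, the method signatures, the character budget, and the substring search are hypothetical simplifications. In a real runtime the model itself would choose which tool to call.

```python
class TieredMemory:
    """Toy three-tier memory; an illustration, not Letta's implementation."""

    def __init__(self, core_limit_chars=200):
        self.core = []       # always placed in the prompt (like RAM)
        self.recall = []     # searchable conversation history
        self.archival = []   # long-term store queried on demand
        self.core_limit_chars = core_limit_chars

    def record_message(self, role, text):
        self.recall.append((role, text))

    def core_memory_append(self, text):
        # Refuse writes that would overflow the in-context budget.
        used = sum(len(item) for item in self.core)
        if used + len(text) > self.core_limit_chars:
            return False
        self.core.append(text)
        return True

    def archival_memory_insert(self, text):
        self.archival.append(text)

    def archival_memory_search(self, query, limit=3):
        # Substring match stands in for embedding search.
        words = query.lower().split()
        hits = [t for t in self.archival if any(w in t.lower() for w in words)]
        return hits[:limit]

    def build_context(self):
        # Only core memory is injected on every turn.
        return "\n".join(self.core)


memory = TieredMemory(core_limit_chars=40)
memory.record_message("user", "I'm Sam. My 2024 invoice number was INV-881.")

# The small, always-relevant fact fits in core memory.
fits = memory.core_memory_append("User's name is Sam.")
# The second fact would exceed the budget, so it goes to archival instead.
promoted = memory.core_memory_append("2024 invoice number: INV-881.")
if not promoted:
    memory.archival_memory_insert("2024 invoice number: INV-881.")

found = memory.archival_memory_search("invoice")
context = memory.build_context()
```

The invoice number never enters the prompt unless a search pulls it back, which is the paging trade the OS metaphor describes: a small fixed context, with everything else one tool call away.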

Letta ships an Agent Development Environment for inspecting memory blocks and raised a $10 million seed led by Felicis Ventures at a reported $70 million post-money valuation. (Source: Vectorize) Managed cloud pricing runs roughly $20 to $200 per month; the OSS server remains the primary self-host path.

Tradeoffs: you adopt a runtime, not a library; setup takes hours, not minutes. Choose Letta when the agent's lifecycle is the memory lifecycle, not when you only need lightweight personalization inside an existing LangGraph workflow.

Benchmarks: what LoCoMo and LongMemEval measure

Three suites now anchor most 2026 comparisons:

Benchmark	Scale	What it stresses
LoCoMo	1,540 questions	Multi-session recall: single-hop, multi-hop, open-domain, temporal
LongMemEval	500 questions	Knowledge updates, preferences, temporal reasoning, multi-session recall
BEAM	1M and 10M token contexts	Production-scale volume where window stuffing fails

Mem0's April 2026 algorithm also reports 64.1 on BEAM (1M) and 48.6 on BEAM (10M). That is a 15.5-point drop (about 24% in relative terms) as context scales 10x, a signal that temporal abstraction at scale remains an open problem even after recent gains. (Source: Mem0)

Memory recall accuracy: Mem0 vs Zep vs Letta side by side

Platform	LoCoMo	LongMemEval	Who measured it
Mem0 (April 2026 algorithm)	92.5	94.4 (~6,787 tokens/query)	Mem0 self-report
Mem0 (older algorithm)	-	~49.0	Independent eval (arXiv 2603.04814, cited by Vectorize)
Zep / Graphiti (GPT-4o)	-	63.8	Vendor-reported, cited by Vectorize
Letta	-	No current published score	Run your own eval

Read the recall accuracy table as claims with provenance, not a leaderboard. The roughly 45-point gap between Mem0's April 2026 self-report (94.4) and the independent evaluation of its older algorithm (~49.0) spans both an algorithm change and a difference in who ran the evaluation, so it is a methodology dispute rather than a settled ranking. Letta publishes no comparable 2026 number at all. (Sources: Mem0 State of AI Agent Memory 2026, Vectorize)

Inference: No single published score settles a buying decision. LoCoMo and LongMemEval focus on conversational histories, not procurement workflows or code-review agents. Use vendor numbers as orientation, then run the Mem0 open eval harness or your own golden-set queries before committing.
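
A golden-set check does not need a vendor harness to get started. The following is a minimal sketch, assuming you can wrap each platform's retrieval call in a function that takes a query and a result count and returns a list of strings. The recall_at_k function, the keyword_search stand-in backend, and the sample memories and questions are all hypothetical.

```python
from typing import Callable, Iterable


def recall_at_k(search_fn: Callable[[str, int], list],
                golden_set: Iterable[tuple],
                k: int = 5) -> float:
    """Fraction of queries whose expected fact appears in the top-k results."""
    hits = 0
    total = 0
    for query, expected in golden_set:
        total += 1
        results = search_fn(query, k)
        if any(expected.lower() in r.lower() for r in results):
            hits += 1
    return hits / total if total else 0.0


# Stand-in backend: replace with a wrapper around the platform under test.
MEMORIES = [
    "Customer moved from London to Tokyo in March",
    "Customer prefers email over phone calls",
    "Renewal date is 2026-09-30",
]


def keyword_search(query, k):
    words = set(query.lower().split())
    ranked = sorted(
        MEMORIES,
        key=lambda m: len(words & set(m.lower().split())),
        reverse=True,
    )
    return ranked[:k]


# Each entry pairs a question with a string the right memory must contain.
GOLDEN_SET = [
    ("where does the customer live", "Tokyo"),
    ("how should we contact the customer", "email"),
    ("when is the renewal date", "2026-09-30"),
    ("what plan tier is the customer on", "Enterprise"),  # never stored
]

score = recall_at_k(keyword_search, GOLDEN_SET, k=2)
```

Including questions whose answers were never stored, like the last one here, keeps the score honest: a system cannot reach 100% by returning plausible but unrelated memories.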

Pricing and what actually drives your bill

Platform	Free tier	Paid entry	Cost driver
Mem0	10,000 memory add requests/mo	$19/mo Starter (50,000 adds); $249/mo Pro (500,000 adds)	Memory add + retrieval API volume; graph on Pro
Zep Cloud	1,000 credits/mo	Flex $125/mo (50,000 credits)	Episode byte size at ingestion (1 credit per 350 bytes)
Letta	OSS self-host	~$20-200/mo managed (per Vectorize)	Agent count, model calls, hosting

Mem0 charges per memory add and retrieval request on the platform. Zep charges credits by Episode size at ingestion; retrieval, storage, and users are unmetered on cloud plans. Letta's managed tiers bundle hosting with agent runtime usage.

For prototype-to-early-production loads, expect under $200 per month on managed Mem0 Starter or Zep Flex unless your agents write large Episodes or re-index constantly. At scale, Mem0's token-per-query efficiency (under 7,000 tokens on LongMemEval per Mem0's April 2026 report) matters as much as list price. (Sources: Mem0 Pricing, Zep Pricing)
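
The effect of Episode size on a Zep bill can be estimated from the figures in the pricing table. This is a rough sketch, not a vendor calculator: it assumes each Episode is rounded up to a whole credit, ignores overage pricing and Mem0 retrieval volume, and the function names and tier lists are hypothetical helpers built from the numbers quoted above.

```python
import math

ZEP_BYTES_PER_CREDIT = 350  # from the pricing table above

# (tier name, monthly price in USD, included units)
ZEP_TIERS = [("Free", 0, 1_000), ("Flex", 125, 50_000)]
MEM0_TIERS = [("Free", 0, 10_000), ("Starter", 19, 50_000), ("Pro", 249, 500_000)]


def zep_credits(episode_bytes, episodes_per_month):
    # Assumption: each Episode is rounded up to a whole number of credits.
    per_episode = math.ceil(episode_bytes / ZEP_BYTES_PER_CREDIT)
    return per_episode * episodes_per_month


def cheapest_tier(tiers, monthly_units):
    # Tiers are listed cheapest first; pick the first one that covers usage.
    for name, price, included in tiers:
        if monthly_units <= included:
            return name, price
    return "Custom", None  # beyond listed tiers; overage is not modeled


episodes = 20_000

small = zep_credits(300, episodes)     # short chat messages
large = zep_credits(2_000, episodes)   # long JSON payloads

small_tier = cheapest_tier(ZEP_TIERS, small)
large_tier = cheapest_tier(ZEP_TIERS, large)
mem0_tier = cheapest_tier(MEM0_TIERS, episodes)  # counting adds only
```

Under these assumptions the same 20,000 monthly writes fit inside Flex when Episodes are short but exceed its 50,000 credits when each Episode is about 2,000 bytes, while 20,000 Mem0 adds fit in Starter. Measuring your real Episode sizes before choosing a plan is the practical takeaway.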

Which platform should you pick?

Pick Mem0 if you need the widest integration surface, AWS Agent SDK alignment, and the fastest drop-in path for user personalization across LangChain, CrewAI, or a custom stack. Budget for Pro at $249/mo if you need managed graph-style entity features at scale.

Pick Zep if your agent must reason over changing relationships and timestamps: customer issue evolution, org charts, compliance timelines, or "used to live in London, now Tokyo" preference shifts. Start with Graphiti OSS when you want data control; choose Zep Cloud when you want managed temporal GraphRAG without operating graph infra.

Pick Letta if you are designing stateful, long-horizon agents that should decide what enters context, what archives, and what gets forgotten. Accept runtime coupling in exchange for self-editing memory blocks and visual debugging in the ADE.

Most mature stacks combine a memory platform with caching underneath. Pair Mem0 or Zep with a Redis session cache when latency-sensitive paths cannot wait for multi-signal retrieval. (Source: Vectorize)
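
The caching layer usually follows a cache-aside pattern. The sketch below uses an in-process dictionary with expiry as a stand-in for Redis so it runs without a server. TTLCache, cached_search, and slow_memory_search are hypothetical names, and a production version would swap the dictionary for a Redis client with a per-key expiry.

```python
import time


class TTLCache:
    """In-process stand-in for a Redis session cache (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Expired entries are dropped on read.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl_seconds)


def cached_search(cache, search_fn, session_id, query):
    # Cache-aside: try the cache, fall back to the memory platform, then store.
    key = (session_id, query)
    cached = cache.get(key)
    if cached is not None:
        return cached, True
    results = search_fn(query)
    cache.set(key, results)
    return results, False


calls = []


def slow_memory_search(query):
    calls.append(query)  # record each time the backend is actually hit
    return [f"memory about {query}"]


cache = TTLCache(ttl_seconds=60)
first, first_hit = cached_search(cache, slow_memory_search, "s1", "billing")
second, second_hit = cached_search(cache, slow_memory_search, "s1", "billing")
```

The second call is served from the cache and never reaches the memory backend. The cost is staleness: a memory written during the TTL will not appear until the entry expires, so short TTLs or explicit invalidation on write are worth considering.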

Frequently asked questions

What is the best AI agent memory platform in 2026?

For a default production choice, Mem0 offers the largest ecosystem, an AWS Agent SDK partnership, and the quickest integration path. Zep is the better engine when temporal fact evolution is core to the product. Letta is the right foundation when you want agents to manage their own memory tiers inside a dedicated runtime.

What is the difference between Mem0 and Zep?

Mem0 extracts atomic facts into vector-backed stores with multi-signal retrieval and optional entity linking. Zep ingests Episodes into a temporal knowledge graph where facts carry validity windows. Mem0 optimizes drop-in personalization; Zep optimizes "what was true when?" queries across evolving entities.

Is Letta the same as MemGPT?

Yes. Letta is the current name for the project originally published as MemGPT, based on the peer-reviewed OS-inspired memory hierarchy paper. The MemGPT paging metaphor (core, recall, archival) survives in Letta's architecture.

How do LoCoMo and LongMemEval compare memory systems?

LoCoMo (1,540 questions) stresses multi-session conversational recall across four categories. LongMemEval (500 questions) adds knowledge updates, preference tracking, and harder temporal reasoning. Together they replaced ad hoc demos, though both focus on chat histories rather than task-execution agents.

Can I self-host Mem0, Zep, and Letta?

Mem0 and Letta offer full open-source self-host paths. Zep Cloud is managed only; because Zep Community Edition is deprecated, self-hosting Zep means running Graphiti directly. All three support local development without a cloud dependency when you bring your own vector or graph database.

How much does agent memory infrastructure cost?

All three ship meaningful free tiers: Mem0 at 10,000 adds per month, Zep at 1,000 credits per month, Letta via OSS. Paid managed entry sits near $19 to $125 per month for early production. Enterprise compliance (HIPAA BAA, guaranteed SLAs) pushes teams to custom Mem0 Enterprise or Zep Enterprise contracts.

References

Graphiti - https://github.com/getzep/graphiti
Mem0 Pricing - https://mem0.ai/pricing
Mem0 State of AI Agent Memory 2026 - https://mem0.ai/blog/state-of-ai-agent-memory-2026
Mem0 raises $24M (TechCrunch) - https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps
Vectorize best AI agent memory systems - https://vectorize.io/articles/best-ai-agent-memory-systems
Zep Pricing - https://www.getzep.com/pricing

AgenticWire Desk
