Zep vs. Cognee

Agent memory built to run, not assemble

Cognee is an open-source toolkit you wire together and operate. Zep is agent memory at enterprise scale, delivered as a managed Context Lake — bi-temporal context graphs, governance in the substrate, and sub-200ms retrieval, out of the box.

Key takeaways

The pieces, or the system

  • Cognee is an open-source ECL pipeline you point at your own graph and vector backends, then host and operate. Zep is a managed Context Lake — one runtime, one SDK.
  • Zep models time at the data layer: every fact carries valid-from and valid-to timestamps, so point-in-time queries follow from the model rather than logic you build above it.
  • Governance (attribute-based access control, or ABAC; retention with legal hold; audit) is enforced in the substrate, alongside SOC 2 Type II, HIPAA, and BYOC.
  • Zep reports 94.7% LoCoMo accuracy at 87ms p50 and 90.2% LongMemEval accuracy at 104ms p50 (see Benchmarks).

The distinction

Cognee gives you the pieces. Zep runs the system.

What Cognee is. Cognee is an open-source ECL (extract, cognify, load) pipeline. You point it at your own graph and vector backends, then host and operate the result yourself. For teams that want a fully open-source core and maximum control over those backends, that's the appeal.

What Zep is. Zep manages, governs, and serves agent memory for you: the Context Lake for AI agents. It ingests chat, JSON, app events, documents, and business data through a single SDK, unifies them in one bi-temporal context graph per subject (via open-source Graphiti on Zep's Context Graph Engine), and serves token-efficient context in under 200ms, with no backend cluster to size, shard, and keep alive.

Agent Runtime

LangChain·LlamaIndex·CrewAI·Google ADK·custom

Any agent framework — or none. The Context Lake is invoked through a single SDK.

Ingestion

chat·JSON·documents·app events

Raw signal arrives from any source the agent touches.

Context Assembly

context blocks·templates·token-efficient

Relevant context is assembled on demand into token-efficient blocks.

entity extraction·relationships·ontology·invalidation

Signal becomes a temporal context graph as new facts arrive and stale ones are invalidated.
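As a rough illustration of what invalidation and point-in-time queries mean at the data layer, here is a minimal Python sketch. It is not Zep's API or Graphiti's implementation: the `Fact` class and the `assert_fact` and `as_of` helpers are hypothetical names invented for this example, and it models only the valid-from / valid-to axis described above.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    # Hypothetical fact record; field names are assumptions for illustration.
    subject: str
    predicate: str
    value: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still valid"


def assert_fact(facts: list[Fact], new: Fact) -> list[Fact]:
    """Add `new`, closing any open fact it supersedes instead of deleting it."""
    updated = []
    for f in facts:
        supersedes = (
            f.subject == new.subject
            and f.predicate == new.predicate
            and f.valid_to is None
            and f.valid_from <= new.valid_from
        )
        # The stale fact is kept, but its validity window is closed.
        updated.append(replace(f, valid_to=new.valid_from) if supersedes else f)
    updated.append(new)
    return updated


def as_of(facts: list[Fact], when: datetime) -> list[Fact]:
    """Return the facts valid at `when` (half-open interval [from, to))."""
    return [
        f for f in facts
        if f.valid_from <= when and (f.valid_to is None or when < f.valid_to)
    ]


facts: list[Fact] = []
facts = assert_fact(facts, Fact("alice", "plan", "free", datetime(2024, 1, 1)))
facts = assert_fact(facts, Fact("alice", "plan", "pro", datetime(2024, 6, 1)))

print([f.value for f in as_of(facts, datetime(2024, 3, 15))])  # ['free']
print([f.value for f in as_of(facts, datetime(2024, 7, 1))])   # ['pro']
```

Because the superseded fact is closed rather than deleted, the March query still returns the plan that was true at the time, with no extra logic layered on top of the store.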

Retrieval

sub-200ms·auto-optimized·provenance-linked·policy-filtered

Selects what's relevant and what adds the most information within the token budget.
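One way to picture selection under a token budget is a greedy pass that trades relevance against redundancy. The sketch below is an assumption-laden illustration, not a description of how Zep's retrieval works: the `Candidate` type, the word-count tokenizer, the Jaccard overlap measure, and the `penalty` weight are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    relevance: float  # hypothetical score from an upstream ranker, 0..1


def token_count(text: str) -> int:
    # Crude stand-in for a real tokenizer: whitespace-separated words.
    return len(text.split())


def overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets, a rough redundancy measure."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def marginal_gain(c: Candidate, chosen: list[Candidate], penalty: float) -> float:
    """Relevance, discounted by similarity to what is already selected."""
    redundancy = max((overlap(c.text, s.text) for s in chosen), default=0.0)
    return c.relevance - penalty * redundancy


def select(candidates: list[Candidate], budget: int, penalty: float = 0.5) -> list[Candidate]:
    """Greedily take the highest-gain candidate that still fits the budget."""
    chosen: list[Candidate] = []
    remaining = list(candidates)
    used = 0
    while remaining:
        best = max(remaining, key=lambda c: marginal_gain(c, chosen, penalty))
        remaining.remove(best)
        cost = token_count(best.text)
        if used + cost <= budget:  # skip items that would overflow the budget
            chosen.append(best)
            used += cost
    return chosen


picked = select(
    [
        Candidate("Alice upgraded to the pro plan in June", 0.9),
        Candidate("Alice upgraded to the pro plan", 0.85),
        Candidate("Alice prefers email over phone support", 0.6),
    ],
    budget=14,
)
for c in picked:
    print(c.text)
# Alice upgraded to the pro plan in June
# Alice prefers email over phone support
```

The near-duplicate second candidate scores well on relevance alone, but it adds little once the first is selected, so the budget goes to the item that contributes new information.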

Governance

ABAC·multi-tenant isolation·customer key encryption·retention policies·audit·provenance

Native to the substrate, not a layer bolted on. Every read and write is policy-gated for access and provenance; retention runs across the data lifecycle.
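To make "policy-gated" and "retention with legal hold" concrete, the following Python sketch shows an attribute-based read check and a retention test. It is a simplified illustration under stated assumptions, not Zep's policy engine: `Record`, `Principal`, `can_read`, and `is_expired` are hypothetical names, and the attributes (tenant, sensitivity, clearances) are chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    # Hypothetical stored item with the attributes a policy can inspect.
    tenant: str
    sensitivity: str  # e.g. "public" or "restricted"
    created_at: datetime
    legal_hold: bool = False


@dataclass(frozen=True)
class Principal:
    # Hypothetical caller identity with its own attributes.
    tenant: str
    clearances: frozenset[str]


def can_read(who: Principal, rec: Record) -> bool:
    """Attribute-based check: same tenant, and clearance covers sensitivity."""
    return who.tenant == rec.tenant and rec.sensitivity in who.clearances


def is_expired(rec: Record, now: datetime, retention: timedelta) -> bool:
    """Past the retention window and purgeable, unless a legal hold applies."""
    return not rec.legal_hold and now - rec.created_at > retention


now = datetime(2025, 1, 1)
retention = timedelta(days=365)
old = Record("acme", "restricted", datetime(2023, 1, 1))
held = Record("acme", "restricted", datetime(2023, 1, 1), legal_hold=True)

agent = Principal("acme", frozenset({"public"}))
auditor = Principal("acme", frozenset({"public", "restricted"}))
outsider = Principal("globex", frozenset({"public", "restricted"}))

print(can_read(agent, old), can_read(auditor, old), can_read(outsider, old))
# False True False
print(is_expired(old, now, retention), is_expired(held, now, retention))
# True False
```

The decision depends on attributes of both the caller and the record rather than on a fixed role list, and the legal hold overrides the retention clock without changing the record's age.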

Context Graph Engine

entities·facts & edges·decision traces·episodes

Temporal context graph with provenance — sub-200ms retrieval at scale.

Benchmarks

Accuracy, latency, and token efficiency — published

Zep's results on two widely used long-term memory benchmarks, each measured with a single retrieval call and no agentic loops.

  • LoCoMo · 1,540 questions — 94.7% accuracy, 87ms retrieval at p50, 5,760 tokens per query.
  • LongMemEval · 500 questions — 90.2% accuracy, 104ms retrieval at p50, 4,408 tokens per query.
  • Cognee publishes its own evaluation figure, which is measured on a different task and is not directly comparable to LoCoMo. See the full methodology and results.

How they compare

Cognee vs. Zep, side by side

| | Cognee | Zep |
|---|---|---|
| Delivery | Open-source ECL toolkit, self-hosted and operated | Managed Context Lake: one runtime, one SDK |
| Data sources | Point at your own graph + vector backends | Chat, JSON, events, documents, business data via one SDK, unified per subject |
| Entities & schema | Auto-generated ontologies you then correct | Custom entities and edges, your schema enforced at ingest |
| Temporal model | Not bi-temporal at the data model | Bi-temporal facts (valid-from / valid-to), point-in-time queries |
| Governance | Assemble it yourself | ABAC, retention with legal hold, audit, all in the substrate |
| Deployment | Self-host the stack | Managed, BYOK, or BYOC (AWS / GCP / Azure) |
| Benchmarks | Own eval (not comparable to LoCoMo) | 94.7% LoCoMo (87ms), 90.2% LongMemEval (104ms) |
| Scale | You size and operate the stack | Millions of context graphs, sub-200ms at scale |

When to choose

Pick the tool that fits the team

Stay with Cognee when

You want a fully open-source core and you're prepared to assemble and operate the stack.

  • You want a fully open-source core and maximum control over graph and vector backends
  • Assembling and operating your own memory stack is acceptable — or preferred
  • You're running a single-developer or early-stage project where governed scale is not yet a priority

Choose Zep when you need

Agent memory served as a managed runtime, not a stack you host and operate.

  • Bi-temporal facts with point-in-time queries built into the data model
  • Business data integrated alongside chat — CRM, support, billing, events
  • Custom entities and relationships, with your schema enforced
  • Retrieval that holds at sub-200ms across millions of subjects
  • Entity-level governance (ABAC, retention, audit), plus SOC 2 Type II, HIPAA, BYOC

Get started

Ready to run agent memory in production?

FAQ

Frequently asked questions

Is Cognee a good agent-memory option?

Cognee is an open-source ECL (extract, cognify, load) pipeline you point at your own graph and vector backends, then host and operate. If you want a fully open-source core and maximum control over those backends, it's a fit. If you need agent memory served as a managed runtime — with bi-temporal facts, entity-level governance, and sub-200ms retrieval at scale — evaluate Zep.

Does Zep replace the graph and vector stores Cognee assembles?

Yes. Zep is a managed Context Lake — one runtime and one SDK. The graph, vector, and BM25 indexes are held and served for you, so there is no cluster of backends to size, shard, and keep alive.

Can Zep run inside my own environment?

Zep runs managed, with your own keys (BYOK), or fully inside your VPC (BYOC) on AWS, GCP, or Azure. The trust boundary moves with the deployment.