Is a Context Lake the same as a data lake?

No, and it doesn't replace one. A data lake serves analytics; a Context Lake serves agents. Different data, different consumers, different access patterns — same governance rigor. They're complementary.

Is a Context Lake just a vector database or a knowledge graph?

No. It's a governed system that manages many temporal context graphs and serves them at scale with sub-200ms retrieval, access control, retention, and audit. A single vector store or graph database is one component of that system, not the governed substrate itself.

How is a Context Lake different from agent memory?

Agent memory is the category — everything an agent knows over time. A Context Lake is the infrastructure that implements agent memory at enterprise scale.

What powers Zep's Context Lake?

The Context Graph Engine, the proprietary runtime, powers it, with open-source Graphiti constructing the graphs on top of that runtime.

How does a Context Lake scale to millions of users?

With tiered storage and a hot-graph strategy: only active graphs sit in memory; the rest are snapshotted to cheap object storage and rehydrated in milliseconds. Retrieval latency stays roughly constant as the graph count grows.
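The hot-graph strategy described above can be sketched as a least-recently-used cache in front of a snapshot store. This is a minimal illustration, not Zep's implementation: the `HotGraphCache` class, the dict standing in for object storage, the JSON snapshot format, and the LRU eviction rule are all assumptions made for the example.

```python
from collections import OrderedDict
import json


class HotGraphCache:
    """Hypothetical sketch: keep at most `capacity` graphs in memory, snapshot the rest."""

    def __init__(self, capacity, cold_store):
        self.capacity = capacity
        self.cold_store = cold_store   # stand-in for object storage: graph_id -> bytes
        self.hot = OrderedDict()       # graph_id -> in-memory graph (a plain dict here)

    def get(self, graph_id):
        if graph_id in self.hot:
            self.hot.move_to_end(graph_id)   # mark as most recently used
            return self.hot[graph_id]
        # Cold path: rehydrate from a snapshot, or start an empty graph.
        blob = self.cold_store.get(graph_id)
        graph = json.loads(blob) if blob is not None else {"facts": []}
        self._admit(graph_id, graph)
        return graph

    def _admit(self, graph_id, graph):
        self.hot[graph_id] = graph
        while len(self.hot) > self.capacity:
            # Evict the least recently used graph and snapshot it to cold storage.
            victim_id, victim = self.hot.popitem(last=False)
            self.cold_store[victim_id] = json.dumps(victim).encode("utf-8")


cold = {}
cache = HotGraphCache(capacity=2, cold_store=cold)
cache.get("user-a")["facts"].append("prefers email")
cache.get("user-b")
cache.get("user-c")                        # pushes user-a out to the cold store
evicted = "user-a" in cold
restored = cache.get("user-a")["facts"]    # rehydrated from the snapshot
```

Memory use in this sketch is bounded by `capacity` rather than by the total number of graphs, which is the property the answer above relies on.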

How is cost controlled at that scale?

Cost tracks active graphs, not total graphs — cold graphs live on inexpensive object storage, so a large deployment with a small hot fraction pays mainly for the active set.
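A small worked calculation makes the shape of that claim concrete. The function below and its unit prices are placeholders invented for illustration (a hot graph is arbitrarily priced at 1,000 times a cold one); they are not Zep pricing, and the outcome depends entirely on that assumed ratio.

```python
def monthly_cost(total_graphs, active_graphs, hot_unit_cost, cold_unit_cost):
    """Price the in-memory tier and the object-storage tier separately.

    Unit costs are abstract integer units, not real prices.
    """
    cold_graphs = total_graphs - active_graphs
    hot = active_graphs * hot_unit_cost
    cold = cold_graphs * cold_unit_cost
    return {"hot": hot, "cold": cold, "total": hot + cold}


# Same active set, ten times as many total graphs (placeholder numbers).
small = monthly_cost(1_000_000, 50_000, hot_unit_cost=1000, cold_unit_cost=1)
large = monthly_cost(10_000_000, 50_000, hot_unit_cost=1000, cold_unit_cost=1)
growth = large["total"] / small["total"]
```

Under these assumed prices, growing the deployment tenfold while the active set stays at 50,000 graphs raises the total by roughly 18%, because the hot tier dominates both bills.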

Can I run a Context Lake in my own environment?

Yes. It can run as a managed cloud service, with your own encryption keys (BYOK), or inside your own VPC (BYOC). The trust boundary moves with the deployment.

What Is a Context Lake? Enterprise Agent Memory

Agent Runtime

LangChain · LlamaIndex · CrewAI · Google ADK · custom

Any agent framework — or none. The Context Lake is invoked through a single SDK.

Ingestion

chat · JSON · documents · app events

Raw signal arrives from any source the agent touches.

Context Assembly

context blocks · templates · token-efficient

Relevant context is assembled on demand into token-efficient blocks.

Graphiti

entity extraction · relationships · ontology · invalidation

Signal becomes a temporal context graph as new facts arrive and stale ones are invalidated.
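One way to picture invalidation is a fact record with a validity interval, where a newer conflicting fact closes the older one instead of deleting it. The sketch below is not Graphiti's API: the `Fact` and `TemporalGraph` names, the field names, and the rule that each subject and predicate pair holds a single live value are assumptions for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None   # None means the fact is still live


class TemporalGraph:
    """Hypothetical store that invalidates stale facts rather than deleting them."""

    def __init__(self):
        self.facts = []

    def add_fact(self, subject, predicate, obj, at):
        for f in self.facts:
            if (f.subject, f.predicate) != (subject, predicate) or f.invalid_at is not None:
                continue
            if f.obj == obj:
                return            # already known and still live; nothing to do
            f.invalid_at = at     # conflicting live fact: close its interval
        self.facts.append(Fact(subject, predicate, obj, valid_at=at))

    def as_of(self, when):
        """Facts that were live at the given moment."""
        return [
            f for f in self.facts
            if f.valid_at <= when and (f.invalid_at is None or when < f.invalid_at)
        ]


g = TemporalGraph()
g.add_fact("alice", "works_at", "Acme", datetime(2023, 1, 1))
g.add_fact("alice", "works_at", "Globex", datetime(2024, 6, 1))
current = [f.obj for f in g.as_of(datetime(2024, 7, 1))]
earlier = [f.obj for f in g.as_of(datetime(2023, 6, 1))]
```

Because the older fact is kept with a closed interval, the graph can answer both "what is true now" and "what was believed at an earlier point".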

Retrieval

sub-200ms · auto-optimized · provenance-linked · policy-filtered

Selects what's relevant and what adds the most information within the token budget.
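The selection step can be approximated with a greedy heuristic that trades relevance against redundancy while staying inside a token budget. The source does not say how retrieval is actually optimized; this sketch substitutes a simple maximal-marginal-relevance style rule, and the whitespace tokenizer, Jaccard overlap, and `redundancy_weight` parameter are all assumptions.

```python
def tokens(text):
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def overlap(a, b):
    # Jaccard similarity over lowercase word sets.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0


def select_context(candidates, budget, redundancy_weight=0.7):
    """candidates: list of (text, relevance). Returns chosen texts in pick order."""
    chosen, used = [], 0
    remaining = list(candidates)

    def gain(item):
        text, relevance = item
        # Penalize items that repeat what has already been selected.
        penalty = max((overlap(text, c) for c in chosen), default=0.0)
        return relevance - redundancy_weight * penalty

    while remaining:
        fitting = [c for c in remaining if used + tokens(c[0]) <= budget]
        if not fitting:
            break
        best = max(fitting, key=gain)
        if gain(best) <= 0:
            break
        chosen.append(best[0])
        used += tokens(best[0])
        remaining.remove(best)
    return chosen


picked = select_context(
    [
        ("Alice prefers email over phone", 0.9),
        ("Alice prefers email over calls", 0.85),
        ("Alice renewed her contract in March", 0.6),
    ],
    budget=11,
)
```

In this run the near-duplicate second candidate loses to the less relevant but more informative third one, so the budget is spent on two distinct facts.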

Governance

ABAC · multi-tenant isolation · customer key encryption · retention policies · audit · provenance

Native to the substrate, not a layer bolted on. Every read and write is policy-gated for access and provenance; retention runs across the data lifecycle.
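A policy gate of this kind can be sketched as a store whose reads and writes both pass through the same attribute check and leave an audit record. The `GovernedStore` class, the two policy functions, and the attribute names below are hypothetical and only illustrate the pattern; they are not the product's API.

```python
def same_tenant(subject, action, resource):
    # Tenant isolation: the caller and the record must share a tenant.
    return subject.get("tenant") == resource.get("tenant")


def support_is_read_only(subject, action, resource):
    # Callers with the "support" role may read but not write.
    return subject.get("role") != "support" or action == "read"


class GovernedStore:
    """Hypothetical store where every read and write is policy-gated and audited."""

    def __init__(self, policies):
        self.policies = policies   # every policy must allow the request
        self.records = {}          # key -> {"attrs": ..., "value": ..., "written_by": ...}
        self.audit_log = []

    def _gate(self, subject, action, key, resource_attrs):
        # Deny by default when no policies are configured.
        allowed = bool(self.policies) and all(
            p(subject, action, resource_attrs) for p in self.policies
        )
        self.audit_log.append(
            {"subject": subject.get("id"), "action": action, "key": key, "allowed": allowed}
        )
        if not allowed:
            raise PermissionError(f"{action} on {key!r} denied")

    def write(self, subject, key, value, attrs):
        self._gate(subject, "write", key, attrs)
        # Record who wrote the value so provenance travels with the data.
        self.records[key] = {"attrs": attrs, "value": value, "written_by": subject.get("id")}

    def read(self, subject, key):
        record = self.records[key]
        self._gate(subject, "read", key, record["attrs"])
        return record["value"]


store = GovernedStore([same_tenant, support_is_read_only])
agent = {"id": "agent-1", "tenant": "acme", "role": "agent"}
support = {"id": "sup-1", "tenant": "acme", "role": "support"}
outsider = {"id": "agent-9", "tenant": "globex", "role": "agent"}

store.write(agent, "fact:1", "Alice prefers email", {"tenant": "acme"})
seen_by_support = store.read(support, "fact:1")
try:
    store.read(outsider, "fact:1")
    outsider_denied = False
except PermissionError:
    outsider_denied = True
```

Denied requests are logged as well as allowed ones, so the audit trail covers every access attempt rather than only the successful ones.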

Context Graph Engine

entities · facts & edges · decision traces · episodes

Temporal context graph with provenance — sub-200ms retrieval at scale.

	Data lake	Context Lake
Holds	Raw + processed business data	Context graphs (what agents know)
Consumers	BI tools, analysts, ML pipelines	AI agents
Access pattern	Batch + query	Millisecond retrieval at run time
Shared trait	Governed at the substrate	Governed at the substrate

Key takeaways

The problem it solves

The data-lake parallel

Data lake	Context Lake
Structured, quantitative	Unstructured, qualitative
SQL, batch analytics	Graph traversal, semantic
Seconds-to-minutes	Sub-200ms retrieval
Dashboards & ML	Agents & assistants
Row & column ACLs	Entity-level ABAC

What a Context Lake is made of

How it manages, governs, and serves

How it serves millions of graphs at low latency

Deployment: the trust boundary moves with you

Who it's for

Frequently asked questions