Agent Runtime
Any agent framework — or none. The Context Lake is invoked through a single SDK.
Vertex AI Memory Bank is Google's managed memory service for agents in Vertex AI Agent Engine. Zep is a neutral, multi-LLM, multi-cloud Context Lake that manages, governs, and serves agent memory on temporal context graphs.
What Vertex AI Memory Bank is. Memory Bank is part of Google's Vertex AI Agent Engine. It extracts and stores memories for agents built on Vertex, and surfaces them back at run time, integrated with the Google Cloud agent stack. For teams standardized on Vertex/GCP, the native integration is the draw.
What Zep is. Zep is a dedicated, neutral memory layer — the Context Lake for AI agents. It builds bi-temporal context graphs from chat, business data, and documents (open-source Graphiti on Zep's Context Graph Engine), serves token-efficient context in sub-200ms p95, and deploys as managed cloud, BYOK, or BYOC on the cloud you choose. It's model- and framework-agnostic.
Raw signal arrives from any source the agent touches.
Relevant context is assembled on demand into token-efficient blocks.
Signal becomes a temporal context graph as new facts arrive and stale ones are invalidated.
Retrieval selects what's relevant and what adds the most information within the token budget.
Native to the substrate, not a layer bolted on. Every read and write is policy-gated for access and provenance; retention runs across the data lifecycle.
Temporal context graph with provenance — sub-200ms retrieval at scale.
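
The selection idea above (keep what is relevant, prefer what adds new information, stay inside a token budget) can be illustrated with a small greedy sketch. This is not Zep's implementation or SDK. `Candidate`, `estimate_tokens`, and `select_context` are hypothetical names, the relevance scores are assumed to come from some upstream retriever, and word counts stand in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    relevance: float  # hypothetical score in [0, 1] from an upstream retriever


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def select_context(candidates: list[Candidate], budget: int) -> list[Candidate]:
    """Greedily pick candidates by relevance x novelty until the budget is spent."""
    chosen: list[Candidate] = []
    seen_words: set[str] = set()
    remaining = list(candidates)
    used = 0

    def gain(c: Candidate) -> float:
        words = set(c.text.lower().split())
        if not words:
            return 0.0
        # Novelty: share of this candidate's words not already in the context.
        novelty = len(words - seen_words) / len(words)
        return c.relevance * novelty

    while remaining:
        # Only consider candidates that still fit in the remaining budget.
        fitting = [c for c in remaining if used + estimate_tokens(c.text) <= budget]
        if not fitting:
            break
        best = max(fitting, key=gain)
        if gain(best) <= 0:
            break  # everything left is redundant
        chosen.append(best)
        seen_words |= set(best.text.lower().split())
        used += estimate_tokens(best.text)
        remaining.remove(best)
    return chosen
```

In this sketch a near-duplicate of an already selected fact scores zero novelty and is skipped, even if its relevance score is high. A production system would likely use embeddings rather than word overlap to measure redundancy.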
| | Vertex AI Memory Bank | Zep |
|---|---|---|
| Ecosystem | Bound to Google Cloud / Vertex | Neutral — any model, any cloud |
| Model providers | Google-centric (Gemini) | OpenAI, Anthropic, Meta, Gemini, others |
| Memory model | Memories extracted from session history (Agent Engine Sessions + Memory Bank) | Bi-temporal context graph (provenance + validity) |
| Temporal reasoning | No — extraction-based; no temporal graph | “What's true now / what was true then,” auto fact invalidation |
| Deployment | GCP / Vertex | Managed, BYOK, or BYOC (AWS/GCP/Azure) |
| Benchmarks | — | 94.7% LoCoMo (155ms), 90.2% LongMemEval (162ms) |
| Lock-in risk | Higher (ecosystem-bound) | Lower (portable across stacks) |
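
To make the "what's true now / what was true then" row concrete, here is a minimal sketch of temporal facts with invalidation. It is an illustration under stated assumptions, not Zep's or Graphiti's data model or API. `Fact`, `FactStore`, and all field names are hypothetical, and the sketch is simplified: it records when a fact was learned (`recorded_at`) but queries only on validity time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: datetime                 # when the fact became true in the world
    recorded_at: datetime                # when the system learned it (second time axis)
    valid_to: Optional[datetime] = None  # when it stopped being true; None = still true


class FactStore:
    """Hypothetical in-memory store: new facts invalidate old ones, nothing is deleted."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def add(self, fact: Fact) -> None:
        # Close any open fact for the same subject and predicate instead of
        # overwriting it, so history stays queryable.
        for old in self.facts:
            same_key = (old.subject, old.predicate) == (fact.subject, fact.predicate)
            if same_key and old.valid_to is None and old.valid_from <= fact.valid_from:
                old.valid_to = fact.valid_from
        self.facts.append(fact)

    def as_of(self, subject: str, predicate: str, when: datetime) -> Optional[str]:
        # Return the value that was valid at `when`, or None if nothing was.
        for f in self.facts:
            if (f.subject, f.predicate) != (subject, predicate):
                continue
            if f.valid_from <= when and (f.valid_to is None or when < f.valid_to):
                return f.value
        return None


store = FactStore()
store.add(Fact("alice", "lives_in", "Paris",
               valid_from=datetime(2022, 1, 1), recorded_at=datetime(2022, 1, 5)))
store.add(Fact("alice", "lives_in", "Berlin",
               valid_from=datetime(2024, 3, 1), recorded_at=datetime(2024, 3, 2)))
```

With this store, `store.as_of("alice", "lives_in", datetime(2023, 6, 1))` returns the earlier value and a query dated after March 2024 returns the later one. The superseded fact is kept with a closed validity interval, which is what allows "what was true then" questions.
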
The same logic S&P Global Market Intelligence flagged for hyperscaler primitives applies here: your agents' memory is among the most valuable and sticky data you own. A bundled memory service keeps it inside one cloud's ecosystem. A neutral layer lets you keep one consistent context strategy across Gemini, OpenAI, Anthropic, and Meta — and move between clouds without re-platforming your memory.
Choose Vertex AI Memory Bank if you're standardized on Google Cloud and Vertex, use Gemini as your primary model, and the bundled memory meets your needs.
Choose Zep if you want neutrality across models and clouds, with governed memory at scale.
If you're committed to Vertex/GCP and Gemini and need managed memory, Vertex AI Memory Bank can be the right choice. For neutrality, temporal reasoning, provenance, and portable governance, evaluate a dedicated layer like Zep.
Zep runs on GCP: managed, with your own keys, or inside your own VPC. AWS and Azure are supported as well.
Zep is model-agnostic and works with Gemini as well as OpenAI, Anthropic, and others — so memory isn't tied to one model provider.