Skip to main content

A Transparent Proxy

Memory Router sits between your application and the model provider. Your app talks to the Router exactly as it would talk to the provider — same payload, same SDK, same response shape — and the Router adds a memory layer transparently on the way through. Because the Router speaks your provider’s protocol, integrating it is a base-URL change. Nothing else in your code needs to move.

The Four Steps

1

Intercept

Your app sends the request to Memory Router instead of the provider — same payload, same SDK, same response shape.
2

Optimize context

The Router trims redundant history, searches prior memories, and injects only the relevant context into the prompt.
3

Forward

The enhanced request goes to the model — your own provider (BYOK) or a MemoryLake-hosted model. Tokens in are lower than a raw replay.
4

Remember

New memories are extracted and stored asynchronously in the background — the response is never delayed.

Why It Saves Tokens

Instead of replaying the entire conversation each turn, the Router removes redundant context and injects only the relevant memories. As conversations grow, fewer tokens are sent per call — and you never hit the context-window ceiling that forces mid-task truncation.
Memory extraction and storage happen asynchronously. The model’s response is returned as soon as it is ready; writing new memories never adds latency to the call.

Shared Memory Pool

The Router and the MemoryLake API operate on the same memory pool. Anything stored through the Router is retrievable through the API, and vice versa — one source of truth across sessions, apps, and models.

Next Steps

Quickstart

Go live in three steps.

Deployment Modes

BYOK vs. MemoryLake-hosted, base URLs, and supported providers.