Google PM Launches Open-Source LLM-Driven Persistent Memory Agent

Shubham Saboo, a senior AI product manager at Google, has unveiled a project focused on persistent memory in agent design. This week he launched an open-source “Always On Memory Agent” on Google Cloud Platform’s GitHub page under the MIT License, which permits commercial use. The agent is built on Google’s Agent Development Kit (ADK), introduced in spring 2025, and runs on the Gemini 3.1 Flash-Lite model, released on March 3, 2026 and positioned as the fastest, most cost-efficient variant in the Gemini series.

Overview of the Always On Memory Agent

The “Always On Memory Agent” serves as a practical reference for an agent system capable of continuous information intake and memory retrieval without traditional vector databases. This innovation is significant for enterprise developers seeking a new direction in agent infrastructure.

Key Features and Architecture

  • Built to run continuously, ingesting content from various sources.
  • Stores structured memories in SQLite and carries out memory consolidation every 30 minutes.
  • Includes local HTTP API and Streamlit dashboard for easier interface management.
  • Supports multiple content formats: text, image, audio, video, and PDF.
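The repository's actual schema is not shown in the announcement, but the SQLite storage and periodic consolidation described above might look roughly like this sketch; all table and function names are illustrative assumptions:

```python
import sqlite3
import time

# Illustrative schema -- not the repository's actual layout.
def init_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS memories (
            id INTEGER PRIMARY KEY,
            topic TEXT,
            content TEXT,
            created_at REAL,
            consolidated INTEGER DEFAULT 0
        )
    """)
    return conn

def ingest(conn, topic, content):
    """Continuous intake: each observation becomes a structured row."""
    conn.execute(
        "INSERT INTO memories (topic, content, created_at) VALUES (?, ?, ?)",
        (topic, content, time.time()),
    )
    conn.commit()

def consolidate(conn):
    """Merge unconsolidated rows per topic into a single summary row.
    The real agent would have the LLM write the summary (run e.g. every
    30 minutes); here we simply concatenate for illustration."""
    rows = conn.execute(
        "SELECT topic, GROUP_CONCAT(content, ' | ') FROM memories "
        "WHERE consolidated = 0 GROUP BY topic"
    ).fetchall()
    for topic, merged in rows:
        conn.execute(
            "INSERT INTO memories (topic, content, created_at, consolidated) "
            "VALUES (?, ?, ?, 1)",
            (topic, merged, time.time()),
        )
    conn.execute("DELETE FROM memories WHERE consolidated = 0")
    conn.commit()
```

In the announced design, the consolidation summary would be produced by the model rather than by string concatenation; the schema-plus-periodic-merge shape is the point of the sketch.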

Saboo’s design proposition, “No vector database. No embeddings. Just an LLM that reads, thinks, and writes structured memory,” aims to attract developers concerned about cost and operational complexity. This approach removes the need for a traditional retrieval stack, with its often cumbersome embedding and indexing pipeline.
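The “reads” half of that proposition replaces similarity search with direct reading: rather than embedding a query and ranking vectors, the agent can place its stored memory rows straight into the model’s context. A minimal sketch, in which the function name and prompt wording are assumptions rather than the project’s actual code:

```python
def build_recall_prompt(memories, question):
    """No embeddings, no vector index: hand the model the raw memory
    rows and let it reason over them directly (the 'LLM reads' step).
    `memories` is a list of memory strings, e.g. pulled from SQLite."""
    lines = "\n".join(f"- {m}" for m in memories)
    return (
        "You are an agent with the following stored memories:\n"
        f"{lines}\n\n"
        f"Answer using only these memories: {question}"
    )
```

The trade-off is that every recall spends input tokens on the memory rows themselves, which is exactly why the economics of the underlying model matter.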

Performance of Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite plays a crucial role in making the always-on model economically viable. It is priced at $0.25 per million input tokens and $1.50 per million output tokens. Flash-Lite is reported to run 2.5 times faster than its predecessor, Gemini 2.5 Flash, while also achieving higher output quality:
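At those quoted prices, a back-of-the-envelope calculation shows why an always-on loop is affordable; the per-run token counts below are assumed for illustration, not measured from the project:

```python
# Quoted Gemini 3.1 Flash-Lite pricing (USD per million tokens).
INPUT_PER_M = 0.25
OUTPUT_PER_M = 1.50

def estimate_cost(input_tokens, output_tokens):
    """Dollar cost of one model call at the quoted rates."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Assumed workload: a consolidation pass every 30 minutes (48 runs/day),
# each reading ~20k tokens of memories and writing ~2k tokens of summary.
daily_cost = 48 * estimate_cost(20_000, 2_000)  # well under a dollar a day
```

Under those assumptions the continuous consolidation loop costs cents per day, which is the economic argument behind pairing persistent memory with a low-cost model.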

  • Elo score of 1432 on Arena.ai
  • 86.9% on GPQA Diamond
  • 76.8% on MMMU Pro

This model’s attributes align well with high-frequency tasks such as translation and workflow automation. Its capability to maintain low inference costs while offering predictable latency reinforces the appeal of persistent memory solutions.

Governance and Operational Challenges

As the enterprise landscape evolves, persistent memory raises governance questions. Stakeholders have voiced compliance concerns about agents operating with memory that continuously merges and evolves without clear boundaries.

  • Franck Abe highlighted potential compliance risks with continuous memory retention.
  • Others pointed out liabilities concerning operational management of persistent systems.
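One concrete way to give an ever-growing memory store a boundary is an explicit retention window enforced at the storage layer. This sketch is an assumption about how such a policy could be bolted onto the SQLite store described earlier (a `memories` table with a `created_at` column and an illustrative 30-day window), not a feature the project is documented to include:

```python
import sqlite3
import time

RETENTION_DAYS = 30  # illustrative policy value, not from the project

def purge_expired(conn, now=None):
    """Delete memories older than the retention window, giving the
    continuously merging store an explicit compliance boundary.
    Returns the number of rows removed."""
    now = now if now is not None else time.time()
    cutoff = now - RETENTION_DAYS * 86400
    cur = conn.execute("DELETE FROM memories WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

A scheduled purge like this addresses retention, but not the harder question raised above: summaries produced by consolidation may still carry information derived from rows that have since been deleted.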

As persistent memory systems become integral to enterprise AI, discussions around governance will become increasingly important alongside their capabilities. Developers will need to balance the simplicity of the architecture with compliance and retrieval issues.

Conclusion: A New Frontier for Agent Infrastructure

Shubham Saboo’s launch of the Always On Memory Agent presents a substantial step forward in the evolution of AI agents. This initiative provides a foundation for future enhancements in governance and operational structure within enterprise settings. As teams transition from isolated applications to comprehensive systems capable of persistent memory, the focus will turn to how safely and effectively these systems can operate in production environments. The true impact of this release will hinge on its ability to address both functionality and governance concerns.