Gemini Embedding 2 Launches In Public Preview As Google Expands Multimodal AI Search
Google has launched Gemini Embedding 2 in public preview, introducing what it describes as its first natively multimodal embedding model for developers who need search, retrieval and recommendation systems that work across more than just text.
The new model, released on March 10, gives developers a way to map text, images, video, audio and PDFs into a single embedding space. That means a system can compare different media types more directly, a step that matters for companies building cross-media search, document retrieval and classification tools.
What Gemini Embedding 2 Does
Embeddings turn content into numerical vectors that software can compare for similarity. In practical terms, that helps applications find related documents, rank search results, surface recommendations or retrieve relevant context for AI systems.
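The comparison step is typically a cosine similarity between vectors. The sketch below uses tiny made-up four-dimensional vectors to show the idea; a real embedding model such as the one described here would output vectors with hundreds or thousands of dimensions, and the document names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Compare two embedding vectors; values near 1.0 mean similar content."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real model output.
doc_about_cats = [0.9, 0.1, 0.0, 0.2]
doc_about_tax_law = [0.0, 0.1, 0.9, 0.7]

# An embedded search query about pets scores closer to the pet document.
query = [0.85, 0.15, 0.05, 0.25]
print(cosine_similarity(query, doc_about_cats) >
      cosine_similarity(query, doc_about_tax_law))
```

The same scoring function drives ranking, recommendation and retrieval: whichever stored vectors score highest against the query vector are returned first.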
What changes with Gemini Embedding 2 is the multimodal scope. Instead of generating embeddings only for text, the new model is built to handle multiple input types inside one shared representation layer. That allows developers to search across formats rather than maintaining separate systems for text, images or other media.
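A shared representation layer means one index can hold vectors for every media type, so a single query ranks them all together. The sketch below illustrates that structure with hand-made toy vectors and hypothetical file names; in practice a multimodal model would generate the vectors from the actual files.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# One index across media types -- no separate systems per format.
# Vectors are toy values standing in for real multimodal embeddings.
index = [
    ("text",  "quarterly-report.txt", [0.9, 0.1, 0.1]),
    ("image", "org-chart.png",        [0.2, 0.9, 0.1]),
    ("audio", "earnings-call.mp3",    [0.8, 0.2, 0.2]),
]

def search(query_vec, k=2):
    """Rank every item, regardless of media type, against the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[2]),
                    reverse=True)
    return [(media, name) for media, name, _ in ranked[:k]]

# A query about financial results surfaces a text file and an audio clip.
print(search([0.85, 0.15, 0.15]))
```

The point of the shared space is exactly this: the search function never needs to know which format an item came from.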
For enterprise and developer use cases, that could be especially useful in large archives where important information is spread across scanned files, visual assets, recorded media and traditional documents.
Why This Release Matters Now
The launch reflects a broader shift in AI development away from text-only retrieval and toward systems that can understand mixed-media data. Many modern datasets no longer live in a single format, and retrieval systems are increasingly expected to match a user’s query against screenshots, slides, PDFs, audio clips and video segments as well as written text.
That makes embedding models a critical layer in the AI stack, particularly for retrieval-augmented generation, recommendation engines and semantic search. By moving into multimodal embeddings, Google is signaling that developers should treat cross-format search as a mainstream capability rather than a niche one.
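In retrieval-augmented generation, the embedding layer picks which stored content is prepended to the generating model's prompt. A minimal sketch of that flow, with made-up two-dimensional vectors and hypothetical knowledge-base text, looks like this:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Knowledge-base chunks paired with toy embedding vectors.
chunks = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1]),
    ("The office is closed on public holidays.",      [0.1, 0.9]),
]

def build_prompt(question, question_vec, k=1):
    """Retrieve the k most similar chunks and prepend them as context."""
    top = sorted(chunks, key=lambda c: cosine(question_vec, c[1]),
                 reverse=True)[:k]
    context = "\n".join(text for text, _ in top)
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How long do refunds take?", [0.8, 0.2]))
```

If the embedding step retrieves the wrong chunk, the generating model answers from the wrong context, which is why this layer is described here as critical infrastructure rather than a side feature.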
The public preview timing also suggests Google wants developers to start testing real-world workloads before wider production adoption.
Public Preview Status And Current Availability
For now, Gemini Embedding 2 is available as a preview model rather than a generally available product. In Google’s release notes, the model is listed as gemini-embedding-2-preview.
That preview label matters. It usually means features, performance characteristics and production guidance can still evolve before a full release. Developers evaluating it now are likely to focus on relevance quality, latency, cost and how well a single embedding space holds up across very different content types.
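Relevance quality is usually measured with metrics like recall@k against a hand-labeled set of relevant items per query. A minimal sketch, with hypothetical document IDs, shows the shape of such an evaluation:

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k results."""
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    return hits / len(relevant)

# One query's ranked results against a hand-labeled relevance set.
# IDs are illustrative; a cross-modal test set would mix media types.
retrieved = ["doc-3", "img-7", "pdf-1", "doc-9"]
relevant = {"doc-3", "pdf-1"}

print(recall_at_k(retrieved, relevant, k=3))  # prints 1.0: both hits in top 3
```

Running this kind of check separately per media pairing (text query against images, text against audio, and so on) is one way to see whether a single embedding space really holds up across very different content types.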
Even so, the launch is notable because it gives Google a clearer answer to rising demand for unified retrieval systems, especially from teams building AI applications on top of multimodal knowledge bases.
How It Fits Into Google’s Existing Embedding Push
Google had already been building out its embedding lineup for search and retrieval tasks. Earlier Gemini embedding models were centered on text and retrieval use cases. Gemini Embedding 2 expands that strategy into multimodal territory and does so at a moment when AI product teams are under pressure to make assistants and search tools work across everything users actually store.
That puts the model in an important product slot: not as a chatbot headline feature, but as infrastructure that can quietly determine whether an AI system retrieves the right information in the first place.
In many production AI applications, that layer matters as much as the model generating the final answer.
What Developers Will Watch Next
The next phase is likely to be less about the announcement itself and more about adoption signals. Developers will want to know how well Gemini Embedding 2 performs on cross-modal search benchmarks, how stable it is under large-scale workloads, and whether the preview model becomes a standard option inside broader Google AI and cloud tooling.
For now, the key development is clear: Google has moved its embedding strategy beyond text and into unified multimodal retrieval. For developers building search and recommendation systems around mixed media, Gemini Embedding 2 is one of the more consequential infrastructure launches of the week.