deepset-ai · anakin87 · Mar 10, 2026
@@ -31,6 +31,7 @@ These are the Embedders available in Haystack:
 | [FastembedSparseDocumentEmbedder](embedders/fastembedsparsedocumentembedder.mdx)                     | Enriches a list of documents with their sparse embeddings using the models supported by Fastembed.                                                                                                                                          |
 | [GoogleGenAITextEmbedder](embedders/googlegenaitextembedder.mdx)                                       | Embeds a simple string (such as a query) with a Google AI model. Requires an API key from Google.                                                                                                                                           |
 | [GoogleGenAIDocumentEmbedder](embedders/googlegenaidocumentembedder.mdx)                               | Embeds a list of documents with a Google AI model. Requires an API key from Google.                                                                                                                                                         |
+| [GoogleGenAIMultimodalDocumentEmbedder](embedders/googlegenaimultimodaldocumentembedder.mdx)           | Embeds a list of non-textual documents with a Google AI model. Requires an API key from Google.                                                                                                                                                         |
 | [HuggingFaceAPIDocumentEmbedder](embedders/huggingfaceapidocumentembedder.mdx)                       | Computes document embeddings using various Hugging Face APIs.                                                                                                                                                                               |
 | [HuggingFaceAPITextEmbedder](embedders/huggingfaceapitextembedder.mdx)                               | Embeds strings using various Hugging Face APIs.                                                                                                                                                                                             |
 | [JinaTextEmbedder](embedders/jinatextembedder.mdx)                                                   | Embeds a simple string (such as a query) with a Jina AI Embeddings model. Requires an API key from Jina AI.                                                                                                                                 |
@@ -56,4 +57,4 @@ These are the Embedders available in Haystack:
 | [VertexAITextEmbedder](embedders/vertexaitextembedder.mdx)                                             | Computes embeddings for text (such as a query) using models through VertexAI Embeddings API. **_This integration will be deprecated soon. We recommend using [GoogleGenAITextEmbedder](embedders/googlegenaitextembedder.mdx) integration instead._** |
 | [VertexAIDocumentEmbedder](embedders/vertexaidocumentembedder.mdx)                                     | Computes embeddings for documents using models through VertexAI Embeddings API. **_This integration will be deprecated soon. We recommend using [GoogleGenAIDocumentEmbedder](embedders/googlegenaidocumentembedder.mdx)  integration instead._**     |
 | [WatsonxTextEmbedder](embedders/watsonxtextembedder.mdx)                                               | Computes embeddings for text (such as a query) using IBM Watsonx models.                                                                                                                                                                    |
-| [WatsonxDocumentEmbedder](embedders/watsonxdocumentembedder.mdx)                                       | Computes embeddings for documents using IBM Watsonx models.                                                                                                                                                                                 |
+| [WatsonxDocumentEmbedder](embedders/watsonxdocumentembedder.mdx)                                       | Computes embeddings for documents using IBM Watsonx models.                                                                                                                                                                                 |
@@ -0,0 +1,196 @@
+---
+title: "GoogleGenAIMultimodalDocumentEmbedder"
+id: googlegenaimultimodaldocumentembedder
+slug: "/googlegenaimultimodaldocumentembedder"
+description: "`GoogleGenAIMultimodalDocumentEmbedder` computes the embeddings of a list of non-textual documents and stores the obtained vectors in the embedding field of each document."
+---
+
+# GoogleGenAIMultimodalDocumentEmbedder
+
+`GoogleGenAIMultimodalDocumentEmbedder` computes the embeddings of a list of non-textual documents and stores the obtained vectors in the embedding field of each document.
+It uses Google AI multimodal embedding models, which can embed text, images, videos, and audio into the same vector space.
+
+<div className="key-value-table">
+
+|  |  |
+| --- | --- |
+| **Most common position in a pipeline** | Before a [DocumentWriter](../writers/documentwriter.mdx) in an indexing pipeline |
+| **Mandatory init variables** | `api_key`: The Google API key. Can be set with `GOOGLE_API_KEY` or `GEMINI_API_KEY` env var. |
+| **Mandatory run variables** | `documents`: A list of documents, each with a meta field containing the path of the file to embed (for example, an image, a video, or a PDF) |
+| **Output variables** | `documents`: A list of documents (enriched with embeddings)  <br /> <br />`meta`: A dictionary of metadata |
+| **API reference** | [Google AI](/reference/integrations-google-genai) |
+| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/google_genai |
+
+</div>
+
+## Overview
+
+`GoogleGenAIMultimodalDocumentEmbedder` expects a list of documents containing a file path in a meta field. The meta field can be specified with the `file_path_meta_field` init parameter of this component.
+
+The embedder loads the files, computes their embeddings using a Google AI model, and stores each embedding in the `embedding` field of the corresponding document.
+
+`GoogleGenAIMultimodalDocumentEmbedder` is commonly used in indexing pipelines. At retrieval time, you need to use the same model with a `GoogleGenAITextEmbedder` to embed the query, before using an Embedding Retriever.
+
+This component is compatible with Gemini multimodal models: `gemini-2-embedding-preview` and later. For a complete list of supported models, see the [Google AI documentation](https://ai.google.dev/gemini-api/docs/embeddings).
+
+To embed a textual document, you should use the [`GoogleGenAIDocumentEmbedder`](googlegenaidocumentembedder.mdx).
+To embed a string, you should use the [`GoogleGenAITextEmbedder`](googlegenaitextembedder.mdx).
+
+To start using this integration with Haystack, install it with:
+
+```shell
+pip install google-genai-haystack
+```
+
+### Authentication
+
+Google Gen AI is compatible with both the Gemini Developer API and the Vertex AI API.
+
+To use this component with the Gemini Developer API and get an API key, visit [Google AI Studio](https://aistudio.google.com/).
+To use this component with the Vertex AI API, visit [Google Cloud > Vertex AI](https://cloud.google.com/vertex-ai).
+
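+By default, the component reads the API key from the `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable. Alternatively, you can pass an API key at initialization using a [Secret](../../concepts/secret-management.mdx) created with the `Secret.from_token` static method:
+
+```python
+from haystack.utils import Secret
+from haystack_integrations.components.embedders.google_genai import (
+    GoogleGenAIMultimodalDocumentEmbedder,
+)
+
+embedder = GoogleGenAIMultimodalDocumentEmbedder(
+    api_key=Secret.from_token("<your-api-key>"),
+)
+```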
+
+The following examples show how to use the component with the Gemini Developer API and the Vertex AI API.
+
+#### Gemini Developer API (API Key Authentication)
+
+```python
+from haystack_integrations.components.embedders.google_genai import (
+    GoogleGenAIMultimodalDocumentEmbedder,
+)
+
+# Set the environment variable (GOOGLE_API_KEY or GEMINI_API_KEY) before running
+embedder = GoogleGenAIMultimodalDocumentEmbedder()
+```
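
When relying on the environment variables instead of an explicit `api_key`, a script can check up front that at least one of them is set. The following is a minimal sketch using only the standard library; `configured_api_key_vars` is a hypothetical helper, not part of the integration, and it makes no claim about which variable the component prefers when both are set.

```python
import os

# The two variable names mentioned in this page.
API_KEY_ENV_VARS = ("GOOGLE_API_KEY", "GEMINI_API_KEY")


def configured_api_key_vars(environ=None):
    """Return the names of the API key variables that are set and non-empty.

    `environ` defaults to the process environment; passing a plain dict
    makes the helper easy to test.
    """
    env = os.environ if environ is None else environ
    return [name for name in API_KEY_ENV_VARS if env.get(name)]


if __name__ == "__main__":
    found = configured_api_key_vars()
    if not found:
        print("Set GOOGLE_API_KEY or GEMINI_API_KEY before creating the embedder.")
    else:
        print("Found:", ", ".join(found))
```

An empty result means the component would need an explicit `api_key` (or Vertex AI credentials) instead.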
+
+#### Vertex AI (Application Default Credentials)
+
+```python
+from haystack_integrations.components.embedders.google_genai import (
+    GoogleGenAIMultimodalDocumentEmbedder,
+)
+
+# Using Application Default Credentials (requires gcloud auth setup)
+embedder = GoogleGenAIMultimodalDocumentEmbedder(
+    api="vertex",
+    vertex_ai_project="my-project",
+    vertex_ai_location="us-central1",
+)
+```
+
+#### Vertex AI (API Key Authentication)
+
+```python
+from haystack_integrations.components.embedders.google_genai import (
+    GoogleGenAIMultimodalDocumentEmbedder,
+)
+
+# Set the environment variable (GOOGLE_API_KEY or GEMINI_API_KEY) before running
+embedder = GoogleGenAIMultimodalDocumentEmbedder(api="vertex")
+```
+
+## Usage
+
+### On its own
+
+Here is how you can use the component on its own. You'll need to pass in your Google API key via Secret or set it as an environment variable called `GOOGLE_API_KEY` or `GEMINI_API_KEY`.
+The examples below assume you've set the environment variable.
+
+```python
+from haystack import Document
+from haystack_integrations.components.embedders.google_genai import (
+    GoogleGenAIMultimodalDocumentEmbedder,
+)
+
+docs = [
+    Document(meta={"file_path": "path/to/image.jpg"}),
+    Document(meta={"file_path": "path/to/video.mp4"}),
+    Document(meta={"file_path": "path/to/pdf.pdf", "page_number": 1}),
+    Document(meta={"file_path": "path/to/pdf.pdf", "page_number": 3}),
+]
+
+document_embedder = GoogleGenAIMultimodalDocumentEmbedder()
+
+result = document_embedder.run(documents=docs)
+print(result["documents"][0].embedding)
+# [0.017020374536514282, -0.023255806416273117, ...]
+```
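
Because the component reads each file from the path stored in the document's meta, it can help to build the meta dictionaries programmatically and check the paths before calling `run`. The sketch below uses only the standard library. `build_page_metas` and `find_missing_files` are hypothetical helpers, and the `file_path` and `page_number` keys (with 1-based page numbers) simply follow the example above.

```python
from pathlib import Path


def build_page_metas(pdf_path: str, num_pages: int) -> list[dict]:
    """Return one meta dict per PDF page, numbering pages from 1."""
    return [
        {"file_path": pdf_path, "page_number": page}
        for page in range(1, num_pages + 1)
    ]


def find_missing_files(metas: list[dict], key: str = "file_path") -> list[str]:
    """Return the distinct paths found under `key` that are not files on disk."""
    missing: list[str] = []
    for meta in metas:
        path = meta.get(key)
        if path is None or path in missing:
            continue
        if not Path(path).is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    metas = build_page_metas("path/to/pdf.pdf", num_pages=3)
    print(metas)
    print("Missing files:", find_missing_files(metas))
```

Each resulting dictionary can then be wrapped as `Document(meta=meta)` and passed to the embedder.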
+
+### Setting embedding dimensions
+
+Models like `gemini-2-embedding-preview` have a default embedding dimension of 3072. Thanks to
+Matryoshka Representation Learning, you can reduce the embedding size while keeping similar performance.
+
+Check the [Google AI documentation](https://ai.google.dev/gemini-api/docs/embeddings#control-embedding-size) for more information.
+
+```python
+from haystack import Document
+
+from haystack_integrations.components.embedders.google_genai import (
+    GoogleGenAIMultimodalDocumentEmbedder,
+)
+
+docs = [Document(meta={"file_path": "path/to/image.jpg"})]
+
+doc_multimodal_embedder = GoogleGenAIMultimodalDocumentEmbedder(
+    config={"output_dimensionality": 768},
+)
+docs_with_embeddings = doc_multimodal_embedder.run(docs)["documents"]
+```
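
Reduced-size vectors are usually compared with cosine similarity or dot product. Google's embeddings documentation notes that, for some Gemini embedding models, only the full-size output is normalized to unit length. Assuming this also applies to the model you use (worth verifying in the linked documentation), a small helper such as the hypothetical `l2_normalize` below could re-normalize each vector before it is stored.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale `vector` to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


if __name__ == "__main__":
    reduced = [3.0, 4.0]  # stand-in for a reduced-size embedding
    unit = l2_normalize(reduced)
    print(unit, math.sqrt(sum(x * x for x in unit)))
```

If the document store is configured with cosine similarity, this step does not change the ranking, since cosine similarity ignores vector length; it matters mainly for dot-product similarity.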
+
+### In a pipeline
+
+In the following example, we look for a specific plot in the "Scaling Instruction-Finetuned Language Models" paper (PDF format).
+
+You first need to download the PDF file from https://arxiv.org/pdf/2210.11416.pdf.
+
+```python
+from haystack import Document
+from haystack import Pipeline
+from haystack.document_stores.in_memory import InMemoryDocumentStore
+from haystack_integrations.components.embedders.google_genai import (
+    GoogleGenAIMultimodalDocumentEmbedder,
+    GoogleGenAITextEmbedder,
+)
+from haystack.components.writers import DocumentWriter
+from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
+
+document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")
+
+paper_path = "2210.11416.pdf"
+
+documents = [
+    Document(meta={"file_path": paper_path, "page_number": i}) for i in range(1, 16)
+]
+
+indexing_pipeline = Pipeline()
+indexing_pipeline.add_component("embedder", GoogleGenAIMultimodalDocumentEmbedder())
+indexing_pipeline.add_component("writer", DocumentWriter(document_store=document_store))
+indexing_pipeline.connect("embedder", "writer")
+
+indexing_pipeline.run({"embedder": {"documents": documents}})
+
+query_pipeline = Pipeline()
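+# The query must be embedded with the same model used for the documents,
+# so that queries and documents share one vector space.
+query_pipeline.add_component("text_embedder", GoogleGenAITextEmbedder())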
+query_pipeline.add_component(
+    "retriever",
+    InMemoryEmbeddingRetriever(document_store=document_store),
+)
+query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
+
+query = "plot showing BBH accuracy"
+
+result = query_pipeline.run({"text_embedder": {"text": query}})
+
+print(result["retriever"]["documents"][0].meta)
+
+# {'file_path': '2210.11416.pdf', 'page_number': 9}
+```
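
The query pipeline works because the text query and the PDF pages are embedded into the same vector space, so the retriever can rank pages by their similarity to the query. The following toy sketch illustrates that ranking step with plain Python and made-up two-dimensional vectors. It is a conceptual illustration only, not the implementation of `InMemoryEmbeddingRetriever`, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_embedding: list[float],
    pages: list[tuple[dict, list[float]]],
    k: int = 1,
) -> list[tuple[float, dict]]:
    """Return the k (score, meta) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_embedding, emb), meta) for meta, emb in pages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Made-up embeddings standing in for two embedded PDF pages.
    pages = [
        ({"page_number": 1}, [1.0, 0.0]),
        ({"page_number": 9}, [0.0, 1.0]),
    ]
    print(top_k([0.1, 0.9], pages, k=1))
```

With real embeddings, the same idea applies at a much higher dimensionality, which is why the query embedder and the document embedder need to use the same model.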
@@ -282,6 +282,7 @@ export default {
             'pipeline-components/embedders/fastembedtextembedder',
             'pipeline-components/embedders/googlegenaidocumentembedder',
             'pipeline-components/embedders/googlegenaitextembedder',
+            'pipeline-components/embedders/googlegenaimultimodaldocumentembedder',
             'pipeline-components/embedders/huggingfaceapidocumentembedder',
             'pipeline-components/embedders/huggingfaceapitextembedder',
             'pipeline-components/embedders/jinadocumentembedder',

@@ -31,6 +31,7 @@ These are the Embedders available in Haystack:
 | [FastembedSparseDocumentEmbedder](embedders/fastembedsparsedocumentembedder.mdx)                     | Enriches a list of documents with their sparse embeddings using the models supported by Fastembed.                                                                                                                                          |
 | [GoogleGenAITextEmbedder](embedders/googlegenaitextembedder.mdx)                                       | Embeds a simple string (such as a query) with a Google AI model. Requires an API key from Google.                                                                                                                                           |
 | [GoogleGenAIDocumentEmbedder](embedders/googlegenaidocumentembedder.mdx)                               | Embeds a list of documents with a Google AI model. Requires an API key from Google.                                                                                                                                                         |
+| [GoogleGenAIMultimodalDocumentEmbedder](embedders/googlegenaimultimodaldocumentembedder.mdx)           | Embeds a list of non-textual documents with a Google AI model. Requires an API key from Google.                                                                                                                                           |
 | [HuggingFaceAPIDocumentEmbedder](embedders/huggingfaceapidocumentembedder.mdx)                       | Computes document embeddings using various Hugging Face APIs.                                                                                                                                                                               |
 | [HuggingFaceAPITextEmbedder](embedders/huggingfaceapitextembedder.mdx)                               | Embeds strings using various Hugging Face APIs.                                                                                                                                                                                             |
 | [JinaTextEmbedder](embedders/jinatextembedder.mdx)                                                   | Embeds a simple string (such as a query) with a Jina AI Embeddings model. Requires an API key from Jina AI.                                                                                                                                 |
@@ -56,4 +57,4 @@ These are the Embedders available in Haystack:
 | [VertexAITextEmbedder](embedders/vertexaitextembedder.mdx)                                             | Computes embeddings for text (such as a query) using models through VertexAI Embeddings API. **_This integration will be deprecated soon. We recommend using [GoogleGenAITextEmbedder](embedders/googlegenaitextembedder.mdx) integration instead._** |
 | [VertexAIDocumentEmbedder](embedders/vertexaidocumentembedder.mdx)                                     | Computes embeddings for documents using models through VertexAI Embeddings API. **_This integration will be deprecated soon. We recommend using [GoogleGenAIDocumentEmbedder](embedders/googlegenaidocumentembedder.mdx)  integration instead._**     |
 | [WatsonxTextEmbedder](embedders/watsonxtextembedder.mdx)                                               | Computes embeddings for text (such as a query) using IBM Watsonx models.                                                                                                                                                                    |
-| [WatsonxDocumentEmbedder](embedders/watsonxdocumentembedder.mdx)                                       | Computes embeddings for documents using IBM Watsonx models.                                                                                                                                                                                 |
+| [WatsonxDocumentEmbedder](embedders/watsonxdocumentembedder.mdx)                                       | Computes embeddings for documents using IBM Watsonx models.                                                                                                                                                                                 |