The Vertex AI plugin lets you run Genkit on Google Cloud infrastructure. It offers the same Gemini models as the Google AI plugin plus additional enterprise features:
  • IAM-based access control and audit logging
  • VPC Service Controls and data residency options
  • Imagen image generation models
  • Vertex AI Vector Search for production-scale RAG
  • Model Garden (third-party models like Claude, Llama, and more)
In TypeScript, Vertex AI is now exported from the same @genkit-ai/google-genai package as googleAI. The legacy @genkit-ai/vertexai package still exists but is deprecated. In Go and Python, both GoogleAI and VertexAI live in the same package.

Installation

npm install @genkit-ai/google-genai
The older @genkit-ai/vertexai package is deprecated. Migrate to @genkit-ai/google-genai to avoid breaking changes in future releases.

Authentication

Vertex AI uses Google Cloud credentials, not API keys.

1. Install the Google Cloud CLI

Follow the official install guide for your platform.

2. Authenticate locally

gcloud auth application-default login
This writes Application Default Credentials (ADC) to your local machine. On GCP (Cloud Run, GKE, Cloud Functions) credentials are provided automatically by the metadata server.

3. Set your project and region

export GOOGLE_CLOUD_PROJECT=my-project-id
export GOOGLE_CLOUD_LOCATION=us-central1

4. Enable the Vertex AI API

gcloud services enable aiplatform.googleapis.com
Vertex AI Express Mode lets you use an API key instead of full GCP credentials — great for quick experiments. Get a key from Vertex AI Studio and pass it as apiKey. You don’t need to set projectId or location in Express Mode.

Configuration

import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/google-genai';

const ai = genkit({
  plugins: [
    // Standard ADC authentication
    vertexAI({ location: 'us-central1' }),

    // Global endpoint (for Gemini 2.5+)
    // vertexAI({ location: 'global' }),

    // Vertex AI Express Mode
    // vertexAI({ apiKey: process.env.VERTEX_EXPRESS_API_KEY }),
  ],
});
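
The three configurations above differ only in which options are passed. As a minimal sketch, the helper below picks options from environment variables. The helper name `vertexOptionsFromEnv`, the `VertexOptionsSketch` type, and the precedence order (Express key first, then ADC settings) are assumptions made for illustration; they are not part of the plugin.

```typescript
// Hypothetical helper: derive vertexAI() options from environment variables.
// Option names (apiKey, projectId, location) come from this page; the
// precedence order is an assumption of this sketch.
type Env = Record<string, string | undefined>;

interface VertexOptionsSketch {
  apiKey?: string;
  projectId?: string;
  location?: string;
}

function vertexOptionsFromEnv(env: Env): VertexOptionsSketch {
  // Express Mode: an API key alone is enough, no project or location.
  const apiKey = env['VERTEX_EXPRESS_API_KEY'];
  if (apiKey) {
    return { apiKey };
  }

  // ADC: this sketch insists on an explicit location.
  const location = env['GOOGLE_CLOUD_LOCATION'];
  if (!location) {
    throw new Error('Set GOOGLE_CLOUD_LOCATION or VERTEX_EXPRESS_API_KEY');
  }

  const options: VertexOptionsSketch = { location };
  const projectId = env['GOOGLE_CLOUD_PROJECT'];
  if (projectId) {
    options.projectId = projectId;
  }
  return options;
}
```

In an application you would pass `process.env` and hand the result to `vertexAI(...)`.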

Plugin options (TypeScript)

interface VertexAIPluginOptions {
  /** Google Cloud project ID. Defaults to GOOGLE_CLOUD_PROJECT env var. Can be omitted in Express Mode. */
  projectId?: string;
  /** GCP region, e.g. 'us-central1' or 'global'. Can be omitted in Express Mode. */
  location?: string;
  /** Vertex AI Express Mode API key, used instead of Google Cloud credentials. */
  apiKey?: string;
  /** Custom Google Auth options (scopes, credentials, etc.). */
  googleAuth?: GoogleAuthOptions;
  /** Additional Gemini/Vertex model refs to pre-register. */
  models?: (ModelReference<GeminiConfigSchema> | string)[];
  /** Enable detailed API request/response debug traces. */
  experimental_debugTraces?: boolean;
}

Generating text

const response = await ai.generate({
  model: vertexAI.model('gemini-2.5-pro'),
  prompt: 'Summarise the Vertex AI documentation in three bullet points.',
});
console.log(response.text);

Available models

Model name | Best for
gemini-2.5-pro | Highest capability, complex reasoning
gemini-2.5-flash | Fast, cost-effective
gemini-2.5-flash-lite | Ultra-low latency
gemini-2.0-flash | Previous-generation, widely available
imagen-4.0-generate-001 | Photorealistic images
imagen-4.0-fast-generate-001 | Faster image generation
imagen-4.0-ultra-generate-001 | Highest quality image generation
lyria-002 | Music generation (Vertex AI only)
Go and Python plugins use dynamic model discovery — any model supported by the underlying google.golang.org/genai or google-genai SDK is automatically available.
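
If you select models programmatically, the table above can be mirrored as a small lookup. This is only a sketch: `MODEL_CATALOG`, `Modality`, and `modelsFor` are hypothetical names, and the modality grouping is inferred from the "Best for" column rather than taken from the plugin.

```typescript
// Hypothetical catalog mirroring the model table on this page.
type Modality = 'text' | 'image' | 'music';

interface ModelInfo {
  modality: Modality;
  bestFor: string;
}

const MODEL_CATALOG: Record<string, ModelInfo> = {
  'gemini-2.5-pro': { modality: 'text', bestFor: 'Highest capability, complex reasoning' },
  'gemini-2.5-flash': { modality: 'text', bestFor: 'Fast, cost-effective' },
  'gemini-2.5-flash-lite': { modality: 'text', bestFor: 'Ultra-low latency' },
  'gemini-2.0-flash': { modality: 'text', bestFor: 'Previous-generation, widely available' },
  'imagen-4.0-generate-001': { modality: 'image', bestFor: 'Photorealistic images' },
  'imagen-4.0-fast-generate-001': { modality: 'image', bestFor: 'Faster image generation' },
  'imagen-4.0-ultra-generate-001': { modality: 'image', bestFor: 'Highest quality image generation' },
  'lyria-002': { modality: 'music', bestFor: 'Music generation (Vertex AI only)' },
};

// Return the model names for one modality, in table order.
function modelsFor(modality: Modality): string[] {
  return Object.keys(MODEL_CATALOG).filter(
    (name) => MODEL_CATALOG[name].modality === modality,
  );
}
```

Each returned name can be passed to `vertexAI.model(...)` as in the examples on this page.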

Text embeddings

const embeddings = await ai.embed({
  embedder: vertexAI.embedder('text-embedding-005'),
  content: 'This is a document about retrieval-augmented generation.',
});
Available embedders: text-embedding-004, text-embedding-005, text-multilingual-embedding-002, gemini-embedding-001, multimodalembedding@001.
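
Embeddings are usually compared with cosine similarity. The sketch below assumes you have already extracted plain `number[]` vectors from the embed result; `cosineSimilarity` is an illustrative helper, not a Genkit API.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Returns a value in [-1, 1]; returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error('Vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Vectors from different embedders generally have different dimensions and are not comparable, so the length check also guards against mixing models.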

Image generation (Imagen)

const response = await ai.generate({
  model: vertexAI.model('imagen-4.0-generate-001', {
    numberOfImages: 1,
    aspectRatio: '16:9',
    personGeneration: 'dont_allow',
  }),
  prompt: 'An aerial view of the Swiss Alps at golden hour.',
});

// `media` is a property, like `response.text` in the text example
const image = response.media;
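
To save the generated image you typically need its content type and raw bytes. The sketch below assumes the returned media carries a base64 `data:` URL in a `url` field; that shape is an assumption here, and the helper throws if the URL is anything else. `parseDataUrl` and `fileNameFor` are hypothetical helpers.

```typescript
interface ParsedDataUrl {
  contentType: string; // e.g. 'image/png'
  base64: string; // payload, still base64-encoded
}

// Split a base64 data URL into its content type and payload.
function parseDataUrl(url: string): ParsedDataUrl {
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+/=]+)$/.exec(url);
  if (!match) {
    throw new Error('Not a base64 data URL');
  }
  return { contentType: match[1], base64: match[2] };
}

// Derive a file name from the content type, e.g. 'alps' + 'image/png' gives 'alps.png'.
function fileNameFor(baseName: string, contentType: string): string {
  const subtype = contentType.split('/')[1] || 'bin';
  return `${baseName}.${subtype}`;
}
```

In Node you could then write the file with `fs.writeFileSync(fileNameFor('alps', parsed.contentType), Buffer.from(parsed.base64, 'base64'))`.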

Music generation (Lyria)

Lyria is a Vertex AI–only model for instrumental music generation:
const response = await ai.generate({
  model: vertexAI.model('lyria-002'),
  prompt: 'An upbeat acoustic guitar piece for a travel vlog.',
});

// `media` is a property, like `response.text` in the text example
const audio = response.media;

Vertex AI Vector Search for RAG

Vertex AI Vector Search provides managed, enterprise-scale nearest-neighbour search. The Go and Python plugins include a Vector Search retriever.
import "github.com/firebase/genkit/go/plugins/vertexai/vectorsearch"

// Define a retriever backed by a Vertex AI index endpoint
retriever := vectorsearch.DefineRetriever(g, vectorsearch.Config{
  IndexEndpointID: "projects/my-project/locations/us-central1/indexEndpoints/123",
  IndexID:         "projects/my-project/locations/us-central1/indexes/456",
  Embedder:        genkit.LookupEmbedder(g, "vertexai/text-embedding-005"),
})

docs, err := genkit.Retrieve(ctx, g, retriever,
  ai.WithQuery(ai.DocumentFromText("What is RAG?", nil)),
  ai.WithOptions(&vectorsearch.RetrieveOptions{Limit: 5}),
)
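
For intuition about what a retriever with `Limit: 5` returns, here is a brute-force top-k search in TypeScript. It is a conceptual sketch only: a managed service is not expected to scan every vector like this, and `IndexedDoc`, `dotProduct`, and `topK` are hypothetical names. It scores by dot product, which matches cosine ranking only when vectors are normalised to unit length.

```typescript
interface IndexedDoc {
  id: string;
  vector: number[];
}

// Dot product of two equal-length vectors.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have equal length');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Score every document against the query and keep the `limit` best.
function topK(query: number[], docs: IndexedDoc[], limit: number): IndexedDoc[] {
  return docs
    .map((doc) => ({ doc, score: dotProduct(query, doc.vector) }))
    .sort((x, y) => y.score - x.score) // highest score first
    .slice(0, limit)
    .map((entry) => entry.doc);
}
```

This scan costs time linear in the number of documents per query, which is why a dedicated index becomes useful at production scale.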

Advanced model configuration

Enable Google Search grounding, code execution, and a thinking budget through the model config:
import { vertexAI } from '@genkit-ai/google-genai';

const response = await ai.generate({
  model: vertexAI.model('gemini-2.5-flash', {
    // Enable Google Search grounding
    googleSearch: true,
    // Enable code execution
    codeExecution: true,
    // Thinking budget (0 = disabled, -1 = automatic)
    thinkingConfig: { thinkingBudget: 1024 },
  }),
  prompt: 'Calculate the 100th prime number and verify it.',
});
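
The special thinking-budget values from the comment above (0 = disabled, -1 = automatic) are easy to mistype. A small wrapper can name them; `toThinkingConfig` and the `ThinkingSetting` type are hypothetical, and the sketch does not check model-specific upper limits.

```typescript
type ThinkingSetting = 'off' | 'auto' | number;

// Map a readable setting to the { thinkingBudget } shape used in the model config.
function toThinkingConfig(setting: ThinkingSetting): { thinkingBudget: number } {
  if (setting === 'off') {
    return { thinkingBudget: 0 }; // 0 = disabled
  }
  if (setting === 'auto') {
    return { thinkingBudget: -1 }; // -1 = automatic
  }
  // Explicit budgets must be positive whole numbers in this sketch.
  if (!Number.isInteger(setting) || setting <= 0) {
    throw new RangeError("Use a positive integer, 'off', or 'auto'");
  }
  return { thinkingBudget: setting };
}
```

The result can be passed as `thinkingConfig: toThinkingConfig('auto')` in the model config.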

Deploying to Cloud Run

When deployed to Cloud Run, Cloud Functions, or any GCP service, credentials are automatically inferred from the runtime service account. No additional configuration is needed.
# No credential setup needed: GCP provides credentials automatically
FROM node:20-slim
WORKDIR /app
COPY . .
RUN npm ci
# Assumes dist/ was built beforehand and is part of the build context
CMD ["node", "dist/index.js"]
See Cloud Run deployment for a full walkthrough.
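
When debugging authentication, it can help to log where credentials are likely to come from. The sketch below is a rough heuristic under stated assumptions: it only inspects environment variables, relying on `GOOGLE_APPLICATION_CREDENTIALS` (which Application Default Credentials check first) and `K_SERVICE` (set by Cloud Run). It does not look for local gcloud credential files, and `likelyCredentialSource` is a hypothetical helper.

```typescript
type CredentialSource = 'key-file' | 'runtime-service-account' | 'local-adc';

// Guess the ADC source from environment variables alone.
function likelyCredentialSource(
  env: Record<string, string | undefined>,
): CredentialSource {
  // An explicit key file takes precedence in the ADC lookup order.
  if (env['GOOGLE_APPLICATION_CREDENTIALS']) {
    return 'key-file';
  }
  // Cloud Run sets K_SERVICE; credentials then come from the metadata server.
  if (env['K_SERVICE']) {
    return 'runtime-service-account';
  }
  // Otherwise assume credentials written by `gcloud auth application-default login`.
  return 'local-adc';
}
```

Calling `likelyCredentialSource(process.env)` at startup gives a quick hint in logs; other GCP runtimes would need their own checks.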

Google AI plugin

Same models via the Gemini Developer API — no GCP project needed.

Firebase plugin

Firestore vector search and Firebase telemetry.

RAG guide

Build retrieval-augmented generation pipelines.

Cloud Run deployment

Deploy Genkit flows to Cloud Run.