Startup technical guide: AI agents

Table of contents
Introduction
Core concepts of AI agents
An overview of Google Cloud’s agent ecosystem
Key components of every agent
The role of grounding in agentic systems
Key takeaways
How to build AI agents
A complete toolkit for building AI agents
A step-by-step guide: Defining an LLM agent
Govern and scale your agent workforce with Google Agentspace
Other options for building agents
Key takeaways
Ensuring AI agents are reliable and responsible
AgentOps: A framework for production-ready agents
Build responsible and secure AI agents with AgentOps
Key takeaways
More from Google’s full AI stack
Conclusion
Resources
Introduction
The development of AI agents represents a paradigm shift in
software engineering, enabling startups to automate complex
workflows, create novel user experiences, and solve business
problems that were previously technically infeasible. Whether you’re validating an idea, building an MVP,
or supporting a product in production, this guide
will help across all stages of your project.
But moving from a promising prototype to a production-ready
agent means solving a new set of challenges. How do you
manage their non-deterministic behavior? How do you verify
their complex reasoning paths? And, crucially, where do you
get started?

How to use this guide
This technical guide will help answer questions like these.
It provides a systematic, operations-driven roadmap for
navigating the new landscape, and is geared to help startups
and developers who are racing to embrace the potential
of agentic systems.
You’ll learn the foundational concepts of agentic systems,
from their core architectural components to the principles
that ensure reliable and responsible operation in production.
And you’ll learn about the full spectrum of tools that make
building and using agents on Google Cloud more efficient,
from code-first development with Agent Development Kit
(ADK) and operational automation with the Agent Starter Pack,
to no-code agent creation with Google Agentspace.
New to AI agents? Start with Section 1 for the core concepts.
Ready to build? Jump to Section 2 to create your first agent using ADK.
Agent built? Dive into Section 3 to make it safe, stable, and scalable.
Want extra support? Use the Gemini Kit to prototype faster, and apply
to the Google for Startups Cloud Program to receive
expert guidance and up to $350k USD in cloud credits.
The focus of this guide
The agentic AI ecosystem offers many tools, libraries,
and approaches for building cognitive architectures.
There are open-source frameworks from Google like
Genkit and Google Cloud’s conversational AI offerings,
as well as popular open-source libraries like LangChain
and CrewAI.
This guide focuses primarily on ADK, sharing concepts
and architectural patterns that allow you to build robust,
scalable agents on Google Cloud while retaining the ability
to integrate other preferred tools and libraries.
Section 1: Core concepts of AI agents
The field of agentic AI is evolving rapidly.
This section provides foundational knowledge
on AI agents, explaining their core concepts,
purpose, and operational mechanics. It also
details the relevant tools and services available
within Google Cloud.
Prefer audio? Listen to the podcast version
of this section, created with NotebookLM.
This podcast was created using NotebookLM with the
prompt: “As a podcast host, create a conversational and
educational podcast for ‘Startup technical guide: AI agents,’
targeting a technical audience of startup founders and
developers. The podcast must cover the three main paths
for using AI agents (build, use, partner), detailing tools
like the Agent Development Kit (ADK) and pre-built
Gemini agents.
“It should then explain the core components of an agent,
including models, tools, orchestration, and runtime. Also,
cover how to ensure trust and power through techniques
like grounding with Retrieval-Augmented Generation (RAG)
and leveraging multimodality. Conclude with a summary
of the key takeaways and a clear call to action directing
listeners to Google’s resources.”
1.1 An overview of Google Cloud’s agent ecosystem
“The agentive workflow is the next frontier.
It’s not just about asking a question and
getting an answer. It’s about giving AI a
complex goal—like ‘plan this product launch’
or ‘resolve this supply chain disruption’—
and having it orchestrate the multi-step
tasks needed to achieve it. This will
fundamentally change productivity.”
Thomas Kurian
CEO of Google Cloud
Building production-grade AI agents requires more than
selecting a large language model. A complete solution
demands scalable infrastructure, robust data integration
tooling, and architectural patterns that accommodate
diverse technical requirements.
Google Cloud supports the comprehensive development
of agentic systems, whether you’re building your own agents,
using pre-built Google Cloud agents, or bringing in partner
agents. Underpinned by the Model Context Protocol (MCP)
and Agent2Agent (A2A) protocol, this common framework
is designed for interoperability. This way, regardless of their
origin or architecture, your agents can collaborate within
the Google Cloud ecosystem.¹

[Diagram: Three paths (build your own agents, use Google Cloud agents, bring in partner agents), all interoperable via MCP and the A2A protocol.]

1. MCP and A2A protocol are covered in depth in section 2 of this guide.
Build your own agents
If you’re looking to build custom agents geared to tackle
specific tasks, then this is the route for you. Here, you’ve got
two options: a code-first approach for maximum control
or an application-first approach for accelerated development.
Agent Development Kit for custom,
code-first development
This approach is best for developers, technical startups,
and teams that require a high degree of control over agent
behavior. Google Cloud’s Agent Development Kit (ADK)
is built for this custom approach.
ADK empowers developers to build, manage, evaluate,
and deploy AI-powered agents. It provides a robust and
flexible environment for creating both conversational
and non-conversational agents, capable of handling complex
tasks and workflows.
Agents built with ADK can easily be deployed on Vertex AI
Agent Engine, a managed, scalable environment designed
specifically for this purpose. Because these agents are
containerized, they can also be deployed to any environment
that runs containers, such as Cloud Run and Google
Kubernetes Engine (GKE).
Core capabilities
• Orchestration logic: The agent’s core reasoning process,
like the ReAct framework (see section 1.2), allows it
to plan and execute a sequence of tool calls and actions
to accomplish a complex goal.
• Tool definition and registration: An interface for defining
custom functions and APIs, allowing the agent to interact
with data, APIs, and external systems.
• Context management: A system that provides the agent
with memory, allowing it to recall user preferences and
conversational history across multiple interactions
to provide a coherent experience.
• Evaluation and observability: A suite of built-in tools
to rigorously test agent quality, debug the agent’s
step-by-step reasoning, and monitor its performance
in a production environment.
• Containerization: The capability to package the agent
into a standard, portable container, making it ready
for deployment on any compatible cloud environment.
• Multi-agent composition: The ability to build systems
where multiple specialized agents can collaborate,
delegate tasks, and work together to solve a problem.
Why it matters for startups
• Automate workflows, not just conversations:
Implement multi-step orchestration logic to solve complex
business problems, creating the operational leverage
a small team needs to scale.
• Build a defensible product: Connect agents directly to
your proprietary APIs and internal data to create a product
with a real competitive moat.
• Remember your customers to deliver a truly
personal experience: Seamlessly integrate short-term
conversational context with long-term knowledge,
enabling you to have your agent recall past interactions
and build a true customer relationship.
• Launch with confidence: Leverage built-in evaluation
and observability to rigorously test and debug your agent,
ensuring you ship a reliable, production-grade product.
• Focus on your product, not infrastructure: Package your
agent into a standard container for a faster, more reliable
path to production using standard DevOps practices.
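To make the code-first path concrete, here is a minimal sketch of an ADK agent that registers one custom tool, following the pattern in the ADK quickstart. The tool, model string, and business logic are illustrative rather than part of any reference architecture.

```python
# A minimal sketch of a code-first ADK agent (illustrative names and logic).
# Assumes the google-adk package is installed: pip install google-adk
from google.adk.agents import Agent


def check_subscription(customer_id: str) -> dict:
    """Looks up a customer's subscription tier.

    In a real agent this would call your billing API; here it returns
    a hard-coded value so the sketch stays self-contained.
    """
    return {"customer_id": customer_id, "tier": "pro", "status": "active"}


# The agent combines a model (the "brain"), instructions, and tools.
root_agent = Agent(
    name="support_agent",
    model="gemini-2.5-flash",  # pick the leanest model that fits the task
    description="Answers billing questions for customers.",
    instruction=(
        "You are a support agent. Use the check_subscription tool to look up "
        "a customer's plan before answering billing questions."
    ),
    tools=[check_subscription],  # plain Python functions are registered as tools
)
```

Because the agent is just Python, you can iterate locally with ADK’s developer tooling and later package the same code into a container for Vertex AI Agent Engine, Cloud Run, or GKE.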
Google Agentspace for application-first development

The second primary pathway for building is through Google Agentspace. Unlike the code-first ADK, you can use Google Agentspace to orchestrate your entire AI workforce and empower non-technical team members to build custom agents using a no-code designer.

This platform-based approach is ideal for managing multiple agents and scaling their use across your mature startup’s growing cohort of SaaS applications.

Core capabilities
• Unified company-wide search: Connects to and searches across multiple SaaS applications.
• Multimodal data synthesis: Understands and synthesizes information from text, images, charts, and video while respecting data permissions.
• Pre-built agent library: Provides a suite of ready-to-use agents for complex tasks like deep research or idea generation.
• No-code custom agent builder: Includes Agent Designer, which allows non-technical users to create agents via a prompt-driven interface.

Why it matters for startups
• Break down data silos: Non-developer teams can build and deploy agents that can access and act across these fragmented data sources and applications.
• Automate workflows: Create cross-platform workflows without consuming scarce engineering resources, freeing up your engineering team to focus on core product development.

Use Google Cloud agents

With rapid prototyping and easy ways to integrate AI into your existing apps, managed agents let you focus on core business logic rather than managing infrastructure. They’re also ideal if your engineering resources are limited.

Gemini Code Assist

Gemini Code Assist is an AI-powered assistant for developers. It integrates into multiple points of the software development lifecycle, providing assistance through IDE extensions, a command-line interface, GitHub integration, and within various Google Cloud services.

Core capabilities
• IDE integration: Within popular IDEs (VS Code, JetBrains IDEs, Android Studio), it provides code completion, on-demand function generation, and a chat interface. It uses Gemini’s large context window to provide responses relevant to the open codebase. Enterprise editions can be connected to private source code repositories for more customized suggestions.
• Command-line interface: Gemini CLI is an open-source AI agent that brings Gemini capabilities directly to the terminal for tasks such as code understanding, file manipulation, and dynamic troubleshooting.
• GitHub integration: On GitHub, Gemini Code Assist can automatically review pull requests to identify bugs and style issues, suggesting specific code changes.
• Agent-driven development: Deploys AI agents capable of performing complex, multi-file edits across a full project’s context. These agentic workflows incorporate Human in the Loop (HITL) oversight and can integrate with ecosystem tools that follow the MCP.
• Google Cloud service integration: Provides AI assistance directly within services like Firebase (app error analysis, performance insights), Colab Enterprise (Python code generation), BigQuery (natural language to SQL, query optimization), Cloud Run, and Apigee.

“Making Gemini a world model is a critical step in developing a new, more general, and more useful kind of AI—a universal AI assistant. This is an AI that’s intelligent, understands the context you are in, and that can plan and take action on your behalf, across any device.”
Demis Hassabis
CEO of Google DeepMind
Why it matters for startups
Gemini Code Assist acts as a force multiplier. It can handle
software development tasks across the development lifecycle,
from routine tasks like writing boilerplate code to more
complex operations like multi-file refactoring.
You can delegate a wide range of tasks to Gemini Code Assist.
Here are a few examples that show its capabilities.
• For automating boilerplate: Generate a Python Cloud
Function that triggers on an HTTP request. It should parse
a JSON payload for a userId and documentId, then
use the google-cloud-firestore client library to fetch
a specific document from a 'users' collection and return
it as a JSON response.
• For comprehensive testing: Provide one of your existing
functions and ask Code Assist to generate a complete test
suite, including the necessary mocks for Google Cloud
services like Cloud Storage or Firestore.
• For large-scale, Gemini-driven refactoring: Ask it
to analyze multiple services across your codebase and
generate a strategic plan. For example: “Given our
'user-service' and 'auth-service', propose a step-by-step plan
to refactor the authentication logic into a single, shared
library, outlining the trade-offs of this approach.”
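As an illustration of what the first prompt above might produce, here is a rough sketch of such a Cloud Function. The collection layout, field names, and error handling are assumptions to adapt to your own schema.

```python
# Sketch of the HTTP-triggered Cloud Function described in the first prompt above.
# Assumes: pip install functions-framework google-cloud-firestore
import functions_framework
from google.cloud import firestore

db = firestore.Client()


@functions_framework.http
def get_user_document(request):
    """Parses a JSON payload for userId and documentId, then returns
    the matching document from the 'users' collection as JSON."""
    payload = request.get_json(silent=True) or {}
    user_id = payload.get("userId")
    document_id = payload.get("documentId")
    if not user_id or not document_id:
        return {"error": "userId and documentId are required"}, 400

    # Assumed layout: users/{userId}/documents/{documentId}
    snapshot = (
        db.collection("users")
        .document(user_id)
        .collection("documents")
        .document(document_id)
        .get()
    )
    if not snapshot.exists:
        return {"error": "document not found"}, 404
    return snapshot.to_dict(), 200
```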
Gemini Cloud Assist
Gemini Cloud Assist is an AI expert for your Google Cloud
environment, providing context-aware assistance for
infrastructure management and application operations.
It uses context from your project, including Google Cloud
project IDs and the specific product page being viewed
in the console, to tailor its support.²
Core capabilities
• Design and deploy: Within the Application Design Center,
you can describe desired infrastructure outcomes
in natural language. Gemini Cloud Assist generates
architecture diagrams and application templates, which
can be exported as Terraform for integration with existing
Infrastructure as Code (IaC) workflows.
• Troubleshoot and resolve: Integrates with Cloud
Observability to summarize complex log entries and
explain error messages. For deeper issues, you can launch
investigations, where Gemini analyzes logs and metrics
to identify the root cause.
• Configure and optimize: Provides personalized cost and
utilization recommendations within the FinOps Hub as well
as the Cost Optimization dashboard.
• Secure and analyze: Enables natural language investigation
of network flows and logs. It provides guidance on security
tasks such as data encryption, secrets management, and
generating or testing custom organization policies. It can
also recommend IAM roles and diagnose permission errors.

2. For details on how Gemini Cloud Assist is grounded, see the official documentation.
Why it matters for startups
• Free up time: Cloud management can eat up engineering
time. Gemini Cloud Assist frees you up to focus on building
your product.
Try these prompts in Gemini Cloud Assist:
How do I use Vertex AI to deploy a model?
Create a high-level plan for designing, building,
and deploying a web app in Google Cloud.
List all Cloud Storage buckets in the prod-v1 project
that do not have Object Versioning enabled.
What are the public-facing firewall rules applied
to instances with the network tag external-web-server?
Show me all IAM roles granted to the service account
data-pipeline@my-project.iam.gserviceaccount.com
Gemini in Colab Enterprise

If your startup is working in data science, machine learning, or analytics, Gemini in Colab Enterprise turns every notebook into a collaborative AI workspace. It’s built to generate, explain, and debug Python code all in context.

Core capabilities
• Autocomplete and generate Python code within Colab.
• Explain code logic and errors in simple language.
• Filter, transform, and visualize data.
• Recommend public datasets and research resources.
• Summarize entire notebooks or code cells.

Why it matters for startups
• Accelerate research and development: Automate the most tedious aspects of data preparation, analysis, and visualization, allowing developers to iterate on new models and ideas significantly faster.
• Lower the barrier to entry: Engineers new to data science can hit the ground running, while experienced practitioners can focus more on model experimentation and less on data wrangling.

Try these prompts in Gemini in Colab Enterprise:
How do I filter a Pandas DataFrame?
Plot average revenue by region.
Show me a list of publicly available datasets for climate tech.
Summarize the goal of this notebook.

Bring in partner agents

If your use case is more specialized, you can easily integrate third-party or open-source agents into your stack using Google Cloud’s open ecosystem and via the Google Cloud Marketplace.

Explore Agent Garden to deploy pre-built ADK agents that already support data reasoning and inter-agent collaboration. You can mix and match them with the agents you build, speeding up time to impact.
1.2 Key components of every agent
Models: Selection and tuning

Think of the model as your agent’s brain. It reads user requests, figures out what needs to happen, and generates smart responses.

How to choose the right model

Choosing the right model is not about selecting the most powerful one available, but about finding the optimal balance of capability, speed, and cost for your use case. Every model can be evaluated on these three conflicting characteristics, and the goal is to identify the most efficient option for a specific job.

As a model’s capability increases, its cost and latency generally increase as well. The most common mistake is over-investing in capability when a use case doesn’t need it, leading to inefficient spending and slower performance. The optimal strategy is to select the most efficient model for any given task.

This principle is most powerfully applied at a system level. Robust cognitive architectures employ multiple specialized agents, each dynamically selecting the leanest model for its specific sub-task. This ensures, for instance, that a heavyweight model is reserved for complex reasoning, while a lightweight model handles routine queries. This multi-agent approach provides the architectural flexibility to optimize the cost and performance of the entire system, not just a single component.

Use cases

Early-stage prototyping and at-scale tasks
• Model profile: A lightweight, low-cost model like Gemini 2.5 Flash-Lite.
• Rationale: This is the most cost-efficient and fastest 2.5 model, excelling at high-volume, latency-sensitive tasks like translation and classification.

High-volume, high-quality applications
• Model profile: A balanced, mid-range model like Gemini 2.5 Flash.
• Rationale: This model is designed to control the trade-off between quality, cost, and speed. It delivers strong performance on complex tasks at a lower price point than Pro, making it perfect for production applications that need to be both smart and economical.

Complex, multi-step reasoning and frontier code generation
• Model profile: An advanced reasoning model like Gemini 2.5 Pro.
• Rationale: This is the most capable model for the most difficult tasks where performance is non-negotiable.³

“AI agents are systems that combine the intelligence of advanced AI models with access to tools so they can take actions on your behalf, under your control.”
Sundar Pichai
CEO of Google and Alphabet

3. Gemini 2.5 Pro achieved state-of-the-art results on frontier benchmarks for coding (82.2% on Aider Polyglot) and reasoning (86.4% on GPQA Diamond). Results as of June 2025.
You can use the Gemini 2.5 family of models to break down
problems, formulate plans, and use tools. This reasoning
process is configurable. By allocating more reasoning
tokens to a specific call, a developer can direct the model
to expend more computational effort, directly trading
a predictable increase in latency and cost for a potential
increase in accuracy.
This token-level control, combined with model selection and
configurable reasoning modes, gives developers a dynamic
set of levers for sophisticated optimization. The cost and
performance of an entire multi-agent system can be calibrated
to meet specific business and technical requirements.
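For example, with the Google Gen AI SDK you can set a per-call thinking budget to trade reasoning depth for latency and cost; the budgets and prompts below are illustrative.

```python
# Sketch: trading latency and cost for reasoning depth on a single call.
# Assumes: pip install google-genai, with an API key or Vertex AI project configured.
from google import genai
from google.genai import types

client = genai.Client()

# A routine classification call: keep the thinking budget at zero for speed.
quick = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Classify this ticket as BILLING, BUG, or OTHER: 'I was charged twice.'",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0)
    ),
)

# A hard, multi-step reasoning call: allow more reasoning tokens.
deep = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Propose a migration plan from Cloud SQL to Spanner for a global app.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=2048)
    ),
)

print(quick.text)
print(deep.text)
```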
Pro tip
Use Model Garden on Vertex AI to discover, customize,
and deploy foundation models from a single, centralized
platform. It provides a curated selection of over 200
models from Google, partners like Anthropic, and a wide
variety of open models from providers like Meta (the
Llama family) and Mistral. Instead of manually managing
infrastructure, you can deploy models to applications
with a single click and scale them using built-in,
end-to-end MLOps capabilities.
Model tuning
Once you select a model that fits your cost-latency-quality
needs, you may have the option to fine-tune it. This specializes
its knowledge and style for your specific business needs,
and is done using a curated dataset of your own
high-quality examples.
The availability of fine-tuning is determined on a
model-by-model basis. Within Google’s model portfolio, this capability is
supported for the Gemma family of open-weight models and
for specific versions of Gemini. It is important to review each
model’s documentation and license agreement to confirm that
fine-tuning is both permitted and technically supported.
Pro tip
To see which models can be fine-tuned on Vertex AI,
check the official documentation.
Use case: Fine-tuning a support agent
Say you’re building a customer support agent for your SaaS product. You could fine-tune on a dataset of thousands of past support tickets and their ideal resolutions to help the model learn about common issues and respond in a voice that aligns with your support team’s style.

Note
Fine-tuning is not grounding. Fine-tuning adapts a model’s style and refines its knowledge on a specific task. Grounding connects the model to real-time, verifiable data sources to ensure its responses are factually accurate. Model grounding is discussed in detail below.

Tools: Enabling agentic action

Tools are defined capabilities that enable an agent to do more than the native functions of its core reasoning model, from performing a simple, internal calculation to interacting with external systems via API calls. They bridge the gap between the agent’s reasoning and its ability to retrieve new information or execute stateful operations.
Tools can include a wide variety of components:
• Internal functions and services: Proprietary logic written
by your own team.
• APIs: Connections to both internal services and external,
third-party services.
• Data sources: The ability to query databases, vector stores,
or other repositories of information.
• Other agents: In a multi-agent system, one agent can
use another specialized agent as a tool.
Data architecture for agentic systems
Data serves as the basis for an agent’s short-term and long-term memory. A robust data architecture must address three
distinct needs: persistent storage for long-term knowledge retrieval, low-latency access for short-term conversational
context, and a durable ledger for transactional auditing. By mapping specific Google Cloud services to each of these needs,
you can ensure that every architectural decision is both cost-effective and scalable, while addressing immediate business
needs and time-to-market goals.
1. Long-term knowledge base (grounding and retrieval)
An agent’s long-term memory is the foundation for its intelligence, grounding, and personalization. It is distinct from
the fast, short-term context of a live conversation. A robust long-term memory architecture should comprise three core
components: a structured knowledge base for fact-based grounding via retrieval-augmented generation (RAG); a persistent
store for user interaction history to enable a continuous, personalized experience; and an operational data lake for raw
material like conversation transcripts and workflow states, for more complex cognitive processes and future analytics.
Vertex AI Search
Overview: A managed service for building high-performance vector search applications. It is the primary tool for enabling semantic understanding and retrieval over large, unstructured datasets.
Startup use cases: Instantly find answers within your internal product documentation, customer support chat logs, and community forum posts, so your agent can provide accurate, context-aware support to new users. This reduces the burden on your small support team.

Firestore
Overview: A serverless NoSQL document database with real-time synchronization capabilities. Its flexible, hierarchical data model is well-suited for storing structured context and the dynamic state of an agent’s long-running or durable tasks.
Startup use cases: Maintain the real-time state of a multi-step, agent-guided user onboarding flow. As the user completes each step (e.g., “create profile,” “connect API,” “invite team member”), the agent updates a Firestore document. Developers can then observe the agent’s task progress in real time, and the user can seamlessly resume the process across sessions.

Vertex AI Memory Bank (Preview)
Overview: A managed service on Vertex AI Agent Engine specifically designed to dynamically generate, store, and retrieve long-term memories from user conversations.
Startup use cases: Instead of manually building logic to extract user preferences, an agent can automatically call GenerateMemories on a conversation history. This asynchronously extracts key facts (e.g., “user prefers non-stop flights,” “user’s dog is named Fido”) and stores them. In future sessions, the agent can retrieve these memories via similarity search to provide a deeply personalized and continuous experience with minimal custom code.

Cloud Storage
Overview: A highly scalable and durable object store for raw, unstructured source data (e.g., PDFs, images, videos) that feeds into other services for indexing and processing.
Startup use cases: It serves as the durable, low-cost landing zone for all user-uploaded documents, images of bug reports, or audio recordings of customer feedback calls. This raw data is then processed and indexed by services like Vertex AI Search to enrich your agent’s knowledge.

BigQuery
Overview: A fully managed, serverless data warehouse for storing and analyzing massive structured and semi-structured datasets, so agents can be equipped with tools that execute complex analytical queries.
Startup use cases: An agent can answer questions like, “Summarize user engagement patterns for the new feature we launched last week,” or “Which customer cohorts have the highest churn risk based on recent activity?” BigQuery provides instant business intelligence.
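As a concrete illustration of the Firestore row above, a minimal sketch of a tool that records onboarding progress might look like this; the collection and field names are hypothetical.

```python
# Sketch: persisting the state of an agent-guided onboarding flow in Firestore.
# Assumes: pip install google-cloud-firestore; collection and field names are illustrative.
from google.cloud import firestore

db = firestore.Client()


def record_onboarding_step(user_id: str, step: str) -> dict:
    """Marks an onboarding step (e.g. 'create_profile', 'connect_api') complete.

    Using merge=True lets the agent update one step at a time, so the user
    can resume the flow across sessions and developers can watch progress
    in real time.
    """
    doc_ref = db.collection("onboarding").document(user_id)
    doc_ref.set(
        {"steps": {step: {"done": True, "at": firestore.SERVER_TIMESTAMP}}},
        merge=True,
    )
    return doc_ref.get().to_dict()
```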
2. Working memory (conversational context and short-term state)
This layer manages the transient information required for an ongoing task or conversation.
It must provide extremely low-latency access to maintain a responsive user experience.
Memorystore
Overview: A fully managed, in-memory data store providing sub-millisecond latency. It is ideal for caching frequently accessed data and managing session state.
Startup use cases: Its primary role is high-speed caching to store the results of any computationally expensive or high-latency operation. Instead of repeatedly executing a costly task such as an LLM API call, a complex database query, or a call to a third-party service, the agent first checks Memorystore for a cached result. It drastically reduces both response latency and recurring operational costs, both critical to a startup’s agentic system.
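The caching pattern above can be sketched as follows. Memorystore for Redis is reached with a standard Redis client; the key scheme and TTL here are assumptions.

```python
# Sketch: check the cache before paying for an expensive LLM or API call.
# Assumes: pip install redis; REDIS_HOST points at your Memorystore instance.
import hashlib
import os

import redis

cache = redis.Redis(host=os.environ.get("REDIS_HOST", "localhost"), port=6379)


def cached_answer(prompt: str, generate_fn, ttl_seconds: int = 3600) -> str:
    """Returns a cached response for this prompt if one exists,
    otherwise calls generate_fn (e.g. an LLM call) and caches the result."""
    key = "llm:" + hashlib.sha256(prompt.encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return hit.decode()

    answer = generate_fn(prompt)             # the expensive operation
    cache.set(key, answer, ex=ttl_seconds)   # expire so stale answers age out
    return answer
```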
3. Transactional memory (state management and action auditing)
This layer is responsible for recording actions and state changes with strong consistency and integrity.
It serves as the system of record, often requiring ACID guarantees to ensure reliability.
Cloud SQL
Overview: A fully managed service for traditional relational databases that provides strong consistency for single-region transactional workloads. It is the standard choice for reliable state management.
Startup use cases: When an agent successfully executes a critical business action, such as processing a subscription payment or provisioning a new service for a user via an API call, it writes a record to a Cloud SQL database. This creates a permanent, ACID-compliant audit log, ensuring every important agent-driven action is reliably tracked and verifiable.

Cloud Spanner
Overview: A globally distributed, strongly consistent relational database offering horizontal scalability. It is designed for mission-critical applications that require high availability and transactional integrity across geographic regions.
Startup use cases: A startup would typically migrate from Cloud SQL to Spanner only after achieving product-market fit, where its user base becomes globally distributed. For example, a travel or ecommerce app that initially used Cloud SQL now needs to process bookings or orders from users in North America, Europe, and Asia simultaneously without data conflicts. Spanner’s global transactional consistency supports this scale.
Agent orchestration:
The executive function
Orchestration is the operational core that guides an agent
through a multi-step task. For any process that requires
more than a single action, it determines which tools are
needed, in what sequence, and how their outputs should
be combined to achieve a final goal.
As the agent’s executive function, orchestration
is responsible for planning and decision-making.
And, by automating complex business processes,
it creates powerful leverage for small startup teams.
Orchestration concepts
and cognitive architecture
A common and effective orchestration pattern is
ReAct (Reason + Action), a framework that synergizes the
reasoning and acting capabilities of large language models.⁴
ReAct establishes a dynamic, multi-turn loop where the
model generates both reasoning traces (thoughts) and
task-specific actions in an interleaved manner. This allows
for greater synergy—reasoning helps the model track and
update action plans, while actions gather information from
external tools to inform the reasoning process.
Here’s how it works:
1. Reason: The agent assesses the goal and the current state,
forming a hypothesis about the next best step and whether
a tool is required.
2. Act: The agent selects and invokes the appropriate tool.
3. Observe: The agent receives the output from the tool.
This new information is integrated into the agent’s context
and feeds into the next Reason step of the cycle.
4. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023).
ReAct: Synergizing Reasoning and Acting in Language Models.
Published as a conference paper at ICLR 2023.
Example: Processing a refund with
ReAct orchestration
• Reason: The user wants a refund. The first step
is to understand the company’s rules for refunds.
• Act: Use the semantic_search tool to query the internal
knowledge base with “refund policy.”
• Observe: The tool returns: “Full refunds are available
for all products within 30 days of the purchase date.”
• Reason: The policy requires the purchase date.
This information for the user’s specific order must
come from the CRM.
• Act: Call the get_order_details function from the
CRM tool with the user’s ID.
• Observe: The function returns an order object,
including purchase_date: ‘2025-07-20’.
• Reason: The current date is July 29, 2025. The purchase
was 9 days ago, which is within the 30-day window.
The criteria are met, and the refund can be initiated.
• Act: Call the process_refund tool with the order
ID and refund amount.
• Observe: The tool returns status: ‘success’.
• Final answer: “Your refund has been processed
successfully. You should see the amount credited
to your account within 3-5 business days.”
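Schematically, the loop this example walks through looks like the sketch below. The call_llm helper and the stubbed tools are hypothetical stand-ins for your model SDK and real systems; frameworks like ADK implement this loop for you.

```python
# A schematic ReAct loop (Reason -> Act -> Observe), not a production implementation.
# call_llm and the tools below are placeholders for your model SDK and real systems.
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {
    "semantic_search": lambda query: "Full refunds are available within 30 days.",
    "get_order_details": lambda user_id: "purchase_date: 2025-07-20",
    "process_refund": lambda order_id: "status: success",
}


def call_llm(context: str) -> dict:
    """Placeholder: ask the model for the next thought and action.

    A real implementation would prompt Gemini and parse its response into
    {'thought': ..., 'action': ..., 'args': ...} or {'final_answer': ...}.
    """
    raise NotImplementedError


def react_loop(goal: str, max_steps: int = 8) -> str:
    context = f"Goal: {goal}"
    for _ in range(max_steps):
        step = call_llm(context)                      # Reason
        if "final_answer" in step:
            return step["final_answer"]
        tool = TOOLS[step["action"]]
        observation = tool(**step.get("args", {}))    # Act
        context += f"\nThought: {step['thought']}\nObservation: {observation}"  # Observe
    return "Stopped: step limit reached."
```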
Use cases
• Automated customer onboarding: An agent could
be orchestrated to guide a new user through setup.
It might first use a tool to create a new account via an
API, then use a send_email tool to deliver a welcome
message, and finally use a tool to check the database
to confirm the user has completed their first action,
triggering a follow-up prompt if they haven’t.
• Proactive system monitoring and remediation:
An orchestration could be triggered by a monitoring
alert. First, the agent gets more context by using
a tool to query logs from Cloud Logging. Based on
the logs, it might then decide to use a kubectl tool
to restart a specific pod in GKE, and finally use
a slack_notification tool to report the action to the
on-call channel.
• Complex lead qualification: A sales agent could
be orchestrated to enrich a new lead’s email with
company data from an API. It would use a tool
to search the internal CRM to see if the lead is an
existing customer. It would then use the collected
information to decide whether to assign the lead
to a senior sales representative or add them to
a nurturing sequence.
Mastering orchestration is the key to moving beyond
simple, single-shot agents. When you get it right, you create
sophisticated, autonomous systems that can tackle problems
that, previously, were not technically feasible. It unlocks
a new class of applications and user experiences.
Runtime: Deploying agents
at scale
Deploying a functional agent prototype into a production
environment requires a robust runtime infrastructure.
The runtime facilitates agent deployment at scale, turning
a prototype into a reliable product that handles complex
operational requirements like security, load balancing,
and error handling, especially during periods of unpredictable
user growth.
Runtime concepts and architecture
A production-grade runtime environment for AI agents must
provide several core capabilities:
• Scalability: The infrastructure must automatically scale
to handle variable loads, from zero to millions of requests.
This includes both request-based load balancing and
resource-based autoscaling to manage computational
demands efficiently.
• Security: The runtime must provide a secure execution
environment, managing identity, network access controls,
and secure communication channels (e.g., TLS) to protect
the agent and the data it accesses.
• Reliability and observability: The system must include
mechanisms for error handling, automatic retries,
and comprehensive monitoring. This involves logging
agent actions and tool outputs, and collecting metrics
on performance and resource utilization to diagnose
and resolve issues.
Use cases
Your choice of runtime directly impacts operational
overheads and your ability to scale.
• Vertex AI Agent Engine: A seed-stage startup with
a small engineering team deploys their first customer
support agent, going from a working prototype
to a scalable, secure production endpoint in days
instead of weeks.
• Cloud Run: A startup experiencing rapid but
unpredictable growth for their new AI-powered
feature deploys their ADK agent on this serverless
architecture, so they only pay for compute
when the agent is actively processing requests.
It’s a cost-effective way to handle traffic spikes
without over-provisioning infrastructure.
• Google Kubernetes Engine (GKE): A Series B
startup with an established platform engineering
team and dozens of microservices decides to host
their new internal automation agent on their existing
GKE cluster. This way, they can use established
CI/CD processes, security policies, and monitoring
dashboards, ensuring the agent adheres to the
same operational standards as the rest of their
production services.
1.3 The role of grounding in agentic systems
An agent’s credibility and usefulness depend on its ability
to provide accurate, trustworthy answers based on verifiable
facts, a process known as grounding. This section explores the
evolution of grounding techniques, providing a roadmap for
building increasingly sophisticated and reliable agents.
We begin with the foundational pattern of RAG, which grounds
an agent by retrieving text based on semantic similarity.
We then explore GraphRAG, which enriches grounding by
understanding the explicit relationships between data points
in a knowledge graph. Finally, we cover Agentic RAG, where
the agent is no longer a passive recipient of information but
an active, reasoning participant in the retrieval process itself,
capable of executing multi-step strategies to find the best
possible answer.
RAG: A foundational first step
The first step on the path to sophisticated grounding is the
architectural pattern of RAG. This approach enhances an LLM’s
responses by retrieving relevant information from an external
knowledge base before generating an answer. Instead of
relying solely on its pre-trained knowledge, the agent performs
a semantic search to find verifiable data, which is then passed
to the LLM as context. This ensures a baseline of grounded,
verifiable answers.
While foundational, this simple retrieve-then-generate process
treats knowledge as a flat collection of disconnected facts. It is
a powerful technique for direct question-answering, but it falls
short on complex queries that require a deeper understanding
of the relationships between data points.
Benefits of RAG for agentic systems
• Agents can access the latest information: The retrieved
info is more current than their last training date, enabling
more timely and relevant agentic behavior.
• Agents are more accurate: RAG significantly reduces the
risk of outputs that could lead to incorrect or inappropriate
agentic actions.
• Faster responses: Vector embeddings and specialized
databases enable lightning-fast semantic searches of
massive datasets, so agents can deliver more responsive,
timely decisions.
• More comprehensive agent awareness: The RAG
workflow, which consists of ingesting, parsing, chunking,
embedding, storing, and retrieving, can be applied to
text, images, and other types of data. With this deeper
understanding, agents can perform more complex,
multistep reasoning tasks.
Google Cloud’s managed, out-of-the-box RAG solution is
called Vertex AI Search. It simplifies the process of integrating
data sources, and can also use open source or third-party
tools. Vertex AI RAG Engine provides a data framework
for developing context-augmented LLM applications that
deliver accurate, controlled responses aligned with specific
knowledge and policies. This is ideal for critical startup
applications like customer support, internal knowledge
management, and compliance-related tasks.
[Figure: Vertex AI RAG Engine at work]

Tool
Use Vertex AI Search and Vertex AI RAG Engine
to ground responses using your proprietary content.
Vector databases:
Search by meaning
The ability to search by meaning, not just keywords,
is made possible by vector embeddings. These numerical
representations capture the conceptual essence of data
(like text and images), allowing a system to find relevant
information no matter how a question is phrased.
Vector databases are the infrastructure that makes this
possible at scale. They are highly specialized systems
designed to store, index, and query millions of these
embeddings with the extremely low latency required
for a responsive agentic system.
Here’s how it works:
1. Data is transformed into vector embeddings: The ML model places semantically similar items close together in a multidimensional vector space.
2. Storage and indexing: The vector database stores these embeddings and builds specialized indexes to enable fast and efficient similarity searches.
3. Querying: The user’s query is converted into an embedding using the same model. The database then finds the embeddings in its index that are closest to the query embedding, effectively retrieving the most semantically relevant information to ground the model’s response.

Pro tip
Use the check grounding API to verify whether the AI’s answers are based on grounded, up-to-date info.

Use case: Enhancing customer support
A shoe company uses a vector database with semantic search to power a customer support chatbot:
• Product descriptions, warranty information, and FAQs are all converted into embeddings and stored.
• The vector database understands that “good for people with wide feet” is semantically related to concepts like “wide fit,” “extra wide,” or “comfortable for wide feet.”
• It retrieves relevant product recommendations and provides a much better customer experience.
Compare this to if the shoe company used a traditional database. A query using LIKE ‘%good for people with wide feet%’ would fail to return any results because that exact phrase does not exist in the database.
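Conceptually, searching by meaning comes down to comparing embedding vectors. The sketch below embeds a few documents and a query with the Gemini Embedding model and ranks them by cosine similarity; the model name and the in-memory comparison are stand-ins for a production setup, where Vertex AI Vector Search would store and index the embeddings.

```python
# Sketch: embed documents and a query, then rank by cosine similarity.
# Assumes: pip install google-genai numpy; model name per current docs, adjust as needed.
import numpy as np
from google import genai

client = genai.Client()

docs = [
    "Our WideStride line offers an extra-wide fit for maximum comfort.",
    "All shoes ship with a two-year manufacturer warranty.",
    "Returns are accepted within 30 days of purchase.",
]

# 1. Transform data into vector embeddings.
doc_vecs = np.array(
    [e.values for e in client.models.embed_content(
        model="gemini-embedding-001", contents=docs
    ).embeddings]
)
# 2. In production, store and index these in Vertex AI Vector Search
#    instead of keeping them in an in-memory array.

# 3. Querying: embed the question with the same model and find the closest vector.
query = "good for people with wide feet"
q_vec = np.array(
    client.models.embed_content(model="gemini-embedding-001", contents=query)
    .embeddings[0].values
)

sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
print(docs[int(np.argmax(sims))])  # expected: the wide-fit product description
```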
GraphRAG: Smarter grounding

GraphRAG builds a knowledge graph, so instead of just matching similar phrases, your agent understands how concepts relate.

Use case
A medical AI assistant that needs to know how symptoms, causes, and treatments relate to each other, and not just retrieve related snippets.
[Architecture diagram: The hierarchy of knowledge in GraphRAG. A data ingestion subsystem (data files landing in a Cloud Storage bucket, a Pub/Sub file queue, Cloud Run file processors and jobs, and the Vertex AI Embedding and Gemini APIs) produces embeddings and a knowledge graph stored in Cloud Spanner. A serving subsystem (an Agent Development Kit agent on Vertex AI Agent Engine, also using the Embedding and Gemini APIs) answers application users' queries, with Monitoring, Logging, and IAM spanning both subsystems.]
Agentic RAG: Dynamic
reasoning and retrieval
The most powerful approach to grounding is Agentic RAG,
a technique that helps you transform the agent from a passive
recipient of retrieved data into an active, reasoning participant
in the search for knowledge. Following frameworks like ReAct,
the agent can analyze a complex query, formulate a multi-step
plan, and execute multiple tool calls in sequence to find the
best possible information.
A prime example of this agentic pattern is grounding with
Google Search. You can use the Gemini 2.5 family of models
to integrate advanced reasoning, allowing them to interleave
search capabilities with internal thought processes to answer
complex, multi-hop queries and execute long-horizon
tasks. The agent can help you handle the entire workflow
automatically: it analyzes the prompt, formulates and executes
precise search queries, and synthesizes a final, grounded
response with sources.
An agent built with the Gemini family of models moves beyond simple content recognition to active, multi-step problem-solving. For example, an agent could:
• Analyze a photo to identify a specific species of plant and then autonomously retrieve its detailed care instructions.
• Process an audio stream from a support call to not only transcribe the words but also determine the customer’s sentiment, such as frustration, to escalate the ticket properly.

This ability to perceive and reason across different data types transforms the agent from a data processor into a problem-solving tool that understands and interacts with the world in a more complete way.

“The conventional wisdom was that foundation model performance would improve exponentially, but we are reaching the inflection point where that climb is plateauing and real differentiation lies in specialization and context engineering. Agentic RAG forms a central pillar of the context layer, allowing AI agents to iteratively find, retrieve, and reason over ground truth data before generating a final answer.
The future is multi-LLM: different models for different tasks, connected by a model- and data-agnostic context layer that unlocks their full potential.”
Douwe Kiela
CEO of Contextual AI
Example: Real-time inventory check
Define a function called check_inventory that takes
a product_ID and returns the current stock levels from
your real-time inventory system. Similarly, another
function, check_warranty_status, could take a product_ID
and return its warranty information directly from your
warranty management system.
Then, when a customer asks about a specific product’s
availability, the AI agent:
1. Identifies the product: It uses semantic search
(powered by the vector database) to accurately
identify the specific shoe model the customer
is asking about, even if they describe it vaguely.
2. Triggers the action: It recognizes the need for real-time stock information and uses function calling to invoke your check_inventory function.
3. Provides real-time response: The check_inventory function executes, fetches the live stock data from your inventory system, and returns it to the AI agent. The agent then provides the customer with an immediate, accurate response about availability.

This combination of retrieval (knowing what information is relevant) and actions (performing real-time operations) makes your AI agents smarter, faster, and much more useful.

Pro tip
Use Vertex AI and Google Cloud’s vector search to add this to your agent.

[Flow: “Are the new SolarFlare runners in stock?” → the agent uses the product_lookup function (vector database) and the check_inventory function (inventory system) → “Yes! The SolarFlare runners are in stock. We have 82 pairs available right now.”]
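One way to wire up the check_inventory step is to pass the function directly to the Gemini API’s Python SDK and let automatic function calling invoke it; the inventory lookup below is stubbed for illustration.

```python
# Sketch: letting Gemini call check_inventory via automatic function calling.
# Assumes: pip install google-genai; the inventory lookup is stubbed for illustration.
from google import genai
from google.genai import types


def check_inventory(product_id: str) -> dict:
    """Returns current stock levels for a product from the inventory system."""
    # Placeholder: a real implementation would query your inventory API.
    return {"product_id": product_id, "in_stock": True, "quantity": 82}


client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Are the new SolarFlare runners (product_id 'sf-runner-01') in stock?",
    config=types.GenerateContentConfig(
        tools=[check_inventory],  # the SDK builds the function declaration for you
    ),
)
print(response.text)  # e.g. "Yes, 82 pairs are available right now."
```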
The Agentic RAG workflow on Google Cloud
Google Cloud provides managed services that handle the
entire Agentic RAG workflow:
• Generating embeddings and indexing: The first step is
to convert your data into vector embeddings. An option
is the Gemini Embedding model, which is available in
both Vertex AI and the Gemini API and supports over 100
languages. This and other models are part of the broader
Vertex AI Embedding APIs suite.
• Storing and indexing: The vector embeddings are then
stored and indexed in Vertex AI Vector Search. This is a
fully managed, high-performance vector database that
automatically builds the specialized indexes required for
fast and efficient similarity searches at scale.
• Retrieval and reasoning: When a user submits a query, it is converted into an embedding and used by Vertex AI Vector Search to find the most relevant information. This targeted context is then passed to the LLM to generate the final, grounded response.

Pro tip
Use the retrieve and re-rank approach.
Address the trade-off between recall (finding all relevant documents) and precision (ensuring every retrieved document is relevant) using the two-step “retrieve and re-rank” approach (shown in this agentic_rag sample). First, it widens the recall aperture by configuring Vertex AI Vector Search to retrieve a larger-than-needed set of candidate documents. Second, this larger set is passed to the LLM or a specialized re-ranking service, which identifies the most relevant documents and discards any that are irrelevant or semantically opposite.
From retrieval to reasoning: A strategic advantage

Agentic RAG represents a fundamental leap, moving beyond simple information retrieval to genuine problem-solving. By empowering the agent to be an active, reasoning participant, developers can build systems capable of executing the complex, multi-step queries and long-horizon tasks that define next-generation agentic capabilities.

For a startup, mastering this approach is the key to building a defensible, truly intelligent product that can unlock novel workflows and user experiences.

Other grounding methods

While RAG is a foundational technique, Vertex AI offers other ways to ensure your agents deliver accurate, reliable responses. For example:
• Grounding with Google Search: Connect your model to world knowledge and a wide range of possible topics.
• Grounding with Google Maps: Use Google Maps data with your model to provide more accurate and context-aware responses to your prompts.
• Grounding Gemini to your data: Use RAG to connect your model to your website data or your sets of documents.
• Grounding Gemini with Elasticsearch: Use RAG with your existing Elasticsearch indexes and Gemini.
Example: Grounding with Google Search
When you use the Google Search tool, the model handles the entire workflow of searching,
processing, and citing information automatically.
[Flow: The user asks Gemini, “Who won the Euro 2024?” The model decides whether to search and generates queries (“UEFA Euro 2024 winner”, “who won euro 2024”); the Google Search tool searches the web and returns search results, snippets, and metadata; the model synthesizes an answer using the results: “Spain won Euro 2024, defeating England 2-1 in the final.”]
1. User prompt: The application sends the prompt
to the Gemini API with the Google Search tool.
2. Prompt analysis: The model analyzes the prompt and
determines if a Google Search can improve the answer.
3. Google Search: If needed, the model automatically
generates one or multiple search queries and
executes them.
4. Search results processing: The model processes
the search results, synthesizes the information,
and formulates a response.
5. Grounded response: The API returns a final,
user-friendly response that is grounded in the search
results. This response includes the model’s text answer
and grounding metadata containing the search queries,
web results, and citations.
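A minimal sketch of this flow with the Google Gen AI SDK follows: the model decides when to search, and the grounding metadata described in step 5 is attached to the response.

```python
# Sketch: grounding a Gemini response with the built-in Google Search tool.
# Assumes: pip install google-genai, with an API key or Vertex AI project configured.
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Who won Euro 2024?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)

print(response.text)
# Grounding metadata (search queries, web results, citations) is attached
# to the candidate, e.g. response.candidates[0].grounding_metadata.
```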
Key takeaways: Choosing your AI agent’s components

• Choose the agent’s core intelligence: Select a model (e.g., Gemini) based on your use case and fine-tune it with your specific data.
• Make your agent trustworthy and factual: Use grounding techniques like RAG with a vector database so it checks facts, not just guesses.
• Manage a complex task with multiple steps: Use orchestration to create a plan that determines which tool to use, in what order, and how to combine the results.
• Connect to public, real-time data and services: Use pre-built extensions to easily plug into third-party APIs.
• Connect securely to your own internal tools: Write custom functions to give the agent controlled access to your private databases, CRMs, or other internal systems.
• Deploy a reliable and safe product at scale: Use a managed runtime for scalable infrastructure, plus built-in evaluation and safety tools to monitor performance and block harmful content.
Ready to turn your
AI vision into reality?
We’re here to help.
Learn how to build more generative AI applications
with on-demand Startup School sessions.
Start now
Get up to $350k USD in Google Cloud credit
with the Google for Startups Cloud Program.
Apply now
Contact our Startups team.
Get in touch
Stay connected and get all our latest updates by
subscribing to the Google Cloud Startup Newsletter.
Subscribe
Section 2: How to build AI agents
The previous section defined the fundamental
components of a modern agentic system: a core
reasoning model, a set of tools to enable action,
data architecture options to persist short- and
long-term agent memory, a grounding mechanism
to ensure factual accuracy, and deployment options.
With this conceptual framework established, we can now
shift our focus to how you build an agent. This section
provides an opinionated, practical guide to the architectural
decisions involved in building a production-ready agent.
Open and flexible, the Google Cloud ecosystem recognizes
the industry offers many excellent frameworks. Here,
however, we intentionally focus on ADK, a complete, robust
implementation that fits perfectly in the Google Cloud
ecosystem. We also make specific recommendations built
upon open industry standards like the Model Context
Protocol (MCP) and the Agent2Agent (A2A) protocol.
Prefer audio? Listen to the podcast version
of this section, created with NotebookLM.
Pro tip
Before building your agent, explore 601 real-world agent
use cases from the world’s leading organizations.
This podcast was created using NotebookLM with the
prompt: “As a podcast host, create a practical podcast,
targeting developers and technical founders. The podcast
should introduce Agent Development Kit (ADK) as a solution
to common agent-building challenges. It needs to detail
ADK’s core benefits and explain its primary agent types,
such as the intelligent LlmAgent and structured Workflow
Agents (e.g., Sequential, Parallel, and Loop).
“The podcast must then cover the broader ecosystem,
including the Model Context Protocol (MCP), Vertex AI
Agent Engine for deployment, and the Agent2Agent (A2A)
protocol for communication. Briefly mention alternative tools
like Google Agentspace, Firebase Studio, and Gemini CLI,
and conclude with a summary and a call to action directing
listeners to Google’s startup resources.”
2.1 A complete toolkit for building AI agents
When it comes to building a custom AI agent for your
startup, founders face a critical trade-off: development
velocity versus flexibility.
At one end, you have easy-to-use solutions like low-code
platforms or pre-built products. These are fast to implement
but give you less control, making them best for standard
problems. At the other end, you have highly flexible
frameworks or the option to build everything from scratch.
While these give you maximum ability to customize,
they require more significant development resources and
deep technical expertise. ADK sits in the middle of this
development landscape.
Explore ADK
Core components for building AI agents
• Agent Development Kit: Open-source, code-first toolkit for building, evaluating, and deploying AI agents.
• Model Context Protocol: Open protocol that standardizes how applications provide context to LLMs.
• Vertex AI Agent Engine: Managed platform to deploy, manage, and scale AI agents in production.
• Agent2Agent (A2A) protocol: Open standard designed to enable communication and collaboration between AI agents.
Develop with ADK
If you need more power than a simple low-code tool but
want an accelerated development process, ADK gives you
the control to build a unique system of collaborative agents
while simplifying complex technical challenges. For example,
specific protocols like MCP and A2A make it easier for agents
to extend their capabilities (as we’ll explore below).
What’s more, ADK complements the tools your team may
already use and integrates with the broader Google Cloud
ecosystem, so you’re not locked into a single approach.
This flexibility extends to how you run your agents once they’re
built, with options to deploy to a fully managed service like
Vertex AI Agent Engine or to a versatile serverless platform
like Cloud Run. It’s all about choosing the right foundation for
your specific operational needs.
“We are focused on building a flourishing
AI agent ecosystem. The open source
Agent Development Kit has had over a million
downloads in less than four months.”
Sundar Pichai
CEO of Google and Alphabet
Building complex workflows gets easier with ADK
[Diagram: 1. The user prompts the agent. 2. The model selects Tool B and formats the tool request body (function calling). 3. The agent framework calls Tool B. 4. The model interprets Tool B’s output and generates the agent’s response to the user.]
What you can do with ADK

1. Build complex, collaborative AI systems
ADK is multi-agent by design. It’s easy to build highly specialized AI solutions that automate complex, multi-step workflows. And, with flexible orchestration (sequential, parallel, or dynamic), you can start with simple automations and evolve to highly adaptive systems. For example, you can build an intelligent project management system with a “Task Breakdown Agent” that delegates sub-tasks to specialized “Code Generation Agents,” “Design Agents,” and “Documentation Agents.”

2. Integrate AI into existing tools, agents, and workflows
ADK is built around a rich tool ecosystem that allows your agents to interact with all your existing tools and data. You can connect your agents to productivity tools you already use, like Notion, Slack, or a CRM, as well as tool frameworks like LangChain and LlamaIndex, or agent frameworks like LangGraph or CrewAI. Tools can be shared with MCP and the agents you create can be shared with A2A. This way, you can inject AI intelligence into every facet of your operations, enhancing the tools and systems you already have in place without requiring a disruptive architectural overhaul.
3. Ensure quality and reliability from day one
To earn users’ trust, it’s critical to carefully test and evaluate your agents before deploying them to production. ADK’s built-in observability and evaluation tools help you:
• Rapidly iterate: Before deploying, you can systematically test how your agents respond to various scenarios and how well they execute complex tasks.
• Debug agent behavior: Inspect the full execution trace of your agent, including its reasoning (thoughts), tool calls, and observations, to understand its decision-making process and debug complex, multi-step workflows.
• Benchmark your agents: Evaluate different agent designs or model updates against predefined metrics, using a data-driven approach to continuously improve agent performance.
This moves your process beyond simple “vibe-testing,” allowing you to launch professional-grade agents quickly, build user trust through proven reliability, and iterate with data-driven confidence.

4. Scale AI with confidence
As you grow, your AI solutions need to scale seamlessly without becoming a bottleneck. ADK accelerates the path to production by using AgentOps (described in detail below) to bridge the gap between local development and deployment. The framework exposes agents as standard web services using FastAPI, which can then be containerized. This frees your developers from having to build custom deployment infrastructure. They can deploy anywhere, from local testing to fully managed, auto-scaling runtimes like Vertex AI Agent Engine or Cloud Run.
ADK core: Agent architectures
A foundational step in building with ADK is selecting the right
agent architecture. Distinct agent classes are designed for
different execution patterns, and your choice will determine
how your agent reasons and operates. It’s typically a trade-off
between the flexible, non-deterministic power of an LLM and
the predictable, deterministic control of hard-coded logic.
Understanding the interplay between these agent classes
is key to building robust and effective AI systems.
ADK’s agent types are organized into three categories, all extending BaseAgent:
1. LLM-based: LlmAgent (reasoning, tools, transfer)
2. Workflow agents: SequentialAgent, ParallelAgent, LoopAgent
3. Custom logic: CustomAgent
1. LLM agent (LlmAgent)
Core engine: LLM
Determinism: Non-deterministic (flexible)
This is the most common agent type and is commonly referred to simply as “Agent.” It uses an LLM like Gemini for complex reasoning, dynamic decision-making, and natural language understanding, and it forms the core of most conversational and problem-solving agents.
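To make this concrete, here is a minimal sketch of an LlmAgent definition. The weather tool and its behavior are hypothetical, and ADK APIs evolve quickly, so treat the exact class names and parameters as illustrative and check the current ADK documentation.

from google.adk.agents import LlmAgent

def get_weather(city: str) -> dict:
    """Returns the current weather report for a given city."""
    # Stub for illustration; a real tool would call a weather API.
    return {"status": "success", "report": f"Sunny and 22°C in {city}"}

weather_agent = LlmAgent(
    name="weather_agent",
    model="gemini-2.5-flash",
    description="Answers questions about the current weather in a city.",
    instruction="Use the get_weather tool to answer weather questions.",
    tools=[get_weather],   # plain Python functions are wrapped as tools
)

The LLM reads the agent’s instruction plus the tool’s name, signature, and docstring to decide when to call it, which is why every one of those strings matters.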
2. Workflow agent (SequentialAgent, ParallelAgent, LoopAgent)
Core engine: Predefined logic
Determinism: Deterministic (predictable)
These orchestrators deterministically control how other agents execute in predefined
patterns. They’re used for structured processes.
Sequential agents (SequentialAgent)
Executes sub-agents in a fixed order, passing the output of one as the input to the next.
Agent A completes Task 1, then passes its output to Agent B for Task 2.
Diagram: SequentialAgent: Input → sub_agents_1 → sub_agents_2 → Output (arrows indicate execution order, not state sharing or communication).
Example: You want to build an agent that can summarize any webpage, using two tools:
Get Page Contents and Summarize Page. Because you can’t summarize from nothing,
the agent must always call Get Page Contents before calling Summarize Page.
Get the full code
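As an illustration only, here is a sketch of how that two-step pipeline might be wired up with a SequentialAgent. The tool is a stub, the output_key/state-templating details may differ between ADK versions, and the full working version is in the linked sample.

from google.adk.agents import LlmAgent, SequentialAgent

def get_page_contents(url: str) -> dict:
    """Fetches the raw text contents of a webpage."""
    # Stub for illustration; a real tool would fetch and parse the page.
    return {"status": "success", "page_text": f"(contents of {url})"}

fetch_agent = LlmAgent(
    name="page_fetcher",
    model="gemini-2.5-flash",
    instruction="Call get_page_contents on the URL the user provides and "
                "return only the raw page text.",
    tools=[get_page_contents],
    output_key="page_text",   # saves this agent's final output into session state
)

summarize_agent = LlmAgent(
    name="page_summarizer",
    model="gemini-2.5-flash",
    instruction="Write a concise summary of the following page text: {page_text}",
)

webpage_summarizer = SequentialAgent(
    name="webpage_summarizer",
    sub_agents=[fetch_agent, summarize_agent],   # always runs in this fixed order
)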
Parallel agents (ParallelAgent)
Executes multiple sub-agents simultaneously; this is used for performance optimization when tasks are independent. Agent A and Agent B work on independent sub-tasks at the same time, and their results are combined.
Diagram: ParallelAgent: Input fans out to sub_agents_1, sub_agents_2, and sub_agents_3, which produce Output_1, Output_2, and Output_3 concurrently.
Example: For operations like multi-source data retrieval or heavy computations,
parallelization yields substantial performance gains. Importantly, this strategy assumes no
inherent need for shared state or direct information exchange between the concurrently
executing agents.
Get the full code
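A minimal sketch of this fan-out pattern follows. The two research agents are hypothetical and share no state, which is exactly the condition under which ParallelAgent is safe to use; see the linked sample for a complete implementation.

from google.adk.agents import LlmAgent, ParallelAgent

# Hypothetical research agents that each query an independent source.
news_agent = LlmAgent(
    name="news_fetcher",
    model="gemini-2.5-flash",
    instruction="Gather recent news coverage on the user's topic.",
    output_key="news",
)
papers_agent = LlmAgent(
    name="papers_fetcher",
    model="gemini-2.5-flash",
    instruction="Find recent research papers on the user's topic.",
    output_key="papers",
)

# Both sub-agents run concurrently; neither depends on the other's output.
research_fanout = ParallelAgent(
    name="research_fanout",
    sub_agents=[news_agent, papers_agent],
)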
Loop agents (LoopAgent)
Executes its sub-agents in a loop (iteratively). It repeatedly runs
a sequence of agents for a specified number of iterations or until a termination condition
is met. Use the LoopAgent when your workflow involves repetition or iterative refinement,
such as revising code.
Diagram: LoopAgent: Input enters a loop over its sub-agents (e.g., max_iterations=2); when the exit condition is met or the iteration limit is reached, the loop ends and the Output is returned.
Example: You want to build an agent that can generate images containing specific amounts of food (e.g., five bananas), using two tools: Generate Image and Count Food Items. Because you want to keep generating images until either the specified number of items is produced or a maximum number of iterations is reached, you should build your agent using a LoopAgent.
Get the full code
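Here is an illustrative sketch of that loop. Both tools are stubs, and in practice the checker would also need to signal an explicit exit condition (for example via escalation) so the loop can stop before max_iterations; the mechanism varies by ADK version, so follow the linked sample and docs.

from google.adk.agents import LlmAgent, LoopAgent

def generate_image(prompt: str) -> dict:
    """Generates an image from a text prompt and returns a reference to it."""
    # Stub; a real tool would call an image-generation model.
    return {"status": "success", "image_id": "img-001"}

def count_food_items(image_id: str) -> dict:
    """Counts the food items visible in a previously generated image."""
    # Stub; a real tool would run an image-understanding model.
    return {"status": "success", "count": 4}

image_agent = LlmAgent(
    name="image_generator",
    model="gemini-2.5-flash",
    instruction="Generate an image containing exactly the requested number of "
                "food items, taking into account feedback from the last count.",
    tools=[generate_image],
)

checker_agent = LlmAgent(
    name="food_counter",
    model="gemini-2.5-flash",
    instruction="Count the food items in the latest image and report whether "
                "the count matches the user's request.",
    tools=[count_food_items],
)

food_image_loop = LoopAgent(
    name="food_image_loop",
    sub_agents=[image_agent, checker_agent],
    max_iterations=5,   # hard stop if the exit condition is never signalled
)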
3. Custom agent (BaseAgent subclass)
Core engine: Custom Python code
Determinism: Can be either, based on implementation
For unique requirements and tailored workflows that go beyond a standard reasoning loop, you can create a custom agent by inheriting directly from BaseAgent and writing custom Python logic to control its behavior. This approach is necessary when an agent’s actions are not determined by an LLM, but by specific, hard-coded rules.
How it’s done: A developer creates a new Python class that inherits from the BaseAgent
class. They must then implement the _run_async_impl method, which contains the
unique logic the agent will execute. This method has full access to the session’s state
and can yield events to communicate with other agents or terminate a workflow.
See Gemini Fullstack Agent Development Kit (ADK) Quickstart for an example of how
to implement a Custom agent.
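For orientation, the shape of such a class looks roughly like the sketch below. The rule it applies is hypothetical, and import paths and the Event constructor vary between ADK releases, so use the quickstart above as the authoritative reference.

from typing import AsyncGenerator

from google.adk.agents import BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event

class ThresholdGateAgent(BaseAgent):
    """Hypothetical custom agent driven by deterministic logic, not an LLM."""

    async def _run_async_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        # Read a value that an upstream agent wrote to the shared session state.
        score = ctx.session.state.get("quality_score", 0)

        # Hard-coded rule: record the approval decision back into state.
        ctx.session.state["approved"] = score >= 0.8

        # Emit an event so the runtime (and other agents) can observe the step.
        yield Event(author=self.name)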
ADK orchestration:
Implementing the ReAct loop
As we discussed in section 1, the ReAct paradigm is a
foundational pattern for agentic systems. ADK provides the
core abstractions and classes needed to implement this
dynamic, cyclical process in a structured way. Its LlmAgent
is specifically designed to execute this loop and handle the
transitions between the fundamental stages in the loop:
• Reasoning (thought): The LlmAgent class manages this
stage internally. It takes the user’s prompt and its current
internal state, then calls the underlying language model
to form a hypothesis and determine the next best action.
• Action (tool use and agent delegation): ADK enables this
stage through its flexible tool system. When the LlmAgent
decides to act, it can invoke a simple Python function or,
for more complex tasks, delegate the work to another
specialized sub-agent using the Agent-as-a-Tool pattern.
• Observation: ADK automatically captures the dictionary
returned by the tool or sub-agent and passes it back to the
LlmAgent. This output becomes the new information that
the agent integrates into its context, feeding it into the next
Reasoning step of the cycle.
By providing a native implementation of this essential pattern, ADK abstracts away the boilerplate code, so you can quickly translate the powerful concept of a ReAct loop into a working, multi-step agent.
Pro tip
For a complete discussion, including code examples and advanced usage patterns, refer to the ADK documentation.

ADK tools: A framework for agentic action
In ADK, an agent can use tools to perform actions beyond the native capabilities of its core reasoning model. These defined capabilities enable an agent to execute code, interact with external systems, and act outside its own immediate execution context. A tool is a Python function (or a Java method) that can either implement self-contained logic or act as a wrapper for more complex operations, such as making calls to an API, using MCP to access a variety of external systems, or delegating a task to another specialized agent locally or remotely via A2A. This section outlines how to design effective tools and the taxonomy of available tool types.
Designing effective tools:
The API contract for the model
For a model to use a tool correctly, its definition must serve
as a clear and unambiguous API contract, composed of:
• Function signature: Use descriptive names for tools and
their parameters. Python type hints are mandatory, as they
provide the structural schema the model uses to generate
valid arguments.
• Docstring (the semantic core): This is the primary source
of semantic information for the model. A well-written
docstring must precisely define the tool’s purpose, usage
criteria, parameters, and expected return schema.
• Return schema: A tool must return a dictionary. While not
a strict syntactic requirement, it is a best practice to include
a status key (e.g., success or error) in this dictionary.
This structure is essential for the agent to reliably distinguish
between successful outcomes and failures in its
Observation step and reason about how to proceed.
• Stateful tools and ToolContext: For tools that need
to read or write to a persistent session state, an optional
tool_context: ToolContext parameter can be added
to the function signature. The agent automatically injects
this object, giving the tool access to a session-level
state dictionary.
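Here is a sketch of a tool that follows this contract. The order-lookup domain is hypothetical, and parameter and class names may differ slightly across ADK versions, so verify against the current docs.

from google.adk.tools import ToolContext

def get_order_status(order_id: str, tool_context: ToolContext) -> dict:
    """Looks up the shipping status of a customer order.

    Use this tool whenever the user asks where their order is.

    Args:
        order_id: The alphanumeric order identifier, e.g. "A-1042".

    Returns:
        dict with keys:
          status: "success" or "error".
          shipping_status: Human-readable status, present on success.
    """
    # Remember which orders were checked in this session (session-level state).
    checked = tool_context.state.get("checked_orders", [])
    tool_context.state["checked_orders"] = checked + [order_id]

    if not order_id:
        return {"status": "error", "error_message": "order_id is required"}

    # Stub; a real tool would query an order-management API here.
    return {"status": "success",
            "shipping_status": f"Order {order_id} is in transit"}

Note how the docstring, type hints, and status key together give the model everything it needs to call the tool correctly and to reason about the result in its Observation step.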
Pro tip
For best practices and examples of how to define
parameters and tool schemas, structure effective prompts,
and implement complex, multi-agent workflows, check out
the ADK Samples repository.
A taxonomy of ADK tools
ADK provides a flexible architecture for implementing
tools, ranging from simple functions to interoperable
multi-agent systems.
Toolsets: Packaging related capabilities
A primary pattern in ADK is the Toolset, a class that
bundles a collection of related tools into a single,
configurable object (e.g., BigQueryToolset, MCPToolset).
Custom function tools
This is the most direct method for extending an agent
with proprietary logic.
• FunctionTool: The standard wrapper for synchronous
Python functions.
• LongRunningFunctionTool: A specialized tool for
asynchronous tasks or human-in-the-loop workflows.
Hierarchical and remote tools
ADK enables the creation of complex systems by
composing agents.
• Agent-as-a-tool: A delegation pattern where a parent
agent uses another specialized agent. This allows the
parent to invoke another agent, receive a response,
and maintain control to handle future input. (This is
distinct from a sub-agent delegation model, where full
conversational control is passed to a sub-agent and
all subsequent input will be handled by the sub-agent.)
• RemoteA2aAgent: For communication between
agents in different processes, ADK provides the
RemoteA2aAgent class, which uses the Agent2Agent
(A2A) protocol to seamlessly integrate distributed
systems.
Pre-built and integrated tools
ADK includes a suite of tools and wrappers to accelerate
development.
• Built-in tools: Ready-to-use tools like Google Search
and Code Execution.
• Google Cloud toolsets: Rich integrations for services
like Vertex AI Search and BigQuery.
• Third-party interoperability: Wrappers like
LangchainTool and CrewaiTool allow for the direct
reuse of tools from popular open-source ecosystems.
Standardize with Model
Context Protocol
Model Context Protocol (MCP) is an emerging open standard
for connecting AI and LLMs with external data sources and
tools. You can plug your AI applications into various data
sources and tools without the hassle of building custom
point-to-point integrations for each one.
Pro tip
Use the open-source MCP Toolbox for Databases to
easily and securely connect your agents to a large array
of popular data sources.
With ADK, your agents can participate in this ecosystem
in two ways:
• Consume external tools: An ADK agent can act as
an MCP client, allowing it to use any tool exposed by
a third-party MCP server.
• Expose native tools: Developers can wrap their ADK
tools in an MCP server, making them securely available
to any other MCP-compliant agent or application.
Diagram: MCP is like a universal adapter for an agent’s data sources and tools. Through MCP, agents built with the Agent Development Kit can connect to data sources such as AlloyDB (and AlloyDB Omni), MySQL, Postgres, BigQuery, Bigtable, Cloud SQL, Spanner, and Dgraph by Hypermode.
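As a rough sketch, an ADK agent acting as an MCP client might attach an MCP toolset like this. The server launch command is illustrative, and the connection-parameter classes have changed across ADK and MCP releases, so take the exact imports and arguments from the current ADK documentation rather than from this snippet.

from mcp import StdioServerParameters
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset

files_agent = LlmAgent(
    name="files_assistant",
    model="gemini-2.5-flash",
    instruction="Answer questions using the file tools made available to you.",
    tools=[
        # Every tool exposed by this MCP server becomes available to the agent.
        MCPToolset(
            connection_params=StdioServerParameters(
                command="npx",   # illustrative: launches a local MCP server
                args=["-y", "@modelcontextprotocol/server-filesystem", "/data"],
            )
        )
    ],
)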
Manage your data with
Google Cloud services
As outlined in the previous section, long-term, working, and
transactional memory each play a distinct role in an agent’s
data architecture. Here, we explain how to build it. ADK
provides the necessary patterns and integrations to map this
conceptual architecture directly to specific, scalable Google
Cloud data services.
1. Long-term knowledge base
(grounding, context, and analytics)
This is the agent’s permanent memory, combining a
searchable knowledge library, a record of user interactions,
and a repository for analytics.
• Vertex AI Search: Serves as the agent’s queryable
knowledge library for unstructured information. In ADK,
a VertexAISearchToolset allows an agent to ground its
responses by retrieving relevant information from a specific
set of documents.
• Firestore: Functions as the agent’s persistent user memory.
In ADK, it’s used to store and retrieve conversational history
and the state of long-running tasks, enabling a continuous,
personalized experience that can be resumed across
sessions.
• Cloud Storage: Acts as the agent’s durable file system.
ADK uses it as the source of truth for raw documents
(e.g., PDFs, images) that are then indexed by services like
Vertex AI Search.
• BigQuery: Functions as the agent’s analytical database.
The BigQueryToolset in ADK enables agents to answer
questions by executing complex analytical queries against
large, structured datasets.
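As a rough sketch of how the knowledge-base piece attaches to an agent, the snippet below grounds responses in a Vertex AI Search data store. The data store ID is a placeholder you would create first, and tool class names and constructor arguments vary by ADK version, so treat this as illustrative and confirm against the docs.

from google.adk.agents import LlmAgent
from google.adk.tools import VertexAiSearchTool

# Placeholder data store path; create the data store in Vertex AI Search first.
kb_search = VertexAiSearchTool(
    data_store_id=(
        "projects/my-project/locations/global/"
        "collections/default_collection/dataStores/support-docs"
    )
)

support_agent = LlmAgent(
    name="grounded_support_agent",
    model="gemini-2.5-flash",
    instruction="Answer only from the retrieved documents and cite what you used.",
    tools=[kb_search],
)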
2. Working memory
(caching and session state)
This is the agent’s high-speed, transient memory for managing
the immediate context of a live conversation.
• Memorystore: Provides a high-speed cache for the agent.
In ADK, its primary role is to store the results of frequent
or expensive tool calls, drastically reducing latency and
operational costs.
3. Transactional memory
(auditing and reliable execution)
This is the agent’s durable ledger for recording critical actions
and state changes with high integrity.
• Cloud SQL: Serves as the agent’s reliable system of
record. ADK enables patterns where tools log their actions
to Cloud SQL, creating a permanent, ACID-compliant audit
trail for every important agent-driven action.
• Cloud Spanner: Acts as a globally consistent backend
for mission-critical agent actions. In an advanced ADK
implementation, a tool representing a core business
process (e.g., process_global_order) would trigger
a transaction in a Spanner-backed system to ensure
global integrity.
4. The next frontier:
Distilled conversational memory
As an agent’s interaction history with a user grows over weeks
or months, providing the entire raw context to the model for
every query becomes inefficient and cost-prohibitive. Plus,
models can get confused.
Memory distillation is the next frontier. It uses an LLM
to dynamically and continuously distill long conversation
histories into a compact, structured set of essential facts
and preferences. The resulting curated, long-term memory
is far more efficient to retrieve and use.
This is an active area of research, but early patterns are
emerging. An example is Vertex AI Memory Bank, a managed
service on Vertex AI Agent Engine. It provides mechanisms
to implement memory distillation:
• Automated distillation: It can asynchronously process
conversation histories to automatically extract
and generate a list of salient facts about the user
(GenerateMemories).
• Agent-directed distillation: For more control, an agent
can use memory-as-a-tool to decide what specific
information is important enough to be explicitly written
to the memory bank (CreateMemory).
Focusing on a distilled set of memories rather than raw
history is more scalable, efficient, and “human-like”; ideal
for the next generation of agentic systems.
Deploy to the managed runtime
with Vertex AI Agent Engine
ADK is deployment-agnostic by design. The core agent
logic you define in Python is decoupled from the serving
infrastructure so you can develop and test locally, then deploy
the same agent to various production environments.
Diagram: ADK deployment paths: develop the agent, build and package it as a container, then deploy to Vertex AI Agent Engine, Cloud Run, or self-managed custom infrastructure (Docker host, GKE, on-prem).
ADK agents are exposed for deployment as standard
web services using FastAPI. The adk api_server command
automatically wraps your agent in a production-ready API
server, which can then be containerized.
While this container could be deployed to various services
on Google Cloud, the three primary managed deployment
targets for ADK agents are:
• Cloud Run: A managed compute platform for running
your agent as a container-based application. This is an
excellent choice for integrating your agent into an existing
microservices architecture or for use cases requiring
custom container configurations.
• Vertex AI Agent Engine: A fully managed, auto-scaling
service on Google Cloud specifically designed for
deploying, managing, and scaling AI agents built with
frameworks like ADK. It provides deep integration with the
Vertex AI ecosystem for MLOps, monitoring, and security.
• Google Kubernetes Engine (GKE): This managed
Kubernetes service is the best choice if you have an
existing Kubernetes-based infrastructure or are making
a strategic day-one decision to prioritize long-term
portability, deep architectural control, and the open-
source Kubernetes ecosystem. It provides the most
granular control over networking, stateful workloads,
and specialized hardware like GPUs and TPUs, making
it ideal for teams with platform engineering expertise
or those building complex, multi-service applications
that will need to scale.
Note
It’s important to understand the relationship between
Google Cloud’s agent creation tools. Vertex AI Agent
Builder is the comprehensive suite of features for the
entire agent lifecycle, from discovery to deployment.
A core component of this suite is Vertex AI Agent Engine,
the managed service specifically designed to deploy,
manage, and scale your agents in production.
For this guide, when we discuss the production
runtime environment, we are referring to the Vertex
AI Agent Engine.
For startups using ADK, Vertex AI Agent Engine is the
recommended deployment target. It is specifically optimized
to be a cost-effective, auto-scaling solution, and it provides
the easiest and most direct path to a scalable, production-
ready agent. As a fully managed service, it abstracts the
underlying infrastructure, freeing your engineers to focus
on core agent logic rather than operational overhead.
As a service designed for agentic workloads, Vertex AI Agent
Engine provides several key benefits:
Core capabilities
• Automated scalability: Automatically handles scaling
to meet varying user loads.
• Security and authentication: Provides integrated identity
and access management.
• Framework agnostic: Supports agents built with various
frameworks, not just ADK.
• Agent lifecycle management: Provides APIs for creating,
reading, updating, and deleting your deployed agents.
Specialized agentic features
• Memory Bank: A managed service to dynamically
generate and retrieve long-term, personalized memories
based on users’ conversations.
• Example Store: Allows developers to provide and
manage few-shot examples to improve and steer agent
performance on specific tasks.
Diagram: System architecture for a Gemini-powered agent engine.
Collaborate with Agent2Agent
communication
Diagram: The A2A protocol’s rich partner ecosystem.
The true power of specialized agents is unlocked when
they can collaborate. To enable this, Google champions the
Agent2Agent (A2A) protocol, an open standard that ensures
the agents you build today can discover, communicate
with, and securely coordinate actions with other agents,
regardless of who built them or what framework they use.
This commitment to an open, interoperable ecosystem
is central to Google Cloud’s agent strategy.
Key concepts of the A2A protocol include:
• Agent card: A digital “business card” (typically a JSON file
at a well-known endpoint) that an agent uses to advertise
its capabilities, endpoint URL, and authentication
requirements, enabling discovery by other agents.
• Task-oriented architecture: Interactions are framed
as “tasks.” A client agent sends a task request to a server
agent, which processes it and returns a response.
An agent can act as both a client and a server.
• Modality agnostic: A2A supports text, audio, and video
communication, reflecting the evolving, multimodal
nature of agent interactions.
Diagram: How the A2A protocol works: the end user interacts with a client agent, which exchanges tasks with a remote agent.
ADK agents can natively participate in this ecosystem.
They expose a standard HTTP endpoint and an agent.json
file, allowing them to be discovered and to communicate
with any other A2A-compliant agent.
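To give a flavor of what an agent card advertises, here is an illustrative (not normative) example of the kind of metadata an agent.json might contain, expressed as a Python dictionary. Field names are simplified; consult the A2A protocol specification for the actual schema.

# Illustrative agent card contents (simplified; see the A2A spec for the schema).
AGENT_CARD = {
    "name": "software_bug_triage_agent",
    "description": "Analyzes new bug reports, categorizes priority, assigns teams.",
    "url": "https://agents.example.com/bug-triage",  # endpoint other agents call
    "capabilities": {"streaming": True},
    "skills": [
        {"id": "triage_bug", "description": "Triage a new software bug report."}
    ],
    "authentication": {"schemes": ["bearer"]},
}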
Pro tip
Explore these A2A resources to get started:
• A2A Project Github org
• A2A protocol docs
• A2A protocol specification
Customer story
How BioCorteX uses the A2A protocol
to accelerate drug discovery.
BioCorteX uses knowledge graphs and in-silico simulations to model the complex interactions between bacteria, drugs, and the host, uncovering hidden interactions and de-risking the drug development process for its pharma partners.
Situation
In life sciences, connecting and transforming disparate
datasets into commercially relevant knowledge is
slow and uncertain. Hypothesis testing can take years,
hampered by poor models, conflicting theories, and
the sheer complexity of biology.
“At BioCorteX, our Carbon Graph agents
are different. Unlike other agents, they
deal in facts not opinions or associations.
By traversing the world’s largest
mechanistic biology-based knowledge
graph [rather than using an LLM],
Carbon Graph agents do not suggest
hypotheses, they test their plausibility,
clinical relevance, and commercial
aspects, allowing full alignment across
R&D, Regulatory, and Commercial teams
within the organization.”
Solution
BioCorteX built a multi-agent system on GCP that
interrogates hypotheses from three dimensions:
biological plausibility, real-world clinical relevance,
and commercial importance. It uses Gemini-powered
agents, the ADK, and GraphRAG to navigate its
44 billion-connection knowledge graph of global
samples—all orchestrated via A2A.
Impact
What once took years now takes days. BioCorteX graph
agents deliver fully transparent scenario planning to key
decision makers across their portfolio, underpinning
high-level commercial considerations with deep scientific
knowledge—accelerating testing of new mechanisms and
therapeutic areas while cutting waste across the pipeline.
2.2 A step-by-step guide:
Defining an LLM agent
Building an AI agent is an iterative process of definition,
testing, and deployment. This section focuses on the first
and most critical phase: defining the agent’s core identity,
instructions, and capabilities.
While the open source ADK Samples repository provides
a comprehensive library of ready-to-use agents, its purpose
is to show you what a finished agent looks like. This section,
in contrast, is designed to teach you how to think about
building an agent. It explains the architectural principles and
the strategic “why” behind each core component, giving you
the foundational knowledge needed to effectively use the
samples to build your own custom solutions.
To make this practical, let’s walk through the process of
building a Software Bug Assistant, an LlmAgent agent designed
to help a support team triage new bug reports.
1. Define the agent’s identity
First, you establish what the agent is and what it’s for.
This is done with three key parameters:
• name (required): A unique string identifier, crucial
for internal operations and multi-agent delegation.
For our example: software_bug_triage_agent.
• description (recommended): A concise summary
of capabilities, used by other agents to decide when
to route tasks. For our example: “Analyzes new software
bug reports, categorizes their priority, and assigns them
to the correct engineering team.”
• model (required): The underlying LLM that powers
the agent’s reasoning, such as gemini-2.5-flash.
2. Guide the agent with instructions
The instruction parameter is the most critical component
for shaping an agent’s behavior. It tells the agent its core
task, persona, constraints, and how to use its tools. For our
Software Bug Assistant, we would direct it to act as an expert
engineering manager, explain how to use its tools to look up
user data, and specify that its final output should be a JSON
object for our ticketing system.
An effective instruction should:
• Be clear and specific about the desired outcomes.
• Provide examples (few-shot prompting) for complex tasks.
• Guide tool use by explaining when and why they should
be used.
• Inject dynamic data from the agent’s state using
{variable} syntax.
Pro tip
Be precise, as your entire definition is a prompt
An LLM uses every part of an agent’s definition, from its
name and description to the names and descriptions of
its tools, for reasoning. And it interprets this information
with a high degree of literalism. Avoid ambiguous, unclear,
or contradictory naming and descriptions, which can
lead to “context poisoning,” where the agent becomes
confused, pursues incorrect goals, or fails to use its tools
correctly. Instead, treat every configuration string as
a carefully crafted instruction to the model.
Note
AI agents and their underlying frameworks are evolving
at an incredible pace. While this guide focuses on the
durable architectural principles and patterns for building
agentic systems, the specific code snippets and API details
that follow are a snapshot in time. We aim to teach the “why”
and “how” of agent design, not to provide source code to
copy and paste directly into a production solution.
For the most current implementation details, API signatures,
and best practices, always consult the official ADK
documentation and the ADK Samples repository.
3. Equip the agent with tools
Tools give your agent capabilities beyond its built-in reasoning, allowing it to interact with the outside world. Our Software Bug Assistant would need several tools to do its job, such as:
• A function to get information about the user reporting the bug (get_user_details(user_id)).
• A function to search the codebase for relevant files (search_codebase(file_name)).
• A function to create a ticket in a project management system (create_jira_ticket(...)).
The LLM uses the tool’s name, docstring, and parameter schema to decide which tool to call.

4. Complete the development lifecycle
You’re now ready to test and evaluate your agent’s performance. This task is all about assessing the quality of the agent’s output by examining its step-by-step execution, or “trajectory.” The next section covers the important topic of agent evaluation in detail.
Once you’re confident it’s performing well, you need a streamlined way to deploy it. This is where your prototype becomes a production-ready application for your team or customers, turning your agent into a reliable business tool.
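Before moving on, here is how the pieces from steps 1–3 might come together in code. This is a minimal sketch: the tool bodies are stubs, and the exact ADK API may have changed since this guide was written, so treat it as illustrative rather than production code.

from google.adk.agents import LlmAgent

def get_user_details(user_id: str) -> dict:
    """Returns account and environment details for the user reporting the bug."""
    # Stub; a real tool would query your user database or CRM.
    return {"status": "success", "user": {"id": user_id, "plan": "enterprise"}}

def search_codebase(file_name: str) -> dict:
    """Searches the codebase and returns files relevant to the reported bug."""
    # Stub; a real tool would call a code search service.
    return {"status": "success", "matches": [file_name]}

def create_jira_ticket(summary: str, priority: str, team: str) -> dict:
    """Creates a ticket in the project management system and returns its ID."""
    # Stub; a real tool would call the ticketing system's API.
    return {"status": "success", "ticket_id": "BUG-1234"}

bug_triage_agent = LlmAgent(
    name="software_bug_triage_agent",
    model="gemini-2.5-flash",
    description="Analyzes new software bug reports, categorizes their priority, "
                "and assigns them to the correct engineering team.",
    instruction=(
        "You are an expert engineering manager. For each bug report: "
        "1) call get_user_details to understand who is affected, "
        "2) call search_codebase to find relevant files, "
        "3) call create_jira_ticket, then reply with a JSON object containing "
        "ticket_id, priority, and the assigned team."
    ),
    tools=[get_user_details, search_codebase, create_jira_ticket],
)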
Pro tip
Be brief and distinct
Each tool you define adds a new choice for the model to consider. Especially when an agent has many tools, any ambiguity or overlap in their descriptions can confuse the model, leading to looping behaviors or incorrect tool selection. To ensure the model can choose correctly, make each tool’s name and description a clear and unique signal of its purpose.

Pro tip
Test, test, and test again
Agentic systems are non-deterministic and can exhibit emergent behaviors. Standard unit tests are insufficient. Rigorous evaluation is the only way to ensure the quality and reliability of your agent. Focus your testing on two key areas: the reasoning trajectory (the step-by-step logic and tool use) and the final output quality (its accuracy, helpfulness, and grounding). As extensive benchmark testing shows, even state-of-the-art models can produce hallucinations or get stuck in reasoning loops, making continuous evaluation a critical part of the development lifecycle.
Diagram: Architecture of a software bug assistant with ADK Python. An IT support user interacts with the software bug agent (ADK Python, running on Cloud Run). The agent reaches the bug ticket database (Cloud SQL for PostgreSQL) through the MCP Toolbox for Databases (also on Cloud Run), and additionally uses a Google Search tool (agent-as-a-tool), a GitHub MCP server (MCP tools), and a StackExchange tool (LangChain tool).
Customer story
How Box uses ADK and the A2A protocol
to accelerate content development.
Box is an Intelligent Content Management platform
that enables organizations to fuel collaboration,
manage the entire content lifecycle, secure critical
content, and transform business workflows.
Situation
Critical business processes like compliance checks,
contract management, and loan approvals are slowed by
employees having to search and interpret vast amounts
of information stored across documents in Box. This
creates inefficiency and delays critical decisions.
“We’re entering a new era where
AI agents will transform how work gets
done—and content is at the center
of it all. With Box as the secure content
layer and Google Cloud’s A2A Protocol
enabling seamless collaboration across the
ecosystem, we’re unlocking powerful
new ways to automate business
processes, accelerate decision-
making, and drive real outcomes
for our customers.”
Solution
Box is introducing an A2A-enabled agent, built with
Google’s ADK and using Gemini. The agent connects
directly to the Box Intelligent Content Cloud, allowing
users to ask complex questions in natural language
and receive summarized, contextual answers and
insights from their documents instantly.
Impact
This dramatically accelerates content-centric workflows
and improves the quality of decision-making. It lays
the foundation for more advanced transactional agents
that can govern, manage, and initiate processes like
e-signatures and approvals, fundamentally transforming
how work gets done in the company.
2.3 Govern and scale your
agent workforce with
Google Agentspace
As your startup moves from building a single agent to
deploying a portfolio of specialized agents, you face a new
set of challenges: How do you manage them? How do non-
technical team members leverage them? How do you govern
their access to data and tools?
Google Agentspace solves these scaling problems. This single,
secure platform allows you to create, govern, and orchestrate
your entire AI agent workforce, unifying disparate applications
and data sources. It complements ADK’s code-first development
by providing the framework to scale agent usage across your
entire organization and manage them effectively.
You can use Google Agentspace to:
• Unify and access company data: Google Agentspace
breaks down data silos by using out-of-the-box
connectors to your existing company applications
(e.g., Microsoft Sharepoint, Google Workspace, Jira).
It applies Google’s multimodal search technology across
this connected data, allowing any employee to get instant
answers and synthesize insights from a central source
of truth while respecting all existing access controls.
Try these prompts in Google Agentspace:
Schedule our weekly team sync for Thursday at 10 AM.
Summarize this week’s updates for the #product Slack channel.
Create a meeting agenda to discuss investor prep.
Pro tip
Download the Google Agentspace prompting guide
for more templates.
• Enable team-wide automation: While ADK is ideal for
complex, code-first agent development, Google Agentspace
empowers your entire organization to automate workflows.
Domain experts in product, marketing, or operations can
use the no-code Agent Designer to build their own custom
agents using a prompt-driven interface. This turns their
specific knowledge into automated solutions without
requiring engineering resources.
• Govern and orchestrate a fleet of agents: Google
Agentspace provides a single platform to manage
and govern agents built with ADK, the no-code designer,
or from partners. The Agent Gallery acts as a central
portal for your team to discover, manage, and deploy
them all, including both your custom-built solutions
and pre-built Google Agents designed for complex
tasks like deep research or idea generation.
Customer story
How Zoom’s AI agent automatically schedules
Zoom meetings from Gmail context, with Google
Agentspace integration.
Zoom is a communication technology company that
provides an open, AI-first work platform used for virtual
meetings, webinars, chat, online collaboration, customer
experience, and more.
Situation
Zoom’s AI-first strategy centers on transforming
AI Companion into a fully agentic framework: not only
capable of advanced reasoning and task orchestration,
but also seamlessly integrated with customers’ key
third-party systems. By enabling collaboration with
other AI agents, Zoom drives more meaningful work
outcomes through an open, interoperable ecosystem.
“Our contribution to the A2A protocol
enables deeper integration with Google
Cloud and other third-party platforms,
giving customers flexibility and choice.”
Solution
Zoom AI Companion is integrating with Google Agentspace
to streamline meeting scheduling. Launching later
this summer, this collaboration will allow A2A-enabled
AI agents to automatically schedule Zoom Meetings
from Gmail context, update Google Calendar, and keep
participants informed, eliminating the back-and-forth
of manual scheduling.
Impact
• Reduced technical barriers in cross-platform
AI integration.
• Seamless interaction between Zoom AI
Companion and external A2A-enabled agents
without custom code.
• Enhanced workflow automation and improved
efficiency for enterprise customers.
• Future support for more sophisticated
multi-agent AI interactions.
2.4 Other options for
building agents
Experiment with Gemini CLI
For startups needing an immediate, cost-effective way
to experiment with AI, Gemini CLI is an open-source agent
that brings Gemini directly to your terminal. It offers:
• Significant cost savings: Get free access to Gemini 2.5
with generous usage limits (1 million token context,
60 queries per minute).
• Enhanced productivity: By integrating into existing
developer workflows, it accelerates coding, debugging,
and documentation.
• Total flexibility: As an open-source tool (Apache 2.0),
you can audit, modify, and embed it into your toolchain,
avoiding vendor lock-in and enabling deep customization.
Pro tip
Check out Gemini CLI configured for ADK development.
Accelerate development
with Firebase Studio
A backend agent, even a powerful one built with ADK, is only
one part of a complete product. To bring it to life, you need
to build an entire full-stack application around it, including the
user interface, databases, and hosting. Firebase Studio is an
integrated, cloud-based workspace that uses agentic AI to
accelerate the entire development lifecycle. Teams can use it
to handle everything from UI prototyping and code generation
to secure deployment on Google Cloud’s infrastructure.
Together, ADK for the agent’s backend logic, the Agent
Starter Pack for production infrastructure (which we explore
in section 3), and Firebase Studio for the full-stack application
provide a complete, end-to-end toolkit for a startup to build
and deploy a powerful, state-of-the-art agentic system.
Firebase Studio accelerates the entire development
lifecycle with AI:
• Fast setup: Use App Prototyping Agent to create a new
project using natural language, mockups, or screenshots.
Start by selecting from a large catalog of templates
for popular frameworks and languages, or import
an existing project.
• Gemini in Firebase: Use AI assistance on tasks like
coding, debugging, testing, refactoring, explaining,
and documenting code.
• Collaboration: Share workspaces with team members,
and provide a URL for early testers to preview apps.
• Optimize: Preview apps as users see them with built-
in web previews and Android emulators, and test and
optimize with access to thousands of extensions in the
Open VSX registry.
• Deploy: Publish to Firebase App Hosting with a few clicks,
or deploy production apps to Cloud Run, Firebase Hosting,
or your own custom infrastructure.
Try these prompts with the App Prototyping agent:
Generate a customer support dashboard that ingests
data from Zendesk and displays key metrics like ticket
volume and resolution time.
Create a B2B SaaS application with user authentication,
a PostgreSQL database, and a subscription billing page.
Build a full-stack application for an internal bug tracking
system with a form for submission and a kanban board
to view ticket status.
You can also start by selecting from a large catalog
of templates for popular frameworks and languages,
or import an existing project.
Key takeaways:
From build to scale
• Build a custom, multi-agent system from code: use the open-source Agent Development Kit (ADK).
• Deploy, scale, and manage your agent in production: deploy it on Vertex AI Agent Engine.
• Give your agent the power of long-term memory: use the Memory Bank feature within Vertex AI Agent Engine.
• Enable your agent to discover and talk to other agents: use the Agent2Agent (A2A) protocol, an open standard for agent communication.
• Build a complete AI-powered app from a prompt: use Firebase Studio for an AI-assisted development workspace.
• Quickly experiment with Gemini in your terminal: use the Gemini CLI for a simple command-line interface.
Ready to turn your
AI vision into reality?
We’re here to help.
Learn how to build more generative AI applications
with on-demand Startup School sessions.
Start now
Get up to $350k USD in Google Cloud credit
with the Google for Startups Cloud Program.
Apply now
Contact our Startups team.
Get in touch
Stay connected and get all our latest updates by
subscribing to the Google Cloud Startup Newsletter.
Subscribe
Section 3
Ensuring AI agents are reliable and responsible
Due to the non-deterministic nature of LLM-based
systems, it can be hard to achieve production-grade
reliability. Moving beyond superficial “vibe-testing”
requires a rigorous engineering approach to ensure
an agent operates safely and provides consistent value.
This section details the methodologies and tooling necessary
to address these challenges, focusing on three key areas:
• Correctness and reliability: Evaluating the accuracy of
final outputs and the validity of intermediate reasoning steps.
• Performance and scalability: Measuring and optimizing
agent latency and throughput under load.
• Safety and responsibility: Implementing safeguards,
monitoring for undesirable behavior, and ensuring the
agent operates within defined boundaries.
These practices are the tangible application of Google’s
commitment to responsible AI, allowing startups to build
powerful and reliable agents aligned with industry-leading
principles for safety.
Prefer audio? Listen to the podcast version
of this section, created with NotebookLM.
“Agents hold the key to a new level
of productivity, but their success
depends on our guidance.”
Harrison Chase
CEO and Co-Founder of LangChain
Pro tip
Production-grade observability means looking beyond
application metrics. You must also measure low-level
operational metrics like CPU and memory usage.
Diligently tracking resource consumption is essential for
diagnosing performance bottlenecks, optimizing your
runtime, and directly reducing operational costs. ADK
and the Agent Starter Pack provide native OpenTelemetry
support, allowing you to pipe crucial operational data
directly into your existing monitoring tools.
This podcast was created using NotebookLM with the
prompt: “As a podcast host, generate a podcast for
developers and technical founders. Introduce AgentOps
as the framework for building reliable AI, moving beyond
informal testing to a rigorous, automated process.
“Explain how AgentOps evaluates an agent’s reasoning,
accuracy, and safety, and how it mitigates risks like
misinformation and security vulnerabilities. Describe how
the Agent Starter Pack, working with the Agent Development
Kit (ADK), quickly implements this with pre-configured
tools for infrastructure, CI/CD, and continuous evaluation.
Conclude by highlighting that this disciplined approach
is a competitive advantage and direct listeners to Google’s
resources for startups.”
3.1 AgentOps: A framework for production-ready agents
Agent Operations (AgentOps) is an operational methodology
that addresses the challenges of reliability and responsibility
in production. It adapts the principles of DevOps, MLOps,
and DataOps to the unique challenges of building, deploying,
and managing AI agents across their lifecycle. And it gives
you a systematic, automated, and reproducible framework
for handling the complexities of non-deterministic, LLM-
based systems in production environments.
A robust AgentOps strategy systematizes the development
process, providing continuous feedback loops to improve
an agent’s reliability, safety, and performance across its tools,
reasoning capabilities, and underlying models.
A systematic framework
for agent evaluation
Evaluating non-deterministic, agentic systems is one
of the most complex challenges in modern software
engineering. Traditional testing often focuses on lexical
correctness, but agent evaluation must address two harder
problems: semantic correctness (did the agent understand
and helpfully answer the user’s intent?) and reasoning
correctness (did the agent follow a logical and efficient
path to its conclusion?).
As we discussed in section 1, the cognitive architecture
governing this reasoning is often a framework like ReAct,
which establishes a dynamic loop where the agent
interleaves thought and action. A failure at any point
in this loop can lead to an incorrect outcome. Therefore,
a rigorous, multi-layered evaluation framework is required.
This framework is implemented using a combination
of the ADK for core agent logic and instrumentation, and
the Agent Starter Pack for production-grade infrastructure,
automation, and observability.
Layer 1: Component-level evaluation
(deterministic unit tests)
This layer focuses on the predictable, non-LLM components
of the agent system.
• Objective: To verify the lexical correctness of individual
building blocks, ensuring that agent failures don’t stem
from simple bugs in its components.
• What to test:
• Tools: Expected behavior with valid, invalid,
and edge-case inputs.
• Data processing: Robustness of parsing
and serialization functions.
• API integrations: Handling of success, error,
and timeout conditions.
• Implementation:
• ADK defines the agent’s tools as Python functions
(or Java methods). These functions are the direct
subjects of component-level testing.
• Agent Starter Pack provides the testing infrastructure.
It generates a project with a standard pytest environment
configured in the tests/unit/ directory. Developers can
immediately write unit tests for their ADK-defined tools
and run them via the make test command.
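For instance, a component-level test for the hypothetical search_codebase tool sketched earlier might live in tests/unit/ and look like the following. The module path is illustrative; adapt it to wherever your project defines its tools.

# tests/unit/test_tools.py
from software_bug_assistant.tools import search_codebase  # hypothetical module path

def test_search_codebase_returns_success_status():
    result = search_codebase("billing_service.py")
    assert result["status"] == "success"

def test_search_codebase_lists_matching_files():
    result = search_codebase("billing_service.py")
    assert "billing_service.py" in result["matches"]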
Layer 2: Trajectory evaluation
(procedural correctness)
This is the most critical layer for evaluating the agent’s
reasoning process. A “trajectory” is the full sequence
of Reason, Act, and Observe steps the agent takes
to complete a task.
• Objective: To verify the agent’s reasoning correctness
within the ReAct cycle.
• What to test:
• Reason step: Does the agent correctly assess the
goal and current state to form a logical hypothesis
for the next step?
• Act step: Does it select the correct tool (Tool Selection)
and correctly extract and format the arguments for
that tool (Parameter Generation)?
• Observe step: Does it correctly integrate the output
from the tool to inform the next Reason step of the cycle?
• Implementation:
• ADK’s core runtime executes the agent’s ReAct
loop, integrating directly with Google Cloud Trace
to instrument each Reason, Act, and Observe step.
This allows developers to visualize the entire trajectory,
inspect tool inputs and outputs, and examine the
model’s chain of thought to debug its reasoning.
• Agent Starter Pack automates and scales trajectory
evaluation. A tests/integration/ directory creates
a “golden set” of prompts with expected ReAct
trajectories. The automated CI/CD pipeline (set
up with agent-starter-pack setup-cicd) runs these
tests on every pull request to prevent regressions.
The starter pack’s observability infrastructure is what
captures the trace data emitted by the ADK agent.
Layer 3: Outcome evaluation
(semantic correctness)
This layer evaluates the final, user-facing response generated
after the ReAct loop has concluded.
• Objective: To verify the semantic correctness, factual
accuracy, and overall quality of the final answer.
• What to test:
• Factual accuracy and grounding: Is the answer correct
and verifiably based on information gathered during
the Observe steps?
• Helpfulness and tone: Does the response fully address the user’s need in the appropriate style?
• Completeness: Does the response contain all necessary information?
• Implementation:
• ADK’s toolset is key for verifying factual accuracy.
Developers can create specialized tools or use APIs
for grounding verification. These tools, called during
the Act step, programmatically check if the agent’s
final answer is supported by the context it retrieved,
providing a quantitative measure against hallucination.
• Agent Starter Pack provides the platform for running
these evaluations at scale. It integrates with Vertex AI’s
Gen AI evaluation service for LLM-as-judge scoring.
Its built-in UI playground includes feedback mechanisms
that log human ratings directly to BigQuery, enabling
high-fidelity HITL evaluation.
Layer 4: System-level monitoring
(in-production)
Evaluation doesn’t stop at deployment. Continuously
monitoring the agent’s live performance is critical.
• Objective: To track real-world performance and detect
operational failures or behavioral drift.
• What to monitor: Tool failure rates, user feedback scores,
trajectory metrics (e.g., number of ReAct cycles per task),
and end-to-end latency.
• Implementation:
• The ADK agent, running in production, is the source
of the operational data, emitting events and traces
for every live user interaction.
• Agent Starter Pack provides a production-grade
observability stack out-of-the-box. It automatically
configures OpenTelemetry, a Log Router to BigQuery,
and provides templates for Looker Studio dashboards.
This allows teams to immediately track agent
performance, analyze trends, and debug issues using
data from real-world usage without additional setup.
This comprehensive and practical methodology for agent
evaluation is the tangible implementation of a robust
AgentOps strategy, moving teams beyond informal “vibe-
testing” to a systematic, automated, and reproducible process.
By dissecting evaluation into component, trajectory, outcome,
and system-level monitoring, it directly addresses the core
domains of AgentOps.
Adopting a systematic evaluation framework is not merely
a best practice but a competitive advantage. It establishes
a rigorous, data-driven, and automated process that allows
teams to innovate faster, deploy with confidence, and build
agents that are demonstrably safer and more effective.
AgentOps toolkit: ADK and Agent Starter Pack
Automated CI/CD pipelines implement the principles of AgentOps, so that any change to the agent’s code, tools, or prompts triggers a standardized process of building, unit testing, and quantitative evaluation against a predefined dataset. This automated evaluation step is essential for preventing regressions and provides continuous, objective feedback on the agent’s performance and safety before deployment.
To accelerate adoption of AgentOps principles, the Agent Starter Pack provides a production-ready reference implementation. Its holistic templates address common challenges (e.g., deployment and operations, evaluation, customization, and observability) when building and deploying AI agents. Put simply, it bootstraps a new agent project with the necessary infrastructure and pipelines, so developers can focus on core logic.
The Agent Starter Pack includes these key components:
• Infrastructure as Code (Terraform): Provides reproducible
templates to provision and manage the agent’s cloud
environment, including services like Cloud Run, IAM
permissions, and networking.
• CI/CD pipelines (Cloud Build): A pre-configured
cloudbuild.yaml file automates the build, unit testing,
quantitative evaluation, and deployment process,
directly implementing the AgentOps CI/CD workflow.
• Observability and logging (Cloud Trace and Cloud
Logging): Establishes the foundation for monitoring and
debugging by integrating with Cloud Trace for in-depth
analysis of agent execution traces and Cloud Logging
for centralized log management.
• Data integration (BigQuery): Includes foundational
components for agents that need to connect to and
analyze structured enterprise data using BigQuery.
• Continuous evaluation (Vertex AI evaluation): Integrates
with Vertex AI to run evaluation datasets against agent
changes, continuously measuring performance against
the key domains discussed previously.
Pro tip
You can create a new, production-ready agent project with a single command: uvx agent-starter-pack create my-agent -a adk@gemini-fullstack.
Diagram: High-level architecture of the Agent Starter Pack. Serving: a FastAPI API server and front end running on Vertex AI Agent Engine or Cloud Run. Agent orchestration: choose your framework, with samples using Google ADK, LangGraph, and CrewAI. LLMs: Vertex AI Model Garden. Data: a vector store (adapt to any store). Evaluation: Vertex AI evaluation. Observability and logging: Cloud Trace, OpenTelemetry, and Cloud Logging. Data storage and analysis: BigQuery, with Looker Studio for monitoring. IaC and CI/CD: Terraform for infrastructure as code and Cloud Build for CI/CD.
Better together: ADK and
Agent Starter Pack
ADK and the Agent Starter Pack are engineered to provide
a clear separation between an agent’s application logic
and its operational lifecycle, enabling a robust and scalable
development process.
Essentially, ADK is used to write the agent’s application code,
while the Agent Starter Pack provides the production-ready
operational baseline to run and manage that code at scale.
• ADK handles the agent’s runtime behavior: As a Python/
Java SDK, ADK provides the APIs and core abstractions
for defining an agent’s application logic. Developers use
it to implement orchestration flows, define tools, and
configure interactions with LLMs.
• The Agent Starter Pack handles the operational
environment: As a scaffolding tool, it generates
the infrastructure as code (Terraform) to provision
deployment targets (e.g., Cloud Run) and the CI/CD
pipeline configurations (Cloud Build) to automate
the entire lifecycle.
This separation manifests in a five-step workflow:
1. Bootstrap with the Agent Starter Pack: A developer runs
a single command to generate a new project containing
all necessary operational components, including Terraform
files for infrastructure, Cloud Build configurations
for CI/CD, and skeleton files for evaluation datasets.
2. Develop with ADK: Inside this structure, the developer
uses ADK to write the agent’s application logic,
implementing custom tools, composing agents,
and writing the core instructions.
3. Commit and automate: When code is committed to
the source repository, the pre-configured CI/CD pipeline
managed by Cloud Build is automatically triggered.
4. Evaluate continuously: The pipeline builds the ADK
agent into a container and then executes a quantitative
evaluation against a predefined test set, programmatically
validating the agent’s performance and safety.
5. Deploy confidently: Upon a successful evaluation,
the pipeline automatically deploys the new, validated
version of the agent to its target production environment.
By integrating ADK’s development framework with the
Agent Starter Pack’s operational automation, you establish
a complete, end-to-end MLOps/DevOps process specifically
tailored for building and managing production-grade
AI agents. It’s AgentOps at scale.
3.2 Build responsible and secure
AI agents with AgentOps
Building powerful agents comes with the non-negotiable
responsibility of ensuring they are safe, secure, and aligned.
This means designing them from the ground up with
safeguards to prevent harmful or unintended outcomes,
including unfair bias, privacy violations, and security
vulnerabilities.
Addressing this requires a structured approach. The diagram
below provides a high-level overview of common risks and
the types of technical and procedural safeguards used
to mitigate them. While this is a valuable starting point,
for a comprehensive guide to standards and best practices,
we strongly recommend consulting Google’s Secure AI
Framework (SAIF).
Diagram: Common risks and example safeguards. Risks include: not performing as intended (e.g., safety, quality, accuracy); misapplication and/or harmful use by developers or users; creating the impression of having capabilities it does not actually have; creating or amplifying negative societal biases and harms; creating or worsening inequality or other socio-economic harms; unsafe deployment (e.g., too early or with insufficient testing); and creating or worsening information hazards (e.g., lack of groundedness, non-factuality, confirmation bias). Safeguards include: safety attributes, recitation checks, content moderation APIs, customer feedback channels, terms of service, UI disclaimers, acceptable use policies, model evaluations, model cards, model monitoring, RAI guides, bias evaluation tooling, and privacy restrictions.

“As AI agents integrate into our lives,
it’s crucial for us to address new challenges
around trust, privacy, and security.
It’s important for us to think about
security and privacy, to ask ourselves:
how do we build trustworthy products?”
Jia Li
Co-Founder, President
and Chief AI Officer of LiveX AI
ADK and the Agent Starter Pack deliver a defense-in-depth
strategy for this critical area. First, you can implement
fine-grained, application-level safety controls with ADK.
And, second, the Agent Starter Pack automates the
deployment of the hardened cloud infrastructure that
enforces these controls at scale.
This combined approach addresses key aspects of safety
and compliance:
• Secure infrastructure and access control: The Agent
Starter Pack uses Terraform to provision a secure
foundation, deploying your agent to environments like
Cloud Run and configuring specific IAM roles to enforce
the principle of least privilege. The tools you define in ADK
then operate within these strict cloud-level permissions,
ensuring the agent cannot access unauthorized
resources even if its own logic is compromised.
• Auditing and monitoring: The detailed observability
in ADK creates a granular trace of every thought
and tool call the agent makes. The Agent Starter Pack
operationalizes this by automatically configuring log
sinks that route this data to BigQuery for long-term,
secure storage. This creates the durable audit trail
necessary for compliance reviews and incident response
(see the audit query sketched after this list).
• Input and output guardrails: Within ADK, you can
implement application logic to validate prompts for
potential injection attacks and filter the agent’s final
outputs for harmful content. The Agent Starter Pack
makes these guardrails robust by integrating them
into its CI/CD pipeline. You can then run automated
security tests against every code change to check
for vulnerabilities before they reach production
(a minimal guardrail sketch follows below).
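To make the guardrail idea concrete, below is a minimal, framework-agnostic
Python sketch of input validation and output filtering. The pattern lists and
helper names are illustrative assumptions, not ADK APIs; in practice you would
attach logic like this to your agent’s callbacks or wrap it around model and
tool calls, and combine it with managed safety services such as the content
moderation API.

# guardrails.py: illustrative input/output guardrails (hypothetical helpers,
# not an ADK API). Real systems should pair simple heuristics like these with
# model-based safety classifiers and managed moderation services.
import re

INJECTION_PATTERNS = [
    r"ignore (all|previous) instructions",
    r"reveal your (system|hidden) prompt",
]

BLOCKED_OUTPUT_TERMS = ["credit card number", "social security number"]

def validate_prompt(prompt: str) -> bool:
    """Return False if the prompt looks like an injection attempt."""
    lowered = prompt.lower()
    return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def filter_output(response: str) -> str:
    """Redact obviously sensitive phrases from the agent's final answer."""
    for term in BLOCKED_OUTPUT_TERMS:
        response = re.sub(term, "[redacted]", response, flags=re.IGNORECASE)
    return response

def guarded_call(prompt: str, call_agent) -> str:
    """Wrap any agent invocation with input and output checks."""
    if not validate_prompt(prompt):
        return "Request blocked by input guardrail."
    return filter_output(call_agent(prompt))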
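And as a sketch of how the BigQuery audit trail described above might be used,
the query below pulls recent agent events for a compliance review. The project,
dataset, table, and JSON payload fields are hypothetical placeholders; the
actual log sink destination and schema depend on how your infrastructure
is configured.

# audit_query.py: illustrative audit query against a BigQuery log sink.
# The table name and payload fields below are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client()

QUERY = """
SELECT
  timestamp,
  JSON_VALUE(json_payload, '$.tool_name') AS tool_name,
  JSON_VALUE(json_payload, '$.status') AS status
FROM `my-project.agent_logs.agent_events`
WHERE timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY timestamp DESC
LIMIT 100
"""

# Print the most recent tool calls and their outcomes for review.
for row in client.query(QUERY).result():
    print(row.timestamp, row.tool_name, row.status)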
Security is a partnership. While the ADK provides the
framework for an agent’s cognitive architecture and
the Agent Starter Pack provides the components to deploy
it, they operate within the larger Google Cloud ecosystem.
Together, this gives you a formidable security posture built
on a secure-by-design foundation, with integrated controls
designed to defend any workload.
Key takeaways:
Building reliable agents

• Your goal: Manage your agent’s lifecycle professionally.
  Best option: Adopt AgentOps to automate processes from development to deployment and monitoring.
• Your goal: Ensure your agent is accurate and safe before going live.
  Best option: Implement automated evaluation in your CI/CD pipeline to rigorously test for quality, grounding, and safety.
• Your goal: Track your agent’s real-world performance, cost, and errors.
  Best option: Set up monitoring using observability tools to get real-time data on latency, token usage, and tool call success rates.
• Your goal: Figure out why your agent made a specific decision.
  Best option: Inspect the agent’s trajectory (its “chain of thought”) using logging and tracing tools to debug its reasoning process.
• Your goal: Secure your agent, its data, and its tool access.
  Best option: Apply AgentOps security principles, which include infrastructure security, data governance, and compliance controls.
• Your goal: Get started with AgentOps quickly.
  Best option: Use the Agent Starter Pack for pre-configured templates for CI/CD, evaluation, and infrastructure.
Ready to turn your
AI vision into reality?
We’re here to help.
Learn how to build more generative AI applications
with on-demand Startup School sessions.
Start now
Get up to $350k USD in Google Cloud credit
with the Google for Startups Cloud Program.
Apply now
Contact our Startups team.
Get in touch
Stay connected and get all our latest updates by
subscribing to the Google Cloud Startup Newsletter.
Subscribe
60. More from Google’s
full AI stack
Build quickly with Gemini on Google AI Studio.
See all available Gemini models.
Explore now
Spotlight: Gemini 2.5 Flash Image (aka Nano Banana)
This image generation and editing model enables you to blend multiple
images into one, maintain character consistency for rich storytelling,
make targeted transformations using natural language, and use Gemini’s
world knowledge to generate and edit images.

Spotlight: Veo and Imagen
These cutting-edge models enable you to generate high-quality videos
and images from text prompts, edit existing visuals with natural
language, and create immersive storytelling experiences with advanced
visual synthesis capabilities.
61. Conclusion
The journey from prototype to a production-grade system is about
disciplined engineering. By using a code-first framework like ADK
and the operational principles in this guide, you can move beyond
informal “vibe-testing” to a rigorous, reliable process for building
and managing your agent’s entire lifecycle.
For your startup, this disciplined approach becomes a powerful
competitive advantage. Your team can iterate and innovate faster,
automating resource-intensive evaluations along the way.
Plus, you can scale with confidence, without compromising
on safety or security.
As this guide has shown, Google Cloud supports this innovation,
from its purpose-built AI hardware and unified data platform,
to the models, services, and tools needed to transition your
concept into a sophisticated AI system. The platform is the
foundation; your unique vision and the principles outlined in this
guide are the blueprint. Together, they form the basis for building
the next generation of intelligent systems that will drive your
startup forward.
62. Resources
• AdkApp: Develop and deploy agents on Vertex
AI Agent Engine.
• Agent Development Kit (ADK): ADK is a flexible and
modular framework for developing and deploying
AI agents.
• Agent2Agent (A2A): This is an open protocol enabling
communication and interoperability between opaque
agentic applications.
• Agent Starter Pack: Get production-ready agents on
Google Cloud, faster. Go from idea to deployment faster
with pre-built templates and tools.
• BigQuery: BigQuery is Google Cloud’s fully managed,
petabyte-scale, and cost-effective analytics data
warehouse that lets you run analytics over vast amounts
of data in near real time.
• Check grounding API: As part of your RAG experience in
AI Applications, you can check grounding to determine
how grounded a piece of text (called an “answer candidate”)
is in a given set of reference texts (called “facts”).
• Cloud Functions API: This API manages lightweight user-
provided functions executed in response to events.
• Cloud Run: Run frontend and backend services, batch jobs,
host LLMs, and queue processing workloads without the
need to manage infrastructure.
• Cloud Storage bucket: Buckets are the basic containers
that hold your data. Everything that you store in Cloud
Storage must be contained in a bucket.
• Colab Enterprise: Colab Enterprise is a collaborative,
managed notebook environment with the security
and compliance capabilities of Google Cloud.
• Example Store: Example Store lets you store and
dynamically retrieve few-shot examples.
• Firestore: Firestore is a highly scalable NoSQL database
for your web and mobile applications.
• Gemini 2.5 Flash: Gemini 2.5 Flash is designed to control
the trade-off between quality, cost, and speed.
• Gemini 2.5 Flash Image (aka Nano Banana): Gemini can
generate and process images conversationally. You can
prompt Gemini with text, images, or a combination of
both, allowing you to create, edit, and iterate on visuals
with unprecedented control.
• Gemini 2.5 Pro: Gemini 2.5 Pro is our most advanced
reasoning Gemini model, capable of solving complex
problems.
• Gemini CLI: Free and open source, the Gemini CLI brings
Gemini 2.5 directly into developers’ terminals, with
unmatched access for individuals.
• Gemma: A collection of lightweight, state-of-the-art open
models built from the same technology that powers our
Gemini models.
• Gen AI evaluation service: The gen AI evaluation service
in Vertex AI lets you evaluate any generative model or
application and benchmark the evaluation results against
your own judgment, using your own evaluation criteria.
• Google AI Studio: Google AI Studio is the fastest way
to start building with Gemini, our next generation family
of multimodal generative AI models.
• Google Cloud Observability: Google Cloud Observability
includes observability services that help you to understand
the behavior, health, and performance of your applications.
• Google Kubernetes Engine (GKE): GKE is the most
scalable and fully automated Kubernetes service.
Put your containers on autopilot and securely run your
enterprise workloads at scale – with little to no Kubernetes
expertise required.
• GraphRAG: GraphRAG on Google Cloud combines
knowledge graphs with Retrieval-Augmented Generation
(RAG) to enhance the accuracy, context, and explainability
of large language models (LLMs).
• Imagen: Imagen on Vertex AI brings Google’s state-of-
the-art image generative AI capabilities to application
developers.
• MCP Toolbox for Databases: This is an open source MCP
server that helps you build generative AI tools so that
your agents can access data in your database.
• Model Context Protocol (MCP): MCP is an open
protocol that standardizes how applications provide
context to LLMs.
• Model evaluation in Vertex AI: The predictive AI evaluation
service lets you evaluate model performance across
specific use cases.
• Model Garden on Vertex AI: Jump-start your ML project
with a single place to discover, customize, and deploy a
wide variety of models from Google and Google partners.
• Model tuning: Model tuning is a crucial process in adapting
Gemini to perform specific tasks with greater precision
and accuracy.
• ReAct: Orchestration with a ReAct (reasoning + action)
agent involves a multi-turn interaction between an
application and a model (or models) where the agent
manages conversations, transactions, and LLM instructions.
• Responsible AI: To aid developers, the Vertex AI Studio has
built-in content filtering, and our generative AI APIs have
safety attribute scoring to help customers test Google’s
safety filters and define confidence thresholds that are
right for their use case and business.
• Veo: You can use Veo on Vertex AI to generate new videos
from a text prompt or an image prompt.
• Vertex AI Agent Engine: Vertex AI Agent Engine is a set
of services that enables developers to deploy, manage,
and scale AI agents in production.
• Vertex AI notebooks: Access every capability in Vertex AI
Platform to work across the entire data science workflow,
from data exploration to prototype to production.
• Vertex AI Platform: Vertex AI is a fully managed, unified AI
development platform for building and using generative AI.
• Vertex AI RAG Engine: Vertex AI RAG Engine is a data
framework for developing context-augmented LLM
applications.
• Vertex AI Search: Vertex AI Search brings together the
power of deep information retrieval, state-of-the-
art natural language processing, and the latest in LLM
processing to understand user intent and return the most
relevant results for the user.
• Vertex AI Studio: Streamline your foundation model
workflows with Vertex AI Studio. Rapidly prototype,
refine, and seamlessly deploy models to your applications.
• Retrieval-Augmented Generation: RAG is an AI framework
that combines the strengths of traditional information
retrieval systems (such as search and databases) with
the capabilities of generative LLMs.
• Vector database: A vector database is any database that
allows you to store, index, and query vector embeddings,
or numerical representations of unstructured data, such
as text, images, or audio.
© 2025 Google LLC 1600 Amphitheatre Parkway, Mountain View, CA 94043
64. Questions? Ask our Startups team.
Contact us