AppsScriptPulse

Integrating Google Workspace as subagents for the Gemini CLI with Google Apps Script

This article explores integrating remote subagents built with Google Apps Script into the Gemini CLI using the Agent-to-Agent (A2A) protocol.

Google recently announced that subagents have arrived in the Gemini CLI, allowing developers to extend the capabilities of the command-line interface by connecting it to external tools and services. This update provides a way to build more complex, multi-step workflows where Gemini can call upon specialized agents to perform specific tasks.

For the Google Apps Script community, this opens up interesting possibilities for connecting local terminal workflows with the Google Workspace ecosystem. Kanshi Tanaike has already explored this area, sharing a detailed guide on integrating remote subagents built with Google Apps Script with the Gemini CLI.

Extending the CLI with Apps Script

In his post, Tanaike-san demonstrates how to bridge the gap between a local environment and remote script functions. By deploying an Apps Script as a web app, it can act as a “remote subagent” that the Gemini CLI calls to perform actions within Google Workspace or process data using the Apps Script environment.
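
To make the pattern concrete, here is a minimal, hypothetical sketch of an Apps Script web app receiving a task as JSON and returning a JSON result. This is not Tanaike-san's implementation (see his post for the real A2A plumbing); the task shape and handler names are invented for illustration.

```javascript
// Pure routing logic, kept separate from the web app plumbing so it is
// easy to test. In a real subagent the handlers would call Workspace
// services such as SpreadsheetApp or DriveApp.
function routeTask(task) {
  const handlers = {
    echo: (params) => ({ text: params.text }),
    sum: (params) => ({ total: params.values.reduce((a, b) => a + b, 0) }),
  };
  const handler = handlers[task.name];
  if (!handler) return { error: `Unknown task: ${task.name}` };
  return handler(task.params || {});
}

// Apps Script web app entry point: the Gemini CLI POSTs a task, we route
// it and reply with JSON.
function doPost(e) {
  const task = JSON.parse(e.postData.contents);
  const result = routeTask(task);
  return ContentService.createTextOutput(JSON.stringify(result))
    .setMimeType(ContentService.MimeType.JSON);
}
```

Deploying this as a web app ("Execute as me", access as required) gives the Gemini CLI a URL it can call as a remote subagent.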

As Tanaike-san notes:

“By using Google Apps Script as a subagent, Gemini CLI can interact with Google Workspace services like Google Drive, Google Sheets, Google Docs.”

You can find the full technical breakdown, including the necessary code snippets and configuration steps, in Kanshi Tanaike’s blog post.

Source: Integrating Remote Subagents Built by Google Apps Script with Gemini CLI | tanaike – Google Apps Script, Gemini API, and Developer Tips

Automating YouTube trend analysis at scale with Google Apps Script, Vertex AI and Terraform

Understanding and spotting trends in organic YouTube traffic based on specific queries can often be a manual and time-consuming process. To solve this, the Google Marketing Solutions team has published an open-source repository for a tool called YouTube Topic Insights. Developed by Google Customer Solutions Engineer Francesco Landolina, the project transforms raw video data into actionable summaries by combining the data-gathering capabilities of the YouTube Data API with the video and language understanding of Gemini models.

For Google Apps Script developers interested in building scalable tools, this repository is a fantastic example of using Vertex AI at scale. It orchestrates complex, long-running processes directly from a Google Sheet and pairs them with Vertex AI batch prediction. In this post we will highlight some interesting infrastructure setup and the pattern used to bypass common Apps Script execution limits.

Project architecture and setup

The solution is divided into two main components that make it easier to set up the environment and process the data:

  1. GCP Infrastructure as Code: The necessary cloud environment is defined using Terraform. This makes deployment easier by enabling required APIs like Vertex AI and the YouTube Data API. While the Terraform script attempts to set up the OAuth Consent Screen using the google_iap_brand resource, you will need to configure the OAuth Consent Screen manually within the Google Cloud Console, as the official Terraform documentation notes that the underlying IAP OAuth Admin APIs were permanently shut down on March 19, 2026.
  2. Google Apps Script: The brain of the operation lives within a Google Sheet. It calls the YouTube API to find videos, prepares and submits analysis jobs to Vertex AI batch prediction, fetches the completed results, and writes the human-readable insights back into the spreadsheet interface.

For developers setting this up, the repository includes a Quick Start template. The script automatically handles the initialisation of the spreadsheet structure when first opened, creating the necessary configuration and logging sheets in a safe, non-destructive way.
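
As a sketch of that non-destructive pattern (the sheet names here are placeholders, not the actual ones the repository creates):

```javascript
// Placeholder sheet names for illustration only.
const REQUIRED_SHEETS = ['Config', 'Queries', 'Results', 'Log'];

// Pure helper: which required sheets are missing from the spreadsheet?
function missingSheets(existingNames, requiredNames) {
  return requiredNames.filter((name) => !existingNames.includes(name));
}

// Apps Script wiring: create only what is missing; never delete or clear
// existing sheets, so re-opening the spreadsheet is always safe.
function onOpen() {
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const existing = ss.getSheets().map((s) => s.getName());
  missingSheets(existing, REQUIRED_SHEETS).forEach((name) => ss.insertSheet(name));
}
```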

Overcoming the execution limit with Vertex AI batch prediction

As you can imagine, analysing hundreds of videos using an LLM is going to be a compute-heavy task. If you attempt to process these synchronously in a simple loop, you will quickly hit the execution limit for Google Apps Script.

To get around this, the solution uses Vertex AI batch prediction (see Batch inference with Gemini). Instead of waiting for Gemini to process each video, the script initiates an asynchronous batch job and uses a chained-trigger system to poll for completion. Once the batch job completes, the script retrieves this context, parses the JSON response from Gemini, and writes the summarised data back to the Google Sheet.
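
The chained-trigger pattern can be sketched as follows. The helpers fetchJobState_, deleteOwnTriggers_ and writeResultsToSheet_ are hypothetical stand-ins for the repository's actual implementation; the job-state strings follow the Vertex AI batch job states.

```javascript
// Pure helper: decide what the next run should do for a given job state.
function nextAction(jobState) {
  if (jobState === 'JOB_STATE_SUCCEEDED') return 'process';
  if (jobState === 'JOB_STATE_FAILED' || jobState === 'JOB_STATE_CANCELLED') return 'abort';
  return 'reschedule'; // still pending or running
}

// Apps Script wiring: each short run re-schedules itself until the batch
// job completes, so no single execution approaches the runtime limit.
function pollBatchJob() {
  const state = fetchJobState_(); // e.g. UrlFetchApp call to the Vertex AI jobs endpoint
  deleteOwnTriggers_('pollBatchJob'); // avoid triggers piling up
  const action = nextAction(state);
  if (action === 'reschedule') {
    ScriptApp.newTrigger('pollBatchJob')
      .timeBased()
      .after(5 * 60 * 1000) // check again in five minutes
      .create();
  } else if (action === 'process') {
    writeResultsToSheet_(); // parse Gemini's JSON output and write rows
  }
}
```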

Summary

The YouTube Topic Insights solution is an excellent example of how to build complex applications using Vertex AI Batch Prediction. It provides practical solutions for managing asynchronous tasks and for interacting with advanced Vertex AI features.

If you are interested in exploring the code further or setting up the tool for your own research, you can find the complete source code and deployment instructions in the YouTube Topic Insights GitHub repository.

Source: YouTube Topic Insights: Automated YouTube Trend Analysis with Gemini AI

Navigating AI Agent Protocols: From Connection to Skill

The growing landscape of AI agent development is overloaded with acronyms: MCP, A2A, UCP, AP2, A2UI, and AG-UI, just to name a few. If you’ve ever looked at this list of protocols and felt like you were staring at a wall of competing standards, you are not alone. To help you understand their value, we are going to look at what each one does and how it saves you from writing and maintaining custom integration code for every single tool, API, and frontend component your agent touches.

In Pulse we have recently featured some excellent examples of agentic integration into Google Workspace. In particular, Pierrick Voulet’s insightful posts on extending Chat apps with Universal Actions provide a great template for building intelligent assistants. If you follow Pierrick on LinkedIn, you’ll know these are just the tip of the iceberg.

With users increasingly expecting these intelligent assistants to communicate seamlessly across their everyday tools, knowing the right standards to connect different systems becomes a significant challenge. Fortunately, the Google Developers Blog has published a comprehensive guide to help navigate this exact problem.

A recent post provides an overview of the growing list of acronyms associated with AI agent development. The post breaks down six main protocols:

  • Model Context Protocol or MCP: This standardizes how agents connect to tool servers. Instead of writing custom API requests for every service, your agent discovers the available tools automatically.
  • Agent2Agent Protocol or A2A: This standardizes how remote agents discover and communicate with each other using well-known URLs.
  • Universal Commerce Protocol or UCP: This modularizes the shopping lifecycle into strongly typed schemas that remain consistent across any underlying transport.
  • Agent Payments Protocol or AP2: This adds cryptographic proof of authorization to checkout flows to enforce configurable guardrails on transactions.
  • Agent-to-User Interface Protocol or A2UI: This lets the agent dynamically compose novel layouts from a fixed catalog of safe component primitives.
  • Agent-User Interaction Protocol or AG-UI: This acts as middleware that translates raw framework events into a standardized stream for the frontend.
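
To make the A2A entry concrete, here is a rough sketch of the kind of agent card a remote agent publishes at its well-known URL (field names abridged from the A2A specification; treat the exact shape and values as illustrative):

```json
{
  "name": "Sheets Reporting Agent",
  "description": "Generates summary reports from Google Sheets data.",
  "url": "https://example.com/a2a",
  "version": "1.0.0",
  "capabilities": { "streaming": false },
  "skills": [
    {
      "id": "weekly-report",
      "name": "Weekly report",
      "description": "Summarises last week's rows into a short report."
    }
  ]
}
```

A client agent fetches this card to discover what the remote agent can do before sending it tasks.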

To illustrate these concepts, the guide walks through building a multi-step supply chain agent using the Agent Development Kit. The scenario starts with a bare large language model and progressively adds protocols until the agent can check inventory, get quotes, place orders, authorize payments, and render interactive dashboards.

Adopting standard protocols is only half the equation, and how effectively you apply them shouldn’t be overlooked. As an example, Richard Seroter recently highlighted this in his analysis of agent token consumption. He discovered that simply attaching an MCP server to an agent often leads to excessive planning iterations and high token costs. By pairing an MCP server with a highly focused “Skill”, a structured set of instructions that guides the tool’s application, developers can drastically cut down on wasted turns. In one test, combining an MCP server with a specific skill resulted in an 87% reduction in token usage compared to letting the agent figure out the tool on its own.

For Google Workspace developers building complex integrations, it’s important not to just be aware of the emerging protocols, but also the best practices for implementing them. Giving your agent a connection is a great start; teaching it the specific skill to use that connection efficiently is essential.


Beyond external libraries: Google Workspace document and data processing with Gemini Code Execution

Why use a dedicated app when you can simply ask Gemini to write and run the Python code for you? A look at the power of Google Apps Script and GenAI

For many Google Workspace developers, handling complex file formats or performing advanced data analysis has traditionally meant navigating the limitations of Apps Script’s built-in services. We have previously featured solutions for these challenges on Pulse, such as merging PDFs and converting pages to images or using the PDFApp library to “cook” documents. While effective, these methods often rely on loading external JavaScript libraries like pdf-lib, which can be complex to manage and subject to the script’s memory and execution limits.

While users of Gemini for Google Workspace may already be familiar with its ability to summarise documents or analyse data in the side panel, those features are actually powered by the same “Code Execution” technology under the hood. The real opportunity for developers lies in using this same engine within Apps Script to build custom, programmatic workflows that go far beyond standard chat interactions.

A recent project by Stéphane Giron highlights this path. By leveraging the Code Execution capability of the Gemini API, it is possible to offload intricate document and data manipulation to a secure Python sandbox, returning the results directly to Google Drive.

Moving beyond static logic

The traditional approach to automation involves writing specific code for every anticipated action. The shift here is that Gemini Code Execution does not rely on a pre-defined set of functions. Instead, when provided with a file and a natural language instruction, the model generates the necessary Python logic on the fly. Because the execution environment includes robust libraries for binary file handling and data analysis, the model can perform varied tasks without the developer needing to hardcode each individual routine. Notably, the model can iterate: if the generated code fails, it can refine and retry the script up to five times until it reaches a successful output.

While basic data analysis is now a standard part of Gemini for Workspace, having direct access to the library list in the Gemini sandbox opens up additional specialised, developer-focused avenues:

  • Dynamic Document Generation: Using python-docx and python-pptx, you can programmatically generate high-fidelity Office documents or presentations based on data from Google Workspace, bridging the gap between ecosystems without manual copy-pasting. [Here is a container-bound script based on Stéphane’s code for Google Docs that generates a summary PowerPoint file]
  • Programmatic Image Inspection: Using Gemini 3 Flash, you can build tools that inspect images at a granular level. For example, a script could process a batch of site inspection photos, using Python code to “zoom and inspect” specific equipment labels or gauges, and then log those values directly into a database.

The mechanics and constraints

The bridge between Google Drive and this dynamic execution environment follows a straightforward pattern:

  1. File Preparation: The script retrieves the target file from Drive and converts the blob into a format compatible with the Gemini API.
  2. Instruction & Execution: Along with the file, a prompt is sent describing the desired outcome.
  3. The Sandbox: Gemini writes and runs the Python code required to fulfil the request.
  4. Completion: Apps Script receives the modified file or data and saves it back to the user’s Drive.
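
Steps 1 to 3 can be sketched in Apps Script roughly as follows. This is an illustrative sketch rather than Stéphane's code, and the payload field names should be double-checked against the current Gemini API REST reference; the file is passed inline as base64 for simplicity.

```javascript
// Pure helper: assemble the generateContent payload, enabling the
// code_execution tool so Gemini can write and run Python in its sandbox.
function buildCodeExecRequest(prompt, base64Data, mimeType) {
  return {
    contents: [{
      role: 'user',
      parts: [
        { inlineData: { mimeType: mimeType, data: base64Data } },
        { text: prompt },
      ],
    }],
    tools: [{ code_execution: {} }],
  };
}

// Apps Script wiring: step 1 fetches the blob from Drive, steps 2 and 3
// send it to Gemini along with the natural-language instruction.
function processDriveFile(fileId, prompt, apiKey) {
  const blob = DriveApp.getFileById(fileId).getBlob();
  const payload = buildCodeExecRequest(
    prompt,
    Utilities.base64Encode(blob.getBytes()),
    blob.getContentType()
  );
  const url = 'https://generativelanguage.googleapis.com/v1beta/models/' +
    'gemini-2.5-flash:generateContent?key=' + apiKey;
  const response = UrlFetchApp.fetch(url, {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify(payload),
  });
  return JSON.parse(response.getContentText());
}
```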

However, there are important boundaries to consider. The execution environment is strictly Python-based and has a maximum runtime of 30 seconds. Furthermore, developers should be mindful of the billing model: you are billed for “intermediate tokens,” which include the generated code and the results of the execution, before the final summary is produced.

Get started

For those interested in the implementation details, Stéphane has shared a repository containing the Apps Script logic and the specific configuration needed to enable the code_execution tool.

Source: Stéphane Giron on Medium | Code: GitHub Repository

Beyond the Sidebar: Re-imagining Google Workspace Add-ons as Gemini for Workspace Actions

The conversation around Enterprise AI is shifting. In 2024, we were preoccupied with the novelty of the prompt box, marveling at the ability of large language models to draft emails or summarise long documents. By 2026, that novelty has faded. The real frontier is no longer about asking an AI to “write this”; it is about instructing it to “execute this process.” We are witnessing the birth of a managed Action Layer within our digital environments.

The DORA Reality Check: AI as an Amplifier

To understand why this shift is happening, we must look at the data. The 2025 DORA Report provides a sobering reality check for technology leaders. Its central finding is that AI is an amplifier of your existing organisational systems. If your workflows are streamlined and your data is clean, AI accelerates excellence. However, if your processes are fragmented, AI simply creates “unproductive productivity,” generating isolated pockets of speed that eventually lead to downstream chaos.

DORA’s insights suggest that the bottleneck for AI value isn’t the model’s intelligence; it is the “Context Constraint.” AI cannot be effective if it is trapped in a silo, unable to see the broader business environment or act upon the tools that employees use every day. To move from hype to value, we must stop treating AI as a separate chatbot and start treating it as a governed co-worker.

Demystifying MCP

An emerging key component at the heart of this transition will be the Model Context Protocol (MCP). This protocol acts as a universal bridge that allows AI models to connect directly with data sources and local tools without the need for custom code for every new integration. It works by standardising how a model requests information from a database, a file system, or a specific software application. Because this protocol creates a common language between the AI and the external world, developers can swap different models or data sets in and out of their workflows with much less friction.

We are moving away from the era of closed silos. Instead, we see a future where an AI assistant can securely reach into your local environment to read a specific document or query a live database using a single, unified interface. It is a fundamental shift in how we think about connectivity.

Two Philosophies of Action: Managed Features vs. Configurable Platforms

Given the scale and pace of change, it is not surprising that Google’s AI efforts are not singular. The reality is that Gemini Enterprise (the platform formerly known as Agentspace) and Gemini for Workspace are two distinct products developed by separate teams with very different philosophies of control.

Gemini for Workspace primarily operates as a “managed service.” We see a steady expansion of features, but these remain “black boxes” entirely defined and controlled by Google. The user receives the feature, but the administrator has very little say in how that tool is connected, calibrated, or exposed to the model’s reasoning engine. Gemini Enterprise, by contrast, is a more configurable platform. It allows organisations to define their own data connectors, tailor system prompts and even deploy enterprise-owned agents.

From Black Boxes to Federated Actions

Whilst Google continue to expand the actions Gemini for Workspace can perform in products like Google Sheets, these tools remain non-configurable for the domain administrator. The January 23, 2026, Gemini Enterprise release reveals a more radical direction: the expansion of Configurable Actions.

Whilst the internal implementation details of every Gemini feature remain proprietary, Google’s recent announcement of official MCP support suggests that the protocol is becoming the standard connective tissue for their entire AI stack. The Enterprise team is simply the first to expose this plumbing for organisational governance and configuration. To understand the practical power of this protocol when it is stripped of its corporate packaging, we only need look at how MCP is being used in other generative AI tooling.

Lessons from the Gemini CLI

While most Workspace users interact with AI through polished side panels and the Gemini App, some power users are turning to the Gemini CLI. This command-line interface has become a playground for exploring the automation of AI-driven workflows.

Unlike the “black box” nature of Gemini Enterprise and Gemini for Workspace, the Gemini CLI is easy to extend and customise. Users can easily install and create Extensions which can wrap commands, system prompts and MCP tools. The Google Workspace Extension is a wonderful demonstration of the power of MCP. By providing a few lines of configuration, a user grants the model permission to query and manage core productivity data: it can search files in Drive, draft emails in Gmail, dispatch Chat alerts, or organise Calendar appointments directly from the command line.

The value of the CLI lies in its demonstration that generative AI is capable of much more than just summarisation and analysis. By extending its capabilities through MCP, a user can hand-pick the actions they need based on their specific persona and the systems they inhabit. This shift toward action-oriented tools allows the model to move from passive reasoning to active execution without the user ever leaving their primary environment.

The Aspiration: A Managed Action Marketplace

The challenge now is to take this raw, unfiltered power seen in the CLI and translate it into a governed enterprise experience. Individual productivity is a start, but true organisational scale requires a bridge between these flexible protocols and the managed safety of the Workspace environment. I believe the defining moment for enterprise productivity will arrive when Google closes this gap by utilising its existing, battle-tested infrastructure.

For Gemini for Workspace to truly align with the DORA findings on systemic stability, it must move away from the current “one-size-fits-all” managed feature-set and toward a framework of granular administrative control. This is the aspiration for a truly professional Action Layer.

The Google Workspace Add-on Manifest is an existing way for enterprises to customise Google Workspace. Today, developers use a manifest to define which products an add-on supports, such as Gmail or Docs. To enable the next wave of productivity, Google could add the Gemini App (gemini.google.com) as a first-class, configurable product within this manifest. This would effectively turn the “Add-on” from a static sidebar tool into a dynamic Action that Gemini can call upon.

In this vision, a developer defines the location of their MCP server directly within the manifest. This creates a standardised way to deliver agentic tools through the existing Marketplace. This is not about building another static side panel. Rather, it is about injecting new capabilities directly into the core reasoning engine of the Gemini App and the integrated Gemini side panels across Workspace.
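
Purely as a thought experiment, such a manifest entry might look like this. No gemini section exists in the add-on manifest today; every field below is speculative and exists only to illustrate the aspiration:

```json
{
  "addOns": {
    "common": {
      "name": "Tableau for Workspace",
      "logoUrl": "https://example.com/logo.png"
    },
    "gemini": {
      "mcpServers": [
        {
          "name": "tableau",
          "url": "https://mcp.example.com/tableau",
          "authorization": "OAUTH2"
        }
      ]
    }
  }
}
```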

Imagine a “Tableau for Workspace” add-on. Instead of a user switching tabs to query a dashboard, the add-on uses the Tableau MCP toolset to allow Gemini to “see” live visualisations. The user can simply ask their Gemini side panel in Google Docs to “Insert a summary of our regional performance from Tableau,” and the agent executes the query, interprets the data, and drafts the text in situ.

This transformation would provide the necessary governance layer to ensure that AI-accelerated workflows remain under human control. By allowing admins to curate and configure these agentic tools, Google can turn Workspace into a high-quality internal platform. It would magnify the strengths of its users without introducing the chaos of ungoverned automation. The path to enterprise productivity is paved by protocol, not just better models. The winners of the next decade will not be the companies that buy the most AI licenses, but those that have the best platforms for those AI agents to inhabit.

Vertex AI Advanced Service in Apps Script: A Step Forward or a Missed Opportunity?

As Google Apps Script developers, we are used to waiting. We wait for new runtime features, we wait for quotas to reset, and recently, we have been waiting for a first-class way to integrate Gemini into our projects.

With the recent release of the Vertex AI Advanced Service, the wait is technically over. But as detailed in Justin Poehnelt’s recent post, Using Gemini in Apps Script, you might find yourself asking if this was the solution we were actually looking for.

While the new service undoubtedly reduces the boilerplate code required to call Google’s AI models, it brings its own set of frustrations that leave me, and others in the community, feeling somewhat underwhelmed.

The “Wrapper” Trap

On the surface, the new VertexAI service looks like a win. As Justin highlights, replacing complex UrlFetchApp headers with a single VertexAI.Endpoints.generateContent() call is a significant cleanup.

However, this convenience comes with an administrative price tag. The Vertex AI Advanced Service requires a standard Google Cloud project (understandable for billing), but it also requires the creation of an OAuth consent screen. For the majority of internal enterprise applications, I would imagine either a service account or the https://www.googleapis.com/auth/cloud-platform scope with associated IAM roles will be the preferred approach. Both remove the need for a consent screen, but choosing service accounts rules out the Vertex AI Advanced Service entirely.

It begs the question: Why didn’t Google take the approach of the Google Gen AI SDK?

In the Node.js and JavaScript world, the new Google Gen AI SDK offers a unified interface. You can start with a simple API key (using Google AI Studio) for prototyping, and switch to Vertex AI (via OAuth) for production, all without changing your core code logic. The Apps Script service, by contrast, locks us strictly into the “Enterprise” Vertex path. We seem to have traded boilerplate code for boilerplate configuration.

A Third Way: The Community Approach

If you are looking for that Unified SDK experience I mentioned earlier, where you can use the standard Google AI Studio code patterns within Apps Script, there is a third way.

I have published a library, GeminiApp, which wraps UrlFetchApp but mimics the official Google Gen AI SDK for Node.js. This allows you to write code that looks and feels like the modern JavaScript SDK, handling the complex UrlFetchApp configuration under the hood.

Comparing the three approaches: the Advanced Service abstracts away the request complexity, the UrlFetchApp method gives you the transparency and control you often need in production, and the GeminiApp library offers a balance of both.

Disclaimer: As the creator of this library, I admit some bias, but it was built specifically to address the gap.

It is important to note a distinction in scope. Both the Google Gen AI SDK and GeminiApp are focused strictly on generative AI features. The Vertex AI Advanced Service, much like the platform it wraps, offers a broader range of methods beyond just content generation.

If your needs extend into those wider Vertex AI capabilities, but you still require the authentication flexibility of UrlFetchApp (such as using Service Accounts), I have a solution for that as well. My Google API Client Library Generator for Apps Script includes a build for the full Vertex AI (AI Platform) API. This gives you the comprehensive coverage of the Advanced Service with the architectural flexibility of an open-source library.

Here is how you can use the generated client library to authenticate with a Service Account, something impossible with the official Advanced Service:

/**
 * Example using the generated Aiplatform library with a Service Account.
 * Library: https://github.com/mhawksey/Google-API-Client-Library-Generator-for-Apps-Script/tree/main/build/Aiplatform
 */
function callGemini(prompt) {
  const projectId = 'GOOGLE_CLOUD_PROJECT_ID';
  const region = 'us-central1';
  const modelName = 'gemini-2.5-flash';

  const modelResourceName = `projects/${projectId}/locations/${region}/publishers/google/models/${modelName}`;

  // getServiceAccountToken_() is a helper (not shown here) that mints an
  // OAuth access token for the service account, e.g. via a signed JWT.
  const serviceAccountToken = getServiceAccountToken_();

  const vertexai = new Aiplatform({
    token: serviceAccountToken
  });

  const payload = {
    contents: [{
      role: 'user',
      parts: [{
        text: prompt
      }]
    }],
    generationConfig: {
      temperature: 0.1,
      maxOutputTokens: 2048
    }
  };

  const result = vertexai.projects.locations.publishers.models.generateContent({
    model: modelResourceName,
    requestBody: payload
  });

  return result.data.candidates?.[0]?.content?.parts?.[0]?.text || 'No response generated.';
}

When “Advanced” Means “Behind”

There is another catch that Justin uncovered during his testing: the service struggles with the bleeding edge.

If you are trying to access the latest “Preview” models to prototype, such as the highly anticipated gemini-3-pro-preview, the advanced service may fail you. It appears the wrapper doesn’t yet support the auto-discovery needed for these newer endpoints.

In his companion post, UrlFetchApp: The Unofficial Documentation, Justin reminds us why UrlFetchApp is still the backbone of Apps Script development. When the “official” wrapper doesn’t support a specific header or a beta model, UrlFetchApp is the only way to bypass the limitations.
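
A minimal sketch of that direct route, assuming the https://www.googleapis.com/auth/cloud-platform scope is declared in the project manifest so ScriptApp.getOAuthToken() is accepted by Vertex AI (project ID, region and model name are placeholders):

```javascript
// Pure helper: build the regional generateContent endpoint URL.
function buildVertexUrl(projectId, region, model) {
  return `https://${region}-aiplatform.googleapis.com/v1/projects/${projectId}` +
    `/locations/${region}/publishers/google/models/${model}:generateContent`;
}

// Call a model directly over REST, bypassing the Advanced Service wrapper.
function callPreviewModel(prompt) {
  const url = buildVertexUrl('GOOGLE_CLOUD_PROJECT_ID', 'us-central1', 'gemini-3-pro-preview');
  const response = UrlFetchApp.fetch(url, {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() },
    payload: JSON.stringify({
      contents: [{ role: 'user', parts: [{ text: prompt }] }],
    }),
  });
  return JSON.parse(response.getContentText());
}
```

Because you control the URL and headers, the same pattern works for any beta model or preview endpoint the moment it is announced.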

The Verdict

The Vertex AI service is a welcome addition for stable, enterprise-focused applications. But for developers, particularly those who want to test the latest Gemini 3 capabilities, it feels rigid compared to the flexibility seen in other Google developer ecosystems.

It serves as a good reminder that in Apps Script, convenience services are great, but understanding the underlying HTTP requests via UrlFetchApp extends what you can achieve.

The Age of the Agent: Google’s Managed MCP Servers and Community Contributions

Image credit: Kanshi Tanaike

The “USB-C for AI” has officially arrived at Google.

If you have been following the rapid rise of Anthropic’s Model Context Protocol (MCP), you know it promises a standardised way for AI models to connect to data and tools. Until now, connecting an LLM to your Google Drive or BigQuery data meant building custom connectors or relying on fragmented community solutions.

That changed this week with Google’s announcement of fully managed, remote MCP servers for Google and Google Cloud services. But, as is typical of the Google Workspace developer community, we aren’t just waiting for official endpoints—we are building our own local solutions too.

Here is an overview of the different approaches emerging for MCP tools and what the future looks like for Workspace developers.

The Official Path: Managed MCP Servers

Google’s announcement represents a massive shift in how they view AI agents. Rather than asking developers to wrap every granular API endpoint (like compute.instances.insert) into a tool, Google is releasing Managed MCP Servers.

Think of these as “Client Libraries for Agents.” Just as Google provides SDKs for Python or Node.js to handle the heavy lifting of authentication and typing, these managed servers provide a simplified, reasoned interface for agents.

The announcement lists a massive rollout over the “next few months” covering everything from Cloud Run and Storage to the Android Management API. The goal described is to make MCP the standard way agents connect to the entire Google ecosystem. To start with, Google announced MCP servers for:

  • Google Maps: Instead of just raw data, the agent gets “Grounding Lite” to accurately answer location queries without hallucinating.
  • BigQuery: The agent can interpret schemas and execute queries in place, meaning massive datasets don’t need to be moved into the context window.
  • Google Kubernetes Engine (GKE): Agents can diagnose issues and optimize costs without needing to string together brittle CLI commands.

The Community Path: Privacy-First and Scriptable

While managed servers are excellent for scale and standardisation, there are some other notable approaches using local execution.

1. The Gemini CLI Workspace Extension As detailed in a recent tutorial by Romin Irani, the official google-workspace extension for the Gemini CLI takes a privacy-first approach. Instead of running in the cloud, the MCP server runs locally on your machine.

It uses your own local OAuth token, meaning your data flow is strictly Google Cloud <-> User's Local Machine <-> Gemini CLI. This is ideal for tasks like searching Drive, checking your Calendar, or drafting Gmail replies where you want to ensure no third-party intermediary is processing your data.

2. The Power of Apps Script (gas-fakes) For true customisability, Kanshi Tanaike has been pushing Google Apps Script. A recent proof-of-concept using gas-fakes CLI demonstrates how to build MCP tools directly using Apps Script.

This is a big step for Apps Script developers. It allows you to write standard GAS code (e.g., DriveApp.getFilesByName()) and expose it as a tool to an MCP client like Gemini CLI or Google Antigravity. Because it uses the gas-fakes sandbox, you can execute these scripts locally, bridging the gap between your local AI agent and your cloud-based Workspace automation.

The Future: A Universal MCP Strategy?

We are likely moving toward a world where every Google product launch includes three things: API documentation, Client SDKs, and an Official MCP Server Endpoint.

For developers, the choice will come down to granularity and control:

  • Use Managed Servers when you need reliable, standard access to robust services like BigQuery or Maps with zero infrastructure overhead.
  • Use Community/Local Tools (like the Workspace extension or gas-fakes) when you need rapid prototyping, deep customization, or strict data privacy on your local machine.

The “Agentic” future is here, and whether you are using Google’s managed infrastructure or writing your own tools, the ecosystem is ready for you to build.


Mastering Workspace API Authentication in ADK Agents with a Reusable Pattern

For developers seasoned in the Google Workspace ecosystem, the promise of agentic AI is not just about automating tasks, but about creating intelligent, nuanced interactions with the services we know so well. Google’s Agent Development Kit (ADK) provides the foundation, but integrating the familiar, user-centric OAuth 2.0 flow into this new world requires a deliberate architectural approach.

This article explores the patterns for building a sophisticated, multi-capability agent. It explores why you might choose a custom tool implementation (ADK’s “Journey 2”) over auto-generation, and presents a reusable authentication pattern that you can apply to any Workspace API.

The full source code for this project is available in the companion GitHub repository. It’s designed as a high-fidelity local workbench, perfect for development, debugging, and rapid iteration.

The Architectural Crossroads: Journey 1 vs. Journey 2

When integrating a REST API with the ADK, you face a choice, as detailed in the official ADK authentication documentation:

  • Journey 1: The “Auto-Toolify” Approach. This involves using components like OpenAPIToolset or the pre-built GoogleApiToolSet. You provide an API specification, and the ADK instantly generates tools for the entire API surface. This is incredibly fast for prototyping but can lead to an agent that is “noisy” (has too many tools and scopes) and lacks the robustness to handle API-specific complexities like pagination.
  • Journey 2: The “Crafted Tool” Approach. This involves using FunctionTool to wrap your own Python functions. This is the path taken in this project. It requires more initial effort but yields a far superior agent for several reasons:
    • Control: This approach exposes only the high-level capabilities needed (e.g., search_all_chat_spaces), not every raw API endpoint.
    • Robustness: Logic can be built directly into the tools to handle real-world challenges like pagination, abstracting this complexity away from the LLM.
    • Efficiency: The tool can pre-process data from the API, returning a concise summary to the LLM and preserving its valuable context window.

Journey 2 was chosen because the goal is not just to call an API, but to provide the agent with a reliable, intelligent capability.
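To make these trade-offs concrete, here is a minimal Python sketch of a crafted tool in the spirit of the project’s search_all_chat_spaces. This is not the repository’s code: fetch_page is a hypothetical stand-in for the real Chat API client. The point is that pagination and summarisation live inside the tool, invisible to the LLM:

```python
def search_all_chat_spaces(query, fetch_page):
    """Crafted tool: search every Chat space, hiding pagination from the LLM.

    fetch_page(query, page_token) is a stand-in for the real API client and
    returns (results, next_page_token) for one page of matches.
    """
    matches = []
    page_token = None
    while True:
        # Robustness: the tool, not the LLM, walks every page of results.
        results, page_token = fetch_page(query, page_token)
        matches.extend(results)
        if not page_token:
            break
    # Efficiency: return a concise summary, preserving the LLM's context window.
    return {"match_count": len(matches), "top_matches": matches[:5]}
```

In the real project, a function like this is wrapped with FunctionTool and handed to the agent, so the model only ever sees the high-level capability.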

The Core Implementation: A Reusable Pattern

The cornerstone of this integration is a single, centralised function: get_credentials. This function lives in agent.py and is called by every tool that needs to access a protected resource.

It elegantly manages the entire lifecycle of an OAuth token within the ADK session by following a clear, four-step process:

  1. Check Cache: It first looks in the session state (tool_context.state) for a valid, cached token.
  2. Refresh: If a token exists but is expired, it uses the refresh token to get a new one and updates the cache.
  3. Check for Auth Response: If no token is found, it checks if the user has just completed the OAuth consent flow using tool_context.get_auth_response().
  4. Request Credentials: If all else fails, it formally requests credentials via tool_context.request_credential(), which signals the ADK runtime to pause and trigger the interactive user flow.

This pattern is completely generic. You can see in the agent.py file how, by simply changing the SCOPES constant, you could reuse this exact function for Google Drive, Calendar, Sheets, or any other Workspace API.
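The four steps above can be sketched as follows. This is an illustration, not the actual agent.py code: the refresh() and exchange() helpers on tool_context are invented stand-ins for the google-auth refresh and token-exchange calls, and the token is modelled as a plain dict; only get_auth_response() and request_credential() mirror the ToolContext calls described above.

```python
import time

TOKEN_KEY = "oauth_token"  # assumed session-state key, for illustration only

def get_credentials(tool_context):
    state = tool_context.state

    # 1. Check cache: reuse a valid token from session state.
    token = state.get(TOKEN_KEY)
    if token and token["expires_at"] > time.time():
        return token

    # 2. Refresh: an expired token with a refresh token is renewed and re-cached.
    if token and token.get("refresh_token"):
        token = tool_context.refresh(token)   # stand-in for the refresh call
        state[TOKEN_KEY] = token
        return token

    # 3. Check for auth response: the user may have just completed consent.
    auth_response = tool_context.get_auth_response()
    if auth_response:
        token = tool_context.exchange(auth_response)  # stand-in for exchange
        state[TOKEN_KEY] = token
        return token

    # 4. Request credentials: signal the ADK runtime to pause and trigger
    #    the interactive user flow.
    tool_context.request_credential()
    return None
```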

Defining the Agent and Tools

With the authentication pattern defined, building the agent becomes a matter of composition. The agent.py file also defines the “Orchestrator/Worker” structure—a best practice that uses a fast, cheap model (gemini-2.5-flash) for routing and a more powerful, specialised model (gemini-2.5-pro) for analysis.

The Other Half: The Client-Side Auth Flow

The get_credentials function is only half the story. When it calls tool_context.request_credential(), it signals to the runtime that it needs user authentication. The cli.py script acts as that runtime, or “Agent Client.”

The cli.py script is responsible for the user-facing interaction. As you can see in the handle_agent_run function within cli.py, the script’s main loop does three key things:

  1. It iterates through agent events and uses a helper to check for the specific adk_request_credential function call.
  2. When it detects this request, it pauses, prints the authorisation URL for the user, and waits to receive the callback URL.
  3. It then bundles this callback URL into a FunctionResponse and sends it back to the agent, which can then resume the tool call, this time successfully.

This client-side loop is the essential counterpart to the get_credentials pattern.
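A simplified Python sketch of that loop, with ADK events and the FunctionResponse stubbed as plain dicts purely for illustration (the real cli.py works with the ADK runner’s event objects and a genuine FunctionResponse type):

```python
def find_auth_request(event):
    """Return the auth-request function call from an event, if present."""
    for call in event.get("function_calls", []):
        if call["name"] == "adk_request_credential":
            return call
    return None

def handle_agent_run(events, prompt_user):
    """Drive the agent event stream, pausing for OAuth when requested."""
    responses = []
    for event in events:
        auth_call = find_auth_request(event)
        if auth_call:
            # Pause: show the user the authorisation URL and wait for the
            # callback URL they paste back.
            callback_url = prompt_user(auth_call["args"]["auth_uri"])
            # Bundle the callback URL into a FunctionResponse so the agent
            # can resume (and this time complete) the original tool call.
            responses.append({
                "function_response": {
                    "name": "adk_request_credential",
                    "response": {"redirect_uri": callback_url},
                }
            })
        elif event.get("text"):
            print(event["text"])
    return responses
```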

Production Considerations: Moving Beyond the Workbench

This entire setup is designed as a high-fidelity local workbench. Before moving to a production environment, several changes are necessary:

  • Persistent Sessions: The InMemorySessionService is ephemeral. For production, you must switch to a persistent service like DatabaseSessionService (backed by a database like PostgreSQL) or VertexAiSessionService. This is critical for remembering conversation history and, most importantly, for securely storing the user’s OAuth tokens.
  • Secure Credential Storage: While caching tokens in the session state is acceptable for local testing, production systems should encrypt the token data in the database or use a dedicated secret manager.
  • Runtime Environment: The cli.py would be replaced by a web server framework (like FastAPI or Flask) if you are deploying this as a web application, or adapted to the specific requirements of a platform like Vertex AI Agent Engine.

By building with this modular, Journey 2 approach, you create agents that are not only more capable and robust but are also well-architected for the transition from local development to a production-grade deployment.

Explore the Code

Explore the full source code in the GitHub repository to see these concepts in action. The README.md file provides all the necessary steps to get the agent up and running on your local machine. By combining the conceptual overview in this article with the hands-on code, you’ll be well-equipped to build your own robust, production-ready agents with the Google ADK.

Source: GitHub – mhawksey/ADK-Workspace-User-oAuth-Example

Building Agentic Workflows: A 3-Part Series from Aryan Irani on ADK, Vertex AI, and Google Apps Script

 

As Aryan Irani writes in the introduction to the series:

“Thanks to Google’s Agent Development Kit (ADK) and Vertex AI’s Agent Engine, that’s no longer just an idea. These frameworks let you build intelligent agents, deploy them at scale, and integrate them seamlessly into your Google Workspace tools – enabling a new era of agentic productivity. In this three-part tutorial series, we’ll explore exactly how to do that.”

The ability to build and integrate custom AI agents directly into Google Workspace is rapidly moving from a far-off idea to a practical reality. For developers looking for a clear, end-to-end example, Google Developer Expert Aryan Irani has published an excellent three-part tutorial series.

This comprehensive guide, which includes video tutorials for each part, offers a complete end-to-end journey for building agentic workflows. It starts by building a local ‘AI Auditor’ agent with the new Agent Development Kit (ADK)—programming it to verify claims with the Google Search tool. The series then guides developers through the entire cloud deployment process to Vertex AI Agent Engine, covering project setup and permissions. Finally, it ties everything together with Google Apps Script, providing the complete code (using UrlFetchApp and the OAuth2 for Apps Script library) to integrate the agent as a powerful, fact-checking tool inside Google Docs.

Why This Series Matters

This series is a fantastic, practical guide for any Google Workspace developer looking to bridge the gap between classic Apps Script solutions and Google’s powerful new generative AI tools. It provides a complete, end-to-end blueprint for building truly ‘agentic’ workflows inside the tools your organisation already uses.

A big thank you to Aryan Irani for creating and sharing this detailed resource with the developer community!


Automate In-Depth Research with Apps Script and a Gemini AI ‘Research Team’

Automate research with this script, which turns a Google Sheet into an AI ‘research team’ that plans, researches, and writes detailed reports on any topic.

The Gemini App (gemini.google.com) includes a sophisticated Deep Research agent. Its capabilities are incredibly powerful, but it is designed to tackle one complex topic at a time, and the manual process of running each query and collating the results can be very slow.

What if you could deconstruct the capabilities of Deep Research and build a similar agent which is instead orchestrated from a Google Sheet? Thankfully, Google Workspace Developer Expert, Stéphane Giron, has designed and shared an elegant solution that transforms Google Sheets and Apps Script into a powerful, automated, bulk research assistant.

Stéphane’s approach is built on a “Divide and Conquer” strategy, which he outlines in his detailed article on the project.

Instead of just asking a single AI to answer a complex question, this script acts as a manager for a specialised AI team… The process is broken down into three distinct phases:

  1. Phase 1: The Strategist (Plan): A powerful AI model analyses your main query and breaks it down into a logical plan of smaller, essential sub-questions.
  2. Phase 2: The Researcher (Execute): A fast, efficient AI model takes each sub-question and uses targeted Google searches to find factual, concise answers.
  3. Phase 3: The Editor (Synthesise): The strategist AI returns to act as an editor, weaving all the collected research and data into a single, cohesive, and well-written final report.

How the AI Research Team Works

The solution, which is available in full on GitHub, uses a clever combination of different Gemini models and tools, all orchestrated by Google Apps Script:

  • Input: The user simply lists all their research topics in a Google Sheet (e.g., in column A).
  • Phase 1 (Plan): The script loops through each topic. For each one, it calls the Gemini 2.5 Pro model, treating it as the “Strategist.” It uses Gemini’s function-calling capability to force the model to output a structured plan of sub-questions needed to answer the main query.
  • Phase 2 (Execute): The script then takes this list of sub-questions and calls the Gemini 2.5 Flash model for each one, treating it as the fast “Researcher.” This call uses the built-in Google Search tool (grounding) to find up-to-date, factual answers for each specific sub-question.
  • Phase 3 (Synthesise): Finally, with all the collected research, the script calls Gemini 2.5 Pro one last time. In this “Editor” role, the model receives the original query and all the question/answer pairs, synthesising them into a single, comprehensive report.
  • Output: The script creates a new Google Doc for this final report and places a link to it directly in the Google Sheet, next to the original topic.
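Stéphane’s implementation is Apps Script, but the three-phase control flow is easy to see in a language-neutral sketch. In this Python illustration, call_model is a hypothetical stand-in for the Gemini API calls (“pro” for the Strategist/Editor, “flash” with Search grounding for the Researcher); the real script uses function calling and grounding as described above:

```python
def deep_research(topic, call_model):
    # Phase 1 (Plan): the "Strategist" (Gemini 2.5 Pro) breaks the topic
    # into a structured list of sub-questions.
    sub_questions = call_model("pro", f"Plan sub-questions for: {topic}")

    # Phase 2 (Execute): the "Researcher" (Gemini 2.5 Flash with Google
    # Search grounding) answers each sub-question concisely.
    findings = [(q, call_model("flash", q)) for q in sub_questions]

    # Phase 3 (Synthesise): the "Editor" (Pro again) weaves the original
    # query and every question/answer pair into one cohesive report.
    notes = "\n".join(f"Q: {q}\nA: {a}" for q, a in findings)
    return call_model("pro", f"Write a report on {topic} using:\n{notes}")
```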

This “AI team” approach is a fascinating pattern we’re beginning to see emerge in the community. It strongly echoes the “AI Scrum Master” concept shared by Jasper Duizendstra at the recent Google Workspace Developer Summit in Paris. Both solutions smartly move away from basic prompting and instead orchestrate a team of specialised AIs, leading to a far more robust and scalable workflow.

Stéphane’s script is highly customisable, allowing you to set the number of sub-questions to generate, define the output language, and pass the current date to the models to ensure the research is timely.

This is a fantastic example of how to build sophisticated, autonomous AI agents inside Google Workspace. A big thank you to Stéphane Giron for sharing this project with the community.

Get Started

You can find the complete code, setup instructions, and a deeper dive into the architecture at the links below:

Source: Bulk Deep Research with Gemini and Google Apps Script + Code Repository: Bulk-Deep-Research on GitHub