AppsScriptPulse

Go Beyond the Gemini App: Building Custom AI Tools with Gemini CLI and Apps Script

This article explores the integration of the Gemini Command-Line Interface (CLI) with Google Sheets using the Model Context Protocol (MCP). It demonstrates how to leverage the open-source projects MCPApp and ToolsForMCPServer to create a bridge between the Gemini CLI and Google Workspace. This enables users to perform powerful data automation tasks, such as creating, reading, and modifying tables in Google Sheets directly from the command line, using natural language prompts. The article provides practical examples and sample prompts to illustrate the seamless workflow and potential for building sophisticated, AI-powered applications within the Google Cloud ecosystem.

For developers who work in the command line, the introduction of Google’s Gemini Command-Line Interface (CLI) has opened up new ways to interact with AI. The open-source tool allows you to bring the power of Gemini models directly into your terminal, making it a useful companion for coding assistance, content generation, and task automation. While the CLI can interact with your local file system and the internet, its standard capabilities don’t provide a direct path to your personal data within Google Workspace.

But what if you could extend it? What if you could create a secure connection, allowing the Gemini CLI to interact with your own files in Drive, events in Calendar, or emails in Gmail?

Over the past month, Google Workspace Developer Expert Kanshi Tanaike has published an insightful series of blog posts that provides a practical blueprint for achieving just that. In this series, he demonstrates how to use the flexibility of Google Apps Script to create a powerful, personalised bridge between the Gemini CLI and the entire suite of Google Workspace services.

At the centre of Kanshi’s solution is a simple and effective architecture. He shows readers how to build a Model Context Protocol (MCP) server using a Google Apps Script project deployed as a Web App. This server acts as the secure intermediary between the Gemini CLI and your Google account.

The key advantage of this approach lies in its security and simplicity. By using Apps Script, the complex and critical OAuth2 authorisation is handled natively within the Google ecosystem. This means developers can create a robust tool without the overhead of managing credentials or setting up separate cloud infrastructure.

Throughout the series, Kanshi doesn’t just provide the theory; he delivers the tools to make it happen. He has developed and shared two key open-source Apps Script libraries, MCPApp and ToolsForMCPServer, which significantly simplify the process. These libraries provide a ready-to-use framework and a rich set of functions for interacting with Workspace services.

The result is a system capable of executing complex, multi-step automations from a single, natural-language prompt. The examples in his posts speak for themselves:

  • Automated Data Entry: Fetching a multi-day weather forecast and populating it directly into a Google Sheet.
  • Content Creation: Uploading a local PDF to Drive, generating a summary, creating a new Google Slides presentation from that summary, and emailing it to a colleague.
  • Interactive Tasks: Generating a custom survey in Google Forms and sending out the link via Gmail.

This extensibility is what sets the approach apart. While a standard tool like the Gemini App (gemini.google.com) provides a familiar conversational interface, it offers a fixed set of features. This CLI-based method allows developers to keep that same intuitive, conversational style of interaction but makes it completely extensible. With the CLI and a custom MCP server, developers are no longer limited—they can build bespoke tools to solve their specific challenges.

For example, a common request from Workspace users is the ability to generate an entire presentation from source material. Using this framework, a developer could easily build a custom tool to do just that, triggered by a single command.

This series is a great example of the innovation happening within the Google Workspace developer community. While this command-line approach is naturally suited for developers and power users comfortable with scripting, it offers a level of control and automation that is difficult to achieve through graphical interfaces. It provides a clear, powerful, and extensible pattern for anyone in this group looking to explore their own sophisticated AI-powered automations.

We highly recommend diving into the series to see what’s possible. You can start with the most recent post, which links back to the previous articles in the series:

Source: Next-Level Data Automation: Gemini CLI, Google Sheets, and MCP

More than meets the AI: How AppSheet and the Gemini API can transform businesses

Image credit: Google

I recently explored the transformative power of Generative AI (GenAI) and how it’s reshaping the business landscape in a thought leadership piece for my employer, Appsbroker | CTS. Drawing from my experience in Gemini for Workspace pilots and custom GenAI solutions using the Gemini API in Vertex AI, I highlighted how AppSheet, a no/low-code platform, can be a game-changer.

AppSheet simplifies the integration of GenAI capabilities, enabling rapid prototyping and tailored solutions that deliver real-world results, whilst still benefiting from integrations into Google Workspace. I believe it’s a cost-effective and impactful way to harness GenAI’s potential without blowing the budget. In the article, I share some examples and the benefits of using AppSheet for GenAI innovation.

Curious to learn more? Join me at the Google Workspace Developer Summit in Berlin on September 17th, where I’ll be discussing AppSheet integrations, Gemini Function Calling, and more.

Source: More than meets the AI: why planning ahead is vital to reap the rewards of GenAI

Using Google Forms and the Gemini API to automate creation of multiple choice questions (MCQs)

This report proposes a novel learning method using Gemini to automate Q&A generation, addressing the challenges of manual Q&A creation. By integrating with Google tools, this approach aims to enhance learning efficiency, accessibility, and personalization while reducing costs.

The rapid advancement of technology has offered both opportunities and challenges to the education sector. While technology can be a valuable tool for supporting teaching and learning, concerns about its appropriate use have existed for a long time.

The education sector is witnessing an increase in AI tools, each promising to enhance teaching and learning. However, the quality of these tools varies significantly. Some are better designed and can – if used appropriately – can be beneficial, while others are poorly designed. This can make it challenging for educators to find the right solutions that meet their specific needs.

This blog post by Kanshi Tanaike sheds light on the inner workings of some commercial AI tools developed for educators. It demonstrates how these tools can use generative AI to create multiple-choice questions (MCQs) on a given topic. In this particular solution, Google Forms are used in the process, with questions and answers automatically generated by the Gemini API for the user to answer in a Google Form.

For educators who have experience with Google Apps Script, this project looks like a useful starting point to refine and create your own solution or simply used to gain insights into the functioning of similar commercial tools.

Source: A Novel Approach to Learning: Combining Gemini with Google Apps Script for Automated Q&A

Smart replacing images in Google Slides with Gemini Pro API and Vertex AI

Image credit: Ivan Kutil

Surely, you have also experienced having a presentation in which you needed to replace old content with new. Replacing text is very simple because you just need to use the Replace function, and you can either do it in the Google Slides user interface.

The problem arises when you need to replace one image with another, for example, if your corporate logo is updated to a new graphic design or if one of your favorite cloud services updates its icons (Gmail, blink blink ;-) It’s still somewhat bearable with one presentation, but what do you do when, like me, you have thousands of Google Slides files on your Google Drive?

This post explores a clever application of the Gemini API’s multimodal capabilities, created by Ivan Kutil. His code utilizes GenAI to automatically detect outdated logos within your Google Slides presentations. The original blog post (March 2024) used the Gemini Pro Vision API. As a sign of how rapidly this area evolves, Google now recommends switching to Gemini 1.5 Flash or Gemini 1.5 Pro.

Switching to the newer APIs is very straightforward as all you need to do is search the source code for models/gemini-pro-vision and replace with models/gemini-1.5-flash or models/gemini-1.5-pro.

For Apps Script projects I lean towards Gemini 1.5 Flash as it is designed for speed. I’m also always looking to opportunities to test the GeminiApp library for Apps Script and very quickly I was able to fork Ivan’s code and use Gemini 1.5 Flash using a service account.

While GeminiApp requires some initial setup, it offers significant advantages:

  • Easy Model Testing: Experiment with various models quickly.
  • Built-in Features: Includes functionalities like exponential backoff.

If you would like to explore here is a sample slide deck you can copy which has the container bound forked code (if you don’t want to use a service account here is more information on other setup options).

Source: Smart replacing images in Google Slides with Gemini Pro API and Vertex AI

Answer questions based on Google Chat conversations with a Gemini API powered Chat app

This tutorial shows how to make a Google Chat app that answers questions based on conversations in Chat spaces with generative AI powered by Vertex AI with Gemini. The Chat app uses the Google Workspace Events API plus Pub/Sub to recognize and answer questions posted in Chat spaces in real time, even when it isn’t mentioned.

Here is another great tutorial from the Google Chat DevRel team, this time showcasing how the Google Workspace Events API and some new Google Chat UI elements can be used to turn a Chat space into a Gemini Pro powered knowledge base.

The sample solution will let you consume your Chat space message history into a Firestore database. The Chat app is an intelligent agent that can then monitor for new questions and make suggestions using Gemini to generate content based on the previous messages.

There is quite a bit of setup required as part of this tutorial, but it provides a solid foundation for quickly scaling this to your needs.  A demonstration and explanation of the sample app was given as part of the Google Chat apps and APIs: Build connected workflows for the hybrid workplace session at Google Cloud Next and a recording might be available soon!

Source: Answer questions based on Chat conversations with a Gemini AI Chat app  |  Google Chat  |  Google for Developers

I challenged Gemini to a game of battleship in Google Sheets. Here’s what happened 🚢

 

I tried to see if an AI that’s good at writing could also make smart moves in a game. I chose Battleship and set it up in Google Sheets to play against Gemini, the AI. The result was mixed. On one hand, yes, Gemini could play the game. It followed the basic rules and even managed to sink some of my ships. This was a big deal, especially since it took me a ridiculous number of days of coding to get there, and I nearly gave up at one point.

Dmitry Kostyuk has shared a blog post detailing his experiment pitting the Gemini API  against himself in a game of Battleship. As explained by Dmitry  while Gemini could follow the rules and even sink some ships, it needed help to avoid basic mistakes, revealing that AI still has room to grow in the realm of strategic games.

Dmitry built the game in Google Sheets and the source code is linked from the post. To guide Gemini, Kostyuk crafted detailed prompts outlining the game’s mechanics and decision-making logic. However, he encountered challenges due to Gemini’s limitations in providing strategic responses. Despite these hurdles, the project yielded valuable insights into prompting techniques for AI decision-making.

Source: I Challenged Gemini to a Game of Battleship. Here’s What Happened.

Boost your presentations with AI: A Google Apps Script tutorial for a Google Slides reviewer assistant

The Slides Advisor project is an open source, Google Workspace Add-On that uses artificial intelligence (AI) technology to review and give feedback on your presentations, whenever you need it. The project uses Google’s Gemini API image and text processing features to analyze your Google Slides against a set of guidelines and lets you know how you are doing against those guidelines. Even better, you can customize those guidelines to follow your organization’s recommendations, or remind you to improve your presentations based on rules you define for yourself.

We’ve all been there: scrambling to cram too much information onto a single slide. But what if you had an AI assistant to give your presentations a once-over?

The Slides Advisor is a free, open-source Google Workspace Add-on for Google Slides that uses the power of AI to analyze your presentations. It checks your slides against customisable guidelines, helping you ensure they’re clear, concise, and visually appealing.

Everything you need to get started is covered in the source post by Joe Fernandez and Steve Bazyl. One of the highlights for me is seeing how you can setup a service account to access the Gemini AI API with Google Apps Script. The post also covers how you can modify the prompt you use to get different responses from Gemini.

Source: Build an AI Slides Reviewer with Gemini  |  Google AI for Developers

How to automatically rename files in Google Drive with Apps Script and the Vertex AI Gemini API

Quickly rename files in Google Drive with Apps Script and Google Gemini AI. The script will automatically rename the files with a descriptive name based on the image content.

Amit Agarwal is celebrating 20 years of ‘Digital Inspiration’, one of my go to resources for Google Workspace tech tips, tutorials and how-to guides. This recent post from Amit is a great example of the types of solutions he regularly shares. This time Amit is looking at how the Generative AI capabilities of the Gemini API can be used to help rename images in Google Drive. To achieve this there Amit shares handy Apps Script snippets for finding images in a Google Drive folder, grabbing the thumbnail images as Base64 encoded strings before calling the Gemini Pro Vision API to get suggested filenames with the prompt:

Analyze the image content and propose a concise, descriptive filename in 5-15 words without providing any explanation or additional text. Use spaces instead of underscores.

For this solution an API key for Gemini Pro Vision is generated in Google AI Studio (formerly MakerSuite). There are still geographic restrictions on access to Google AI Studio, but given Apps Script runs on Google servers once you have a API key you can continue to use it without having to proxy a location.

Image credit: Amit Agarwal

Adding image capabilities to the GeminiApp Apps Script library has been on my TODO. Now I’ve got an example to play with it should be easier to do .. thank you Amit Agarwal.

Source: How to Automatically Rename Files in Google Drive with Apps Script and AI – Digital Inspiration

Automatically creating descriptions of files in Google Drive using Gemini Pro API and Google Apps Script

Gemini LLM, now a Vertex AI/Studio API, unlocks easy document summarization and image analysis via Google Apps Script. This report details an example script for automatically creating the description of the files on Google Drive and highlights seamless integration options with API keys.

In this blog post, Kanshi Tanaike shows how you can automatically create descriptions for files on Google Drive using the Gemini Pro API with Google Apps Script. The post includes a step-by-step guide on how to set up and use the Gemini Pro API by generating a key is Google AI Studio (formerly Maker Suite). There are geographic restrictions on Google AI Studio, but you can call Gemini Pro from a Google Cloud project with a little more setup (a previous post sharing Tutorial: Respond to incidents with Google Chat, Vertex AI, and Apps Script outlines a general approach for connecting Apps Script to Google Vertex AI services.)

The post includes a couple of examples showing how Gemini can be used to provide responses to both text and visual prompts. Gemini Pro is still in public preview and as Kanshi Tanaike highlights rate limiting will likely prevent putting these solutions into production just now. At this point hopefully there is enough to start experimenting with Google Workspace integrations to Gemini.

Source: Automatically Creating Descriptions of Files on Google Drive using Gemini Pro API with Google Apps Script