If you have domain-specific knowledge that you want an LLM to leverage, you probably have a use case for fine-tuning. Fine-tuning can significantly improve how well the model understands and responds to your queries, whether it’s legal documents, medical texts, financial reports, or niche industry data.
The most crucial step in this process is structuring your data correctly. If your dataset is well-organized and formatted properly, the rest of the workflow becomes much more manageable. From there, it’s just a matter of setting up a few configurations and automating parts of the process with Apps Script. That’s where things get interesting and surprisingly efficient.
Tired of manually processing invoices? I recently built a demo that automates this tedious task using the power of Gemini, AppSheet’s new Gmail integration, and a custom Apps Script library. Let me show you how it works!
Here’s the setup:
Gmail Integration: AppSheet’s new Gmail integration allows you to trigger automations based on incoming emails. I’ve set it up to monitor a specific Gmail label called “Invoices”.
Apps Script Library: Using my “GeminiApp” (available on Github) which simplifies the interaction with Google’s Gemini AI models directly from Apps Script. This library handles the heavy lifting of making API calls and managing responses.
Controlled Generation: Gemini’s “Controlled Generation” feature lets me define a JSON schema that structures the AI’s output. This is key for extracting invoice data in a consistent, machine-readable format.
The Workflow:
Invoice Received: When an invoice email arrives and is labelled “Invoices”, AppSheet’s Gmail integration kicks in.
Automation Triggered: AppSheet triggers an automation that calls a custom Apps Script function called jsonControlledGeneration.
Data Extraction: The jsonControlledGeneration function uses the GeminiApp library to send the email body to Gemini with a predefined JSON schema for invoice data.
Structured Output: Gemini processes the email content and extracts the relevant invoice details (e.g., invoice number, supplier name, date, amount) in a JSON format that adheres to the schema.
Downstream Processing: The structured JSON output can then be easily returned to the AppSheet automation for further actions, such as automatically populating your data table, updating a database, or triggering a payment process.
If you would like to try this yourself, instructions are provided below.
Want to try it yourself?
To set this demo up you will either need a Google AI Studio API key or a Google Cloud project with Vertex AI enabled. Information on both these setups is included in the GeminiApp Setup Instructions. Once you have these you can follow these steps:
Open the Invoice Tracker template and click ‘Copy and Customize’, then click OK on the Error Creating App: Apps Script function is invalid error
Navigate to appsheet.com and from your ‘recent’ apps open the ‘Invoice Tracker’ app
Open Automations and for the ‘New Invoices’ event under the Gmail Event Source, click Authorize, then
In the ‘Add a new data source’ enter a name e.g. Invoices Trigger, click the Gmail button and follow the authentication steps
Once complete in the AppSheet Automation settings select your Gmail account and a Label to watch
In the Process section click on the GeminiApp task and click on the blue file icon, then select your copied version of the Apps Script project and click Authorize
Once Authorize, from the Function Name select jsonControlledGeneration
Step 7: Click on the blue file icon, then select your copied version of the Apps Script project
The Power of Controlled Generation
Controlled Generation is a powerful way for extracting information from unstructured data like emails. By defining a JSON schema, I can specify exactly what information I want Gemini to extract and how it should be structured. This ensures that the output is consistent and reliable, eliminating the need for manual cleanup or post-processing.
Here’s an example of a simple JSON schema for invoice data:
const schema = {
"type": "object",
"properties": {
"Invoice Reference": {
"type": "string",
"description": "Unique identifier for the invoice"
},
"Supplier Name": {
"type": "string",
"description": "Name of the supplier"
},
"Invoice Date": {
"type": "string",
"description": "Date the invoice was issued",
"format": "date"
},
"Due Date": {
"type": "string",
"description": "Date the invoice is due",
"format": "date"
},
"Invoice Amount": {
"type": "number",
"description": "Total amount due on the invoice"
},
"Notes": {
"type": "string",
"description": "Additional notes related to the invoice",
"nullable": true
}
},
"required": ["Invoice Reference"]
}
Creating JSON Schemas with Gemini
Creating JSON schemas can seem a bit daunting at first, but Gemini can actually help you with that too! If you have sample data in a Google Sheet, you can use the Gemini Side Panel to generate a schema automatically. Just highlight the data and ask Gemini to create a JSON schema for you. You can even provide a description for each property to make your schema more understandable. Below is a prompt you can use in the Gemini Sheet Side Panel to generate a schema for your own data:
I'm using Controlled Generation with the Gemini API as described in the document https://drive.google.com/file/d/1ETKHlEUDQzJ-f2fmAzsuDjcwdt1D7R2y/view?usp=drive_link
I need help creating a JSON schema to capture data from a screen.
Could you generate a JSON schema suitable for using Controlled Generation with the Gemini API? I want to extract specific information from what's displayed on my screen.
Here are my requirements:
* **Comprehensive Schema:** The schema should be designed to capture a variety of relevant data points from the screen.
* **Detailed Descriptions:** Please include a clear and concise \`description\` for each \`property\` in the schema. This will help me understand the purpose of each field.
* **Format Specification:** If any columns contain date or datetime data, please use the \`format\` field to specify the appropriate format (e.g., "date", "date-time"). This is crucial for accurate data parsing.
* **Output Example:** Please provide the schema in the following format:
```
const schema = {
description: "Description of the data",
type: "array", // or "object", depending on the structure
items: { // If type is array
type: "object",
properties: {
propertyName: {
type: "string", // or other appropriate type
description: "Description of the property",
format: "date", // or "date-time", if applicable
nullable: false, // or true
},
// ... more properties
},
required: ["propertyName"], // If any properties are required
},
properties: { // If type is object
propertyName: {
type: "string", // or other appropriate type
description: "Description of the property",
format: "date", // or "date-time", if applicable
nullable: false, // or true
},
// ... more properties
},
required: ["propertyName"], // If any properties are required
};
```
Limitations and future developments
While the beta Gmail integration in AppSheet marks a significant new feature, it’s important to note a current limitation is support for processing email attachments. Currently, the integration focuses on metadata such as sender name, subject, and message body, but the AppSheet team have acknowledged attachment support will be added in the near future.
Looking ahead, at Google Cloud Next 2024 the AppSheet team announced an upcoming “Gemini Extract” feature, currently in private preview. This feature intends to include a native Gemini ‘controlled generation’ feature which would let app creators select the data fields they would like populated from sources including images and text. This should be a more intuitive approach to data extraction, directly integrating Gemini capabilities into AppSheet. The Next video includes a Google URL to sign up to the Gemini Extract feature https://goo.gle/appsheet-previews?r=qr
Summary
The Invoice Tracker example hopefully highlights the opportunity for streamlined solutions with data extraction with AppSheet’s Gmail integration, Gemini, and Apps Script.The GeminiApp library also simplifies the integration of Google’s Gemini AI models into Google Workspace, providing developers with tools to create sophisticated AI-powered applications.
Using the structured JSON output with Controlled Generation can help AppSheet creators by making it easier to ensure you get the data back in a suitable format including the type of data you need, such as dates. With the GeminiApp library, rapid prototyping is achievable, in the ‘Invoice Tracker’ example I was able to get a functional prototype up and running in under 30 minutes.
AppSheet’s Gmail integration, generally available to all AppSheet and Google Workspace Core users, can trigger automations directly from incoming emails without requiring app deployment. Combined with Apps Script functions this opens the door to some powerful opportunities for AppSheet creators. Integrating Gemini-powered AI extraction with AppSheet and Apps Script provides an innovative solution for automating data extraction from emails. By taking advantage of these capabilities, citizen developers can create efficient and user-friendly solutions.
Discover how to combine Imagen 3 and Google Sheets for rapid image creation. Automate your workflow and generate visuals in bulk.
Have you ever wished you could create a bunch of images without the hassle of typing in each prompt one by one? Well, Stéphane Giron has shared a clever solution using Google’s Imagen 3, Google Sheets, and Apps Script to streamline this process. While Gemini for Workspace is rolling out to more users and includes Imagen 3 for image creation in apps like Slides and the Gemini side panel, it doesn’t offer an easy way to create images in bulk. That’s where this method comes in. Instead of making images one at a time, this approach allows you to generate multiple images at once, saving you time and effort.
To get started with this method, you’ll need a Google Cloud Project and creating a service account (see the Build an AI slides reviewer with Gemini tutorial for instructions on setting this up). Once that’s set up, you create a Google Sheet with a column for your text prompts, and another for the links to the generated images. The images are automatically saved into a specific folder in your Google Drive.
The real magic happens with Apps Script. It uses Vertex AI to connect with the Imagen 3 model. The script reads each prompt from your sheet, sends it to Imagen 3 to create an image, and then stores the image in Google Drive. It also helpfully adds the image’s link back into the Google Sheet.
Check out the source post for the required code and instructions.
The GeminiApp library, designed to bring Google’s Gemini AI models into your Google Apps Script projects, has just received a major update. This new release, version 2025.01, introduces significant enhancements that expand the library’s capabilities for building sophisticated AI-powered applications within Google Workspace.
Here’s a quick look at what’s new:
Seamless Transition: The updated library allows for a smooth transition from Google AI Studio to Vertex AI.
JSON-Controlled Generation: Generate content in JSON format, either by providing a schema or allowing the model to infer it from your prompt.
Code Execution: Generate and execute code directly within prompts using gemini-2.0 models.
System Instructions: Guide the model’s behavior by providing system instructions during initialization.
Caching: Improve efficiency and reduce token usage by caching file uploads.
Easier Copy/Paste: The library now supports initialization using both new GeminiApp() and new GoogleGenerativeAI(), making it easier to copy code from Google AI Studio.
The update includes over 1.5K new lines of code, primarily in src/GeminiApp.js, src/GoogleAICacheManager.js, and tests/tests.js. Existing projects can be updated by replacing the existing GeminiApp.gs code with the updated library. This update provides a broader set of tools to create more personalized and efficient workflows.
For those new to the GeminiApp library, multiple setup options are available. Check out the updated examples and test cases in the README.md file to get started.
This update marks a step forward for developers integrating Gemini into Google Workspace. With new features and the ability to move between Google AI Studio and Vertex AI, the possibilities are vast.
Feel free to share your use cases, code improvements, and feature requests!
This report proposes a novel learning method using Gemini to automate Q&A generation, addressing the challenges of manual Q&A creation. By integrating with Google tools, this approach aims to enhance learning efficiency, accessibility, and personalization while reducing costs.
The rapid advancement of technology has offered both opportunities and challenges to the education sector. While technology can be a valuable tool for supporting teaching and learning, concerns about its appropriate use have existed for a long time.
The education sector is witnessing an increase in AI tools, each promising to enhance teaching and learning. However, the quality of these tools varies significantly. Some are better designed and can – if used appropriately – can be beneficial, while others are poorly designed. This can make it challenging for educators to find the right solutions that meet their specific needs.
This blog post by Kanshi Tanaike sheds light on the inner workings of some commercial AI tools developed for educators. It demonstrates how these tools can use generative AI to create multiple-choice questions (MCQs) on a given topic. In this particular solution, Google Forms are used in the process, with questions and answers automatically generated by the Gemini API for the user to answer in a Google Form.
For educators who have experience with Google Apps Script, this project looks like a useful starting point to refine and create your own solution or simply used to gain insights into the functioning of similar commercial tools.
Surely, you have also experienced having a presentation in which you needed to replace old content with new. Replacing text is very simple because you just need to use the Replace function, and you can either do it in the Google Slides user interface.
The problem arises when you need to replace one image with another, for example, if your corporate logo is updated to a new graphic design or if one of your favorite cloud services updates its icons (Gmail, blink blink ;-) It’s still somewhat bearable with one presentation, but what do you do when, like me, you have thousands of Google Slides files on your Google Drive?
This post explores a clever application of the Gemini API’s multimodal capabilities, created by Ivan Kutil. His code utilizes GenAI to automatically detect outdated logos within your Google Slides presentations. The original blog post (March 2024) used the Gemini Pro Vision API. As a sign of how rapidly this area evolves, Google now recommends switching to Gemini 1.5 Flash or Gemini 1.5 Pro.
Switching to the newer APIs is very straightforward as all you need to do is search the source code for models/gemini-pro-vision and replace with models/gemini-1.5-flash or models/gemini-1.5-pro.
For Apps Script projects I lean towards Gemini 1.5 Flash as it is designed for speed. I’m also always looking to opportunities to test the GeminiApp library for Apps Script and very quickly I was able to fork Ivan’s code and use Gemini 1.5 Flash using a service account.
While GeminiApp requires some initial setup, it offers significant advantages:
Easy Model Testing: Experiment with various models quickly.
Built-in Features: Includes functionalities like exponential backoff.
Google Workspace Add-on for Google Drive, which uses AI to recommend new names for the selected Doc in Google Drive by passing the body of the document within the AI prompt for context.
Tired of staring at “Untitled document” in Google Drive? At Google Cloud Next ’24 and I/O 2024, the Google Workspace DevRel team showcased new Apps Script samples utilizing the Gemini API for AI-powered functionality.
One such sample, “Name with Intelligence” by Charles Maxson, helps developers overcome this common hurdle. This Google Drive add-on leverages the Gemini API to suggest relevant titles for your documents, saving you valuable time and effort.
This innovative Google Drive add-on tackles the common struggle of naming untitled documents. It utilizes the Gemini API to analyse the content of your Google Doc and suggest relevant titles based on its understanding of the text. This not only saves you time brainstorming titles, but also ensures your documents are clearly named for easy searchability later.
For those interested in my GeminiApp library, you can also find the sample here.
Take Your Development Further
This is just a glimpse of what’s possible with the Gemini API and Apps Script. With a little creativity, you can develop your own AI-powered Google Workspace add-ons to streamline your workflows and boost your productivity.
I encourage you to experiment and create innovative solutions that enhance your Google Workspace experience!
This tutorial shows how to make a Google Chat app that answers questions based on conversations in Chat spaces with generative AI powered by Vertex AI with Gemini. The Chat app uses the Google Workspace Events API plus Pub/Sub to recognize and answer questions posted in Chat spaces in real time, even when it isn’t mentioned.
Here is another great tutorial from the Google Chat DevRel team, this time showcasing how the Google Workspace Events API and some new Google Chat UI elements can be used to turn a Chat space into a Gemini Pro powered knowledge base.
The sample solution will let you consume your Chat space message history into a Firestore database. The Chat app is an intelligent agent that can then monitor for new questions and make suggestions using Gemini to generate content based on the previous messages.
I tried to see if an AI that’s good at writing could also make smart moves in a game. I chose Battleship and set it up in Google Sheets to play against Gemini, the AI. The result was mixed. On one hand, yes, Gemini could play the game. It followed the basic rules and even managed to sink some of my ships. This was a big deal, especially since it took me a ridiculous number of days of coding to get there, and I nearly gave up at one point.
Dmitry Kostyuk has shared a blog post detailing his experiment pitting the Gemini API against himself in a game of Battleship. As explained by Dmitry while Gemini could follow the rules and even sink some ships, it needed help to avoid basic mistakes, revealing that AI still has room to grow in the realm of strategic games.
Dmitry built the game in Google Sheets and the source code is linked from the post. To guide Gemini, Kostyuk crafted detailed prompts outlining the game’s mechanics and decision-making logic. However, he encountered challenges due to Gemini’s limitations in providing strategic responses. Despite these hurdles, the project yielded valuable insights into prompting techniques for AI decision-making.
The Slides Advisor project is an open source, Google Workspace Add-On that uses artificial intelligence (AI) technology to review and give feedback on your presentations, whenever you need it. The project uses Google’s Gemini API image and text processing features to analyze your Google Slides against a set of guidelines and lets you know how you are doing against those guidelines. Even better, you can customize those guidelines to follow your organization’s recommendations, or remind you to improve your presentations based on rules you define for yourself.
We’ve all been there: scrambling to cram too much information onto a single slide. But what if you had an AI assistant to give your presentations a once-over?
The Slides Advisor is a free, open-source Google Workspace Add-on for Google Slides that uses the power of AI to analyze your presentations. It checks your slides against customisable guidelines, helping you ensure they’re clear, concise, and visually appealing.
Everything you need to get started is covered in the source post by Joe Fernandez and Steve Bazyl. One of the highlights for me is seeing how you can setup a service account to access the Gemini AI API with Google Apps Script. The post also covers how you can modify the prompt you use to get different responses from Gemini.