AppsScriptPulse

Nano Steps, Giant Leaps: Exploring On-Device AI in Chrome for Workspace Editor Add-ons

The landscape of what’s possible within the browser is quietly undergoing a significant shift, and for Google Workspace Add-on developers, this could be a game-changer. Chrome’s AI mission is simple yet powerful: to ‘make Chrome and the web smarter for all developers and all users.’ We’re seeing this vision begin to take shape with the emergence of experimental, built-in AI APIs in Google Chrome, designed to bring powerful capabilities using models like Gemini Nano directly to the user’s device.

There is a growing suite of these on-device APIs. This includes the versatile Prompt API, specialised Writing Assistance APIs (like Summarizer, Writer, and Rewriter), Translation APIs (Language Detector and Translator), and even a newly introduced Proofreader API. For many existing Workspace Add-on developers, some of the more task-specific APIs could offer a relatively straightforward way to integrate AI-powered enhancements.

However, my focus for this exploration, and the core of the accompanying demo Add-on being introduced here, is the Prompt API. What makes this API particularly compelling for me is its direct line to Gemini Nano, a model that runs locally, right within the user’s Chrome browser. This on-device approach means that, unlike solutions requiring calls to external third-party GenAI services, interactions can happen entirely client-side. The Prompt API provides web applications, including Google Workspace Editor Add-ons, with an open-ended way to harness this local AI for text-based generative tasks.

To put the Prompt API’s text processing abilities through its paces in a practical Workspace context, I’ve developed a Google Workspace Add-on focused on text-to-diagram generation. This post delves into this demonstration and discusses what on-device AI, through the versatile Prompt API, could mean for the future of Workspace Add-on development, including its emerging multimodal potential.

Why This Matters: New Horizons for Google Workspace Developers

Using an on-device LLM like Gemini Nano offers several key benefits for Workspace Add-on developers:

  • Enhanced Data Privacy & Simplified Governance: Sensitive user data doesn’t need to leave the browser; no external API calls are made to third-party servers for the AI processing itself. This is a huge plus for privacy and can simplify data governance, including Google Workspace Marketplace verification and Add-on data privacy policies.
  • Potential for Cost-Free GenAI (with caveats!): Client-side processing can reduce or eliminate server-side AI costs for certain tasks. Remember, “Nano” is smaller than its cloud counterparts, so it’s best for well-scoped features. This smaller size means developers should think carefully about their implementation, particularly prompt design, to achieve the desired accuracy: the model’s capacity to handle very broad or complex instructions without guidance is lower than that of larger models.
  • Improved User Experience & Offline Access: Expect faster interactions thanks to minimal network latency, plus features that can keep working without a connection once the model is on the device.

The biggest takeaway here is the opportunity to explore new avenues for GenAI capabilities in your Add-ons, albeit with the understanding that this is experimental territory and on-device models have different characteristics and capacities compared to larger, cloud-based models.

Proof of Concept: AI-Powered Text-to-Diagram Add-on

To showcase the tangible possibilities of on-device text processing, the demonstrator Add-on (available in the Nano-Prompt-AI-Demo GitHub repository) focuses on a text-to-diagram use case:

  • Users can describe a diagram in natural language (e.g., “flowchart for a login process”).
  • The Add-on then uses Gemini Nano via the Prompt API to convert this text description into MermaidJS diagram code.
  • It also allows users to directly edit the generated MermaidJS code, see a live preview, and utilise an AI-powered “Fix Diagram” feature if the code has errors.
  • Finally, the generated diagram can be inserted as a PNG image into their Google Workspace file.

Nano Prompt API Demo

This example illustrates how the Prompt API can be used for practical tasks within a Google Workspace environment.

Under the Bonnet: Utilising the Chrome Gemini Nano Prompt API for Text

The Add-on interacts with Gemini Nano via client-side JavaScript using the LanguageModel object in the Sidebar.html file. I should also highlight that all of the Sidebar.html code was written by the Gemini 2.5 Pro model in gemini.google.com, with my guidance, which included providing the appropriate developer documentation and this explainer for the Prompt API.

The Add-on’s core logic for text-to-diagram generation includes:

  • Session Creation and Prompt Design for Gemini Nano: A LanguageModel session is created using LanguageModel.create().
  • Generating Diagrams from Text: The user’s natural language description is sent to the AI via session.prompt(textDescription).
  • AI-Powered Code Fixing: If the generated or manually entered MermaidJS code has errors, the faulty code along with the error message is sent back to the model for attempted correction.

Given that Gemini Nano is, as its name suggests, a smaller LLM, careful prompt design is key to achieving optimal results. In this demonstrator Add-on, for instance, the initialPrompts array (the system prompt) plays a crucial role. It not only instructs the AI to act as a MermaidJS expert and to output only raw MermaidJS markdown, but also includes two explicit examples of MermaidJS code within those instructions.

Providing such “few-shot” examples within the system prompt was found to significantly improve the reliability and accuracy of the generated diagram code from text descriptions. This technique helps guide the smaller model effectively.
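To make this concrete, here is a minimal sketch of the session creation, few-shot system prompt, and prompting flow described above. It is based on the Prompt API explainer at the time of writing (the API is experimental and the exact surface may change), and the prompt text here is illustrative rather than the Add-on’s actual prompts:

```
// Sketch only: the API surface follows the Prompt API explainer and may change.
async function createDiagramSession() {
  // Check the model can run on this device before creating a session.
  const availability = await LanguageModel.availability();
  if (availability === 'unavailable') {
    throw new Error('Gemini Nano is not available on this device.');
  }

  // initialPrompts carries the system instruction plus "few-shot"
  // user/assistant example pairs that guide the smaller model.
  return LanguageModel.create({
    initialPrompts: [
      {
        role: 'system',
        content: 'You are a MermaidJS expert. Respond with only raw MermaidJS markdown.'
      },
      { role: 'user', content: 'flowchart for a login process' },
      {
        role: 'assistant',
        content: 'graph TD\n  A[Start] --> B{Valid credentials?}\n' +
                 '  B -- Yes --> C[Dashboard]\n  B -- No --> A'
      }
    ]
  });
}

async function textToDiagram(session, description) {
  // For the "Fix Diagram" feature, the same session.prompt() call can be
  // reused with the faulty code plus the Mermaid error message appended.
  return session.prompt(description);
}
```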

Navigating Experimental Waters: Important Considerations (and Reassurances)

It’s important to reiterate that the majority of these built-in AI APIs are still experimental. Functionality can change, and specific Chrome versions and flags are often required. I recommend referring to the official Chrome AI documentation and joining the Early Preview Program for the latest details and updates.

Before you go updating your popular production Google Workspace Add-ons, be aware of the current system prerequisites. As of this writing, these include:

  • Operating System: Windows 10 or 11; macOS 13+ (Ventura and onwards); or Linux.
  • Storage: At least 22 GB of free space on the volume that contains your Chrome profile is necessary for the model download.
  • GPU: A dedicated GPU with strictly more than 4 GB of VRAM is required for performant on-device model execution.

Currently, APIs backed by Gemini Nano do not yet support Chrome for Android, iOS, or ChromeOS. For Workspace Add-on developers, the lack of ChromeOS support is a significant consideration.

However, Google announced at I/O 2025 in the ‘Practical built-in AI with Gemini Nano in Chrome’ session that the text-only Prompt API, powered by Gemini Nano, is generally available for Chrome Extensions starting in Chrome 138. While general web page use of the Prompt API remains experimental, this move hopefully signals a clear trajectory from experiment to production-ready capabilities.

Bridging the Gap: The Hybrid SDK

To address device compatibility across the ecosystem, Google has announced a Hybrid SDK. This upcoming extension to the Firebase Web SDK aims to use built-in APIs locally when available and fall back to server-side Gemini otherwise, with a developer preview planned (see https://goo.gle/hybrid-sdk-developer-preview for more information). This initiative should provide a more consistent development experience and wider reach for AI-powered features.
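The Hybrid SDK’s API hasn’t been published yet, so the snippet below is not its interface; it’s just a sketch of the same fallback pattern hand-rolled for an Add-on sidebar today, assuming a hypothetical serverPrompt Apps Script function that calls a cloud-hosted Gemini model:

```
// Feature-detect the built-in Prompt API; otherwise fall back to a
// server-side call. 'serverPrompt' is a hypothetical google.script.run
// handler, not part of any announced SDK.
async function smartPrompt(text) {
  if ('LanguageModel' in self &&
      (await LanguageModel.availability()) === 'available') {
    const session = await LanguageModel.create();
    return session.prompt(text); // on-device Gemini Nano
  }
  return new Promise((resolve, reject) => {
    google.script.run
        .withSuccessHandler(resolve)
        .withFailureHandler(reject)
        .serverPrompt(text); // cloud fallback via Apps Script
  });
}
```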

A Glimpse into the Future: Empowering Workspace Innovation

On-device AI opens new opportunities for privacy-centric, responsive, and cost-effective Add-on features. While the demonstrator Add-on focuses on text generation, the Prompt API and the broader suite of on-device AI tools in Chrome offer much more for developers to explore.

Focusing on Unique Value for Workspace Add-ons

It’s important for developers to consider how these on-device AI capabilities—be it advanced text processing or new multimodal interactions supporting audio and image inputs from Chrome 138 Canary—can be used to extend and enhance the user experience in novel ways, rather than replicating core Gemini for Google Workspace features. The power lies in creating unique, value-added functionalities that complement native Workspace features.

Explore, Experiment, and Provide Feedback!

This journey into on-device AI is a collaborative one, and Google Workspace developers have an opportunity to help shape it.

  1. Explore the Demo: Dive into the Nano-Prompt-AI-Demo GitHub repository to see the text-to-diagram features in action.
  2. Try It Out: Follow setup instructions to experience on-device AI with the demo, and consider exploring multimodal capabilities for your own projects by referring to the latest Early Preview Program updates.
  3. Provide Feedback: Share your experiences either about the example add-on or through the Early Preview Program.

I hope you have as much fun working with these APIs as I have and look forward to hearing how you get on. Happy Scripting!

Agent2Agent (A2A) Communication in Google Workspace with Apps Script

 

Exploring an Agent2Agent (A2A) protocol implementation in Google Apps Script that seamlessly allows AI agents to access Google Workspace data and functions. This could enable complex workflows and automation, overcoming platform silos for integrated AI applications.

The buzz around AI and collaborative agent systems is growing, and with it, the need for standardized communication. The recently introduced Agent2Agent (A2A) protocol aims to enable exactly that – allowing different AI agents, regardless of their underlying framework, to discover each other’s capabilities and work together. But how can the Google Workspace developer community tap into this exciting new frontier?

Enter prolific Apps Script contributor Kanshi Tanaike, who has once again provided a valuable resource for the community. Kanshi has published a detailed guide and a Google Apps Script library demonstrating how to build an Agent2Agent (A2A) server using Google Apps Script.

Why is this significant?

Kanshi’s work explores the feasibility of implementing a core A2A server component directly within the Google Workspace ecosystem. As Kanshi notes, “Such an implementation could seamlessly allow AI agents to securely access and utilize data and functionalities across Google services like Docs, Sheets, and Gmail via a standardized protocol.” This opens up possibilities for:

  • Integrating AI agents with your Google Workspace data: Imagine AI agents that can directly interact with your Sheets, Docs, Drive files, or Gmail.
  • Building sophisticated AI-powered workflows: Automate complex tasks that span across different services, orchestrated by communicating AI agents.

Kanshi’s guide walks through the setup and the implementation of functions that allow AI agents to interact with Google Workspace services like Drive, or with external APIs via UrlFetchApp. This work significantly lowers the barrier to entry for Google Workspace developers looking to experiment with and implement A2A-compatible agents, and as such I highly encourage you to check out the linked source post for the complete guide, code, and library details.
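To give a flavour of what an A2A server in Apps Script involves, here is a simplified sketch (not Kanshi’s library code; field names loosely follow the public A2A spec, and the spreadsheet skill is invented for illustration). A web app serves an agent card for discovery and handles JSON-RPC style task requests:

```
function doGet(e) {
  // Serve a simplified agent card so client agents can discover this server.
  const agentCard = {
    name: 'Workspace Agent',
    description: 'Reads data from Google Sheets on request.',
    url: ScriptApp.getService().getUrl(),
    version: '1.0.0',
    skills: [{ id: 'read_sheet', name: 'Read spreadsheet values' }]
  };
  return ContentService.createTextOutput(JSON.stringify(agentCard))
      .setMimeType(ContentService.MimeType.JSON);
}

function doPost(e) {
  // Handle a JSON-RPC style task request from a client agent.
  const request = JSON.parse(e.postData.contents);
  const value = SpreadsheetApp.openById('SHEET_ID') // placeholder ID
      .getSheets()[0].getRange('A1').getValue();
  return ContentService.createTextOutput(JSON.stringify({
    jsonrpc: '2.0',
    id: request.id,
    result: { parts: [{ type: 'text', text: String(value) }] }
  })).setMimeType(ContentService.MimeType.JSON);
}
```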

Source: Building Agent2Agent (A2A) Server with Google Apps Script

Programmatically Iterating on Images with Imagen 3, Vertex AI, and Apps Script

Use Gemini and Imagen 3 on Vertex AI to create the images you envision. Generate tailored images from reference sources with Apps Script.

Have you ever generated an image using Gemini in Google Workspace and wished you could easily tweak or iterate on it? While Gemini for Workspace is great for initial creation, iterating on those images directly isn’t currently straightforward. A recent post by Stéphane Giron highlights a programmatic approach using Google Apps Script, Vertex AI (with Imagen 3 and Gemini models) to achieve the goal of generating image variations based on a source image.

Stéphane’s method takes a source image (which could be one previously generated or found elsewhere) and provides it to the Vertex AI API for Gemini (e.g., gemini-2.0-pro) along with text instructions for the required changes. The Gemini model analyzes the image and the request to generate a new prompt. This new prompt is then used with the Imagen 3 model (e.g., imagen-3.0-generate-001) via Vertex AI to generate the final image variation.
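As a rough illustration of that two-step flow (a sketch under assumptions, not Stéphane’s actual code: PROJECT_ID and the region are placeholders, the manifest needs the cloud-platform OAuth scope for ScriptApp.getOAuthToken(), and error handling is omitted), the Apps Script might look like this:

```
// Step 1: Gemini rewrites the source image + instructions as a new prompt.
// Step 2: Imagen 3 renders that prompt. Model names follow those cited in
// the post and may differ in your project.
function generateVariation(imageBase64, instructions) {
  const base = 'https://us-central1-aiplatform.googleapis.com/v1/projects/' +
      'PROJECT_ID/locations/us-central1/publishers/google/models/';
  const headers = { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() };

  const geminiRes = JSON.parse(UrlFetchApp.fetch(
      base + 'gemini-2.0-pro:generateContent', {
        method: 'post',
        contentType: 'application/json',
        headers: headers,
        payload: JSON.stringify({
          contents: [{
            role: 'user',
            parts: [
              { inlineData: { mimeType: 'image/png', data: imageBase64 } },
              { text: 'Write an image-generation prompt reproducing this ' +
                      'image with these changes: ' + instructions }
            ]
          }]
        })
      }).getContentText());
  const newPrompt = geminiRes.candidates[0].content.parts[0].text;

  const imagenRes = JSON.parse(UrlFetchApp.fetch(
      base + 'imagen-3.0-generate-001:predict', {
        method: 'post',
        contentType: 'application/json',
        headers: headers,
        payload: JSON.stringify({
          instances: [{ prompt: newPrompt }],
          parameters: { sampleCount: 1 }
        })
      }).getContentText());
  return imagenRes.predictions[0].bytesBase64Encoded; // base64 image data
}
```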

It’s interesting to contrast this with another solution we’ve previously featured from Kanshi Tanaike on ‘Iterative Image Generation with the Gemini API and Google Apps Script‘. While Tanaike’s method uses the Gemini API’s chat history to iteratively build an image from sequential text prompts, Stéphane’s focuses on reinterpreting a source image with specific modifications via a newly generated prompt.

You can check out Stéphane Giron’s source post for the complete code and setup instructions.

Source: Similar Image Generation with Imagen 3 on Vertex AI with Google Apps Script

Google Workspace Flows: A Developer’s First Look After Cloud Next ’25

Image credit: Google

Last week at Google Cloud Next ’25 was packed with announcements, but one that particularly grabbed my attention was the unveiling of Google Workspace Flows. As Google Apps Script developers, many of us are familiar with automating simple tasks. However, the Flows demo hinted at a more accessible approach for tackling complex business processes.

Think about those workflows that go beyond basic if-this-then-that and into a world where you can easily configure Gemini to be your virtual assistant: updating specific spreadsheet entries based on nuanced analysis, or finding and summarising information scattered across different files before replying to a customer. Traditional automation often hits a wall here because these tasks require context, reasoning, and sometimes even creative generation – capabilities standard automation tools lack.

Workspace Flows: AI Agents Joining the Workflow

What Google presented with Flows is a new solution for Google Workspace designed to automate these kinds of multi-step processes with AI providing assistance. Instead of just triggering actions, Flows uses AI models, including Gemini, as agents within the loop. This means the AI isn’t just kicking off a process; it’s actively participating – researching, analysing, creating, and reasoning to help get work done more efficiently and intelligently.

Having used tools like IFTTT, I feel this is a conceptual shift from simple automation to building intelligent, agentic workflows directly within the Workspace environment, without writing a single line of code.

Image credit: Google

If you would like to see Flows in action I highly recommend you check out the Google Cloud Next session ‘New ways to automate your work and integrate with Google Workspace’.

Why Flows Looks Interesting for Developers: Current Capabilities and Future Vision

Beyond the core AI capabilities shown in the demo, Google outlined a vision for Flows that is particularly relevant for developers, indicating where the platform is heading:

  • No-Code/Low-Code Interface (Current): The initial preview allows configuring triggers (like new emails, form responses, etc.) and actions across core Workspace apps, and integrating Gemini and Gems for AI-driven steps, removing the need for alternative approaches (code you write or third-party tools).
  • Apps Script Extensibility (Future Vision): Google announced plans to allow developers to build their own custom triggers and actions using Apps Script. This creates the opportunity to integrate your own systems or add specific logic to get more out of Flows. The presentation briefly showed an example appsscript.json manifest snippet for declaring these elements (around 13:08).
  • Workspace Connectors Platform (Future Vision): A dedicated platform for third-party integrations was also announced in the presentation as part of the roadmap. The plan is to enable connections to tools like Jira, Asana, Salesforce, HubSpot, etc., allowing them to be used as triggers or actions. The stated goal is to include the ability to build end-to-end workflows spanning beyond Google Workspace, with connectors built once potentially working in both Flows and Gemini Extensions.
  • Bring Your Own Models via Vertex AI (Future Vision): For advanced AI needs, Google shared the vision for integrating your own custom or fine-tuned models hosted on Google Cloud’s Vertex AI. The concept shown involved an ‘Ask an LLM’ step where you could select a ‘Custom Model’ directly within the flow builder, pointing towards future capabilities for incorporating highly specialized AI into Workspace automations.

Looking Ahead

Google Workspace Flows is definitely a platform I’ll be watching closely. The initial preview, focusing on AI agents-in-the-loop and core Workspace automation, is already compelling. But the announced roadmap for developer extensibility – adding Apps Script support, a robust connector platform, and the ability to call custom Vertex AI models – is what makes Flows truly exciting from a development perspective.

Flows is currently in alpha with its initial feature set. If this vision sounds as interesting to you as it did to me, you can sign up for the early access waitlist.

New Feature: Get Apps Script Code Directly from Google AI Studio

Getting started with generative AI in your Google Apps Script projects just got a whole lot easier! Google AI Studio has introduced a handy new feature allowing you to directly export your AI interactions as ready-to-use Apps Script code. If you’re new to Apps Script or integrating AI, this is a fantastic way to quickly add powerful features to your automations. Here’s how you can grab the code:

  1. Click the Get code icon (</>) above the chat prompt.
  2. In the ‘Get code’ window, click the language dropdown (this might initially show ‘Python’).
  3. Select Apps Script from the dropdown list.
  4. Click the Copy button to copy the generated Apps Script code to your clipboard.

If you are using AI Studio with your enterprise data, make sure you’re using a billable account so that your data is protected. This addition is perfect for rapid prototyping and understanding the basic API interaction. However, for applications needing features essential for production environments, such as robust error handling with exponential back-off, you might want to look at my open source GeminiApp library.

Iterative Image Generation with the Gemini API and Google Apps Script

Image credit: Kanshi Tanaike

Gemini API now generates images via Flash Experimental and Imagen 3. This report introduces image evolution within conversations using Gemini API with Google Apps Script.

The Gemini API recently gained the ability to generate images. Taking this a step further, Kanshi Tanaike has explored how to create evolving images within a conversation using Google Apps Script.

Often, you might want to generate an image and then iteratively add or modify elements in subsequent steps. Kanshi’s approach cleverly uses the chat functionality of the Gemini API (gemini-2.0-flash-exp model). By sending prompts sequentially within a chat, the API uses the conversation history, allowing each new image to build upon the previous one. This enables the generation of images that evolve step-by-step based on your prompts, as demonstrated in the original post with examples like drawing successive items on a whiteboard.
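In outline, the chat-history technique looks something like the sketch below (my simplification, not Kanshi’s sample code; the endpoint and response shapes follow the Gemini API documentation at the time of writing). Each call resends the accumulated history so the model can build on the previous image:

```
const API_KEY = 'YOUR_API_KEY'; // AI Studio key (placeholder)
const URL = 'https://generativelanguage.googleapis.com/v1beta/models/' +
    'gemini-2.0-flash-exp:generateContent?key=' + API_KEY;
const history = []; // conversation history within this execution

function nextImage(promptText) {
  history.push({ role: 'user', parts: [{ text: promptText }] });
  const res = JSON.parse(UrlFetchApp.fetch(URL, {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify({
      contents: history, // full history lets each image build on the last
      generationConfig: { responseModalities: ['TEXT', 'IMAGE'] }
    })
  }).getContentText());
  const parts = res.candidates[0].content.parts;
  history.push({ role: 'model', parts: parts }); // remember the model turn
  const imagePart = parts.find(function (p) { return p.inlineData; });
  return Utilities.newBlob(
      Utilities.base64Decode(imagePart.inlineData.data),
      imagePart.inlineData.mimeType, 'image.png');
}
```

Under this scheme, calling nextImage('draw a whiteboard') and then nextImage('add a red circle') would return an updated version of the first image rather than a fresh, unrelated one.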

This technique is particularly useful because, as noted in the post, using chat history provides better results for this kind of sequential image generation compared to generating images from isolated prompts.

Kanshi Tanaike’s original post includes a detailed explanation, setup instructions (including API key usage and library installation), and complete sample code snippets that you can adapt for your own Google Workspace projects.

Source: Generate Growing Images using Gemini API

Unlock Your Productivity Potential with Gemini and Google Calendar (and other Workspace APIs)


As a Google Workspace Developer Advocate, I’m always exploring innovative ways to leverage technology. I’m thrilled to share my new Jupyter Notebook that showcases the power of Gemini and the Google Calendar API for productivity coaching. 🚀✨

This notebook dives into:
– Using Gemini’s multimodal capabilities to analyze calendar data. 📅
– Leveraging function calling to connect Gemini with Google Workspace APIs. 🔗
– Developing a personalized AI productivity coach. 🧑‍🏫

It’s amazing to see how generative AI can transform the way we work and optimize our time. ⏳

Explore the notebook on GitHub and discover how to build your own AI-powered productivity tools! 🛠️

👉 Link: https://lnkd.in/gpTxMMAy

Are you struggling to manage your time effectively? Mohammad Al-Ansari, Google Developer Advocate, has recently shared how the Gemini API can act as a personal productivity coach. The solution uses a Google Colab notebook connected to the Google Calendar API, with the Gemini API providing personalized insights and recommendations to boost your productivity and improve your work/life balance. For those who don’t know, Google Colab is a free cloud-based platform for running Jupyter Notebooks, which are interactive coding environments for writing and executing code, and a useful tool to have in the toolbox when exploring data.

Some of the key features in Mohammad’s ‘Productivity Coach’ are:

  • Function Calling: This notebook uses Gemini’s ‘function calling’ capabilities, allowing it to dynamically interact with the Google Calendar API and retrieve real-time data. This ensures that the analysis and recommendations are always up-to-date and relevant to your current schedule (see the sketch after this list).
  • Google Workspace Integration: By integrating with the Google Calendar API, this notebook shows how the Gemini API can be seamlessly integrated with other Google Workspace services. This opens up exciting possibilities for quickly experimenting with other Workspace data sources, such as Google Docs, Sheets or Drive.
  • Personalized Coaching: As a bonus, you can see how Gemini can act as your personal productivity coach, so if nothing else you can get some tailored guidance and support based on your own calendar.
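Mohammad’s notebook is Python, but the same function-calling idea translates to Apps Script. Below is a hedged sketch of the request shape: getEventsForDay is a hypothetical declaration, and the matching data lookup would be done with CalendarApp before returning the result to Gemini in a follow-up turn:

```
// Declare a function Gemini may ask us to call. Names and descriptions
// here are hypothetical, for illustration only.
const tools = [{
  functionDeclarations: [{
    name: 'getEventsForDay',
    description: 'Returns the Google Calendar events for a given date.',
    parameters: {
      type: 'OBJECT',
      properties: {
        date: { type: 'STRING', description: 'Date in YYYY-MM-DD format.' }
      },
      required: ['date']
    }
  }]
}];

const payload = {
  contents: [{ role: 'user', parts: [{ text: 'How busy am I tomorrow?' }] }],
  tools: tools
};
// If the response contains a functionCall part, run the real lookup
// (e.g. CalendarApp.getDefaultCalendar().getEventsForDay(...)) and send
// the result back as a functionResponse part so Gemini can use it.
```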

Check out the notebook to explore!

Source: generative-ai/gemini/use-cases/productivity/productivity_coaching_with_google_calendar.ipynb at main · GoogleCloudPlatform/generative-ai

‘AI Agents’ in Google Apps Script: Automate Google Workspace with Natural Language

Imagine that you write in plain English what you want to do in Google Workspace (e.g. workflows) and it happens just like magic. Insert a text prompt, Gemini will generate the code for you and run it immediately. A dream? No, reality, thanks to my conceptual and practical idea of how to implement AI Agents in Google Apps Script to leverage the V8 runtime.

Ivan Kutil has explored the concept of AI Agents in Google Apps Script, enabling Google Workspace automation via plain English descriptions. Users describe their automation needs in natural language, which is then processed by the Gemini API to generate the necessary code. The generated code is then executed in your Google Apps Script project.

Ivan’s solution uses the gemini-2.0-flash-thinking-exp-01-21 model, an experimental model within Vertex AI specifically designed to reveal its ‘thinking process’, resulting in more reliable code generation. The enhanced reasoning capabilities of this model are particularly beneficial for complex automation tasks, making it a powerful tool for Google Workspace customisation.

To ensure the agent is doing the right thing, the clever bit is that you can test the execution via a dry-run: the code created with Gemini Flash Thinking is sent to an internal ‘Tester’ agent, which uses Gemini to comment on the code and summarise it in a log. It’s important to review the script before running it, as Ivan accepts no responsibility for the results of the script. Another nice feature is that the generated code is stored in the cache, so after running a dry-run and then a run, the same version will be executed within the cache limit (currently set to 5 minutes).
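As an illustration of that cache-then-run pattern, here is a sketch (not Ivan’s actual implementation; the two askGemini* helpers are hypothetical stand-ins for his Gemini calls):

```
// Sketch of the cache-then-run pattern (not Ivan's actual code).
// askGeminiForCode and askGeminiToReviewCode are hypothetical helpers.
function runAgent(taskDescription, dryRun) {
  const cache = CacheService.getScriptCache();
  const key = Utilities.base64Encode(Utilities.computeDigest(
      Utilities.DigestAlgorithm.MD5, taskDescription));
  let code = cache.get(key);
  if (!code) {
    code = askGeminiForCode(taskDescription);
    cache.put(key, code, 300); // keep for 5 minutes, matching the post
  }
  if (dryRun) {
    // 'Tester' agent: ask Gemini to review and summarise the code.
    Logger.log(askGeminiToReviewCode(code));
    return;
  }
  // The V8 runtime can execute the generated source. Review it first!
  return new Function(code)();
}
```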

This solution, which mirrors Gemini for Workspace’s ability to generate and execute basic Python code, suggests a future where Gemini for Workspace could write and execute Apps Script code for basic tasks. This has the potential to transform how users interact with and automate their Google Workspace environments.

If you are interested in a version of Ivan’s solution that incorporates my GeminiApp library, follow this link. For additional information on Ivan’s solution including setup instructions follow the source link.

Source: Create AI agents in Google Apps Script with Vertex AI and Gemini

How Apps Script Became the Ultimate LLM Fine-Tuning Tool

If you have domain-specific knowledge that you want an LLM to leverage, you probably have a use case for fine-tuning. Fine-tuning can significantly improve how well the model understands and responds to your queries, whether it’s legal documents, medical texts, financial reports, or niche industry data.

The most crucial step in this process is structuring your data correctly. If your dataset is well-organized and formatted properly, the rest of the workflow becomes much more manageable. From there, it’s just a matter of setting up a few configurations and automating parts of the process with Apps Script. That’s where things get interesting and surprisingly efficient.
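As a rough example of what “structured correctly” can mean in practice, supervised tuning for Gemini on Vertex AI expects JSONL training data, one example per line, each holding paired user/model turns. The shape below reflects the Vertex AI documentation at the time of writing, and the legal-text example is invented:

```
{"contents": [{"role": "user", "parts": [{"text": "What does clause 4.2 of our standard contract cover?"}]}, {"role": "model", "parts": [{"text": "Clause 4.2 covers termination rights, including the required notice period."}]}]}
```

Assembling thousands of lines like this from source documents is exactly the kind of repetitive work Apps Script is well suited to automate.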

Source: How Apps Script Became the Ultimate LLM Fine-Tuning Tool

Automating Data Extraction with the Gemini API Controlled Generation and AppSheet’s New Gmail Integration

Tired of manually processing invoices? I recently built a demo that automates this tedious task using the power of Gemini, AppSheet’s new Gmail integration, and a custom Apps Script library. Let me show you how it works!

Here’s the setup:

  • Gmail Integration: AppSheet’s new Gmail integration allows you to trigger automations based on incoming emails. I’ve set it up to monitor a specific Gmail label called “Invoices”.
  • Apps Script Library: My “GeminiApp” library (available on GitHub) simplifies the interaction with Google’s Gemini AI models directly from Apps Script. This library handles the heavy lifting of making API calls and managing responses.
  • Controlled Generation: Gemini’s “Controlled Generation” feature lets me define a JSON schema that structures the AI’s output. This is key for extracting invoice data in a consistent, machine-readable format.

The Workflow:

  1. Invoice Received: When an invoice email arrives and is labelled “Invoices”, AppSheet’s Gmail integration kicks in.
  2. Automation Triggered: AppSheet triggers an automation that calls a custom Apps Script function called jsonControlledGeneration.
  3. Data Extraction: The jsonControlledGeneration function uses the GeminiApp library to send the email body to Gemini with a predefined JSON schema for invoice data.
  4. Structured Output: Gemini processes the email content and extracts the relevant invoice details (e.g., invoice number, supplier name, date, amount) in a JSON format that adheres to the schema.
  5. Downstream Processing: The structured JSON output can then be easily returned to the AppSheet automation for further actions, such as automatically populating your data table, updating a database, or triggering a payment process.


Want to try it yourself?

To set this demo up you will either need a Google AI Studio API key or a Google Cloud project with Vertex AI enabled. Information on both these setups is included in the GeminiApp Setup Instructions. Once you have these you can follow these steps:

  1. Open the GeminiApp Controlled Generation Google Apps Script project and click ‘Make a copy’ from the Overview page
  2. Open the Invoice Tracker template and click ‘Copy and Customize’, then click OK on the ‘Error Creating App: Apps Script function is invalid’ error
  3. Navigate to appsheet.com and from your ‘recent’ apps open the ‘Invoice Tracker’ app
  4. Open Automations and for the ‘New Invoices’ event under the Gmail Event Source, click Authorize
  5. In ‘Add a new data source’, enter a name (e.g. Invoices Trigger), click the Gmail button and follow the authentication steps
  6. Once complete, in the AppSheet Automation settings select your Gmail account and a Label to watch
  7. In the Process section, click on the GeminiApp task and click on the blue file icon, then select your copied version of the Apps Script project and click Authorize
  8. Once authorized, from the Function Name dropdown select jsonControlledGeneration

To test this app, you can copy and send this example invoice.

Step 7: Click on the blue file icon, then select your copied version of the Apps Script project

The Power of Controlled Generation

Controlled Generation is a powerful way to extract information from unstructured data like emails. By defining a JSON schema, I can specify exactly what information I want Gemini to extract and how it should be structured. This ensures that the output is consistent and reliable, eliminating the need for manual cleanup or post-processing.

Here’s an example of a simple JSON schema for invoice data:

 const schema = {
    "type": "object",
    "properties": {
      "Invoice Reference": {
        "type": "string",
        "description": "Unique identifier for the invoice"
      },
      "Supplier Name": {
        "type": "string",
        "description": "Name of the supplier"
      },
      "Invoice Date": {
        "type": "string",
        "description": "Date the invoice was issued",
        "format": "date"
      },
      "Due Date": {
        "type": "string",
        "description": "Date the invoice is due",
        "format": "date"
      },
      "Invoice Amount": {
        "type": "number",
        "description": "Total amount due on the invoice"
      },
      "Notes": {
        "type": "string",
        "description": "Additional notes related to the invoice",
        "nullable": true
      }
    },
    "required": ["Invoice Reference"]
  }
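For context, here is a hedged sketch of how a schema like this can be supplied to the Gemini API. It calls the REST endpoint directly rather than going through the GeminiApp library (whose exact method names aren’t shown in this post); emailBody and API_KEY are placeholders:

```
// Ask Gemini to return JSON that conforms to the schema defined above.
const payload = {
  contents: [{ role: 'user', parts: [{ text: emailBody }] }],
  generationConfig: {
    responseMimeType: 'application/json', // force JSON output
    responseSchema: schema                // the schema from above
  }
};
const res = UrlFetchApp.fetch(
    'https://generativelanguage.googleapis.com/v1beta/models/' +
    'gemini-1.5-flash:generateContent?key=' + API_KEY,
    { method: 'post', contentType: 'application/json',
      payload: JSON.stringify(payload) });
// The model's reply is itself a JSON string matching the schema.
const invoice = JSON.parse(
    JSON.parse(res.getContentText()).candidates[0].content.parts[0].text);
Logger.log(invoice['Invoice Reference']);
```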

Creating JSON Schemas with Gemini

Creating JSON schemas can seem a bit daunting at first, but Gemini can actually help you with that too! If you have sample data in a Google Sheet, you can use the Gemini Side Panel to generate a schema automatically. Just highlight the data and ask Gemini to create a JSON schema for you. You can even provide a description for each property to make your schema more understandable. Below is a prompt you can use in the Gemini Sheet Side Panel to generate a schema for your own data:

I'm using Controlled Generation with the Gemini API as described in the document https://drive.google.com/file/d/1ETKHlEUDQzJ-f2fmAzsuDjcwdt1D7R2y/view?usp=drive_link

I need help creating a JSON schema to capture data from a screen.

Could you generate a JSON schema suitable for using Controlled Generation with the Gemini API? I want to extract specific information from what's displayed on my screen.

Here are my requirements:

* **Comprehensive Schema:** The schema should be designed to capture a variety of relevant data points from the screen.  
* **Detailed Descriptions:** Please include a clear and concise \`description\` for each \`property\` in the schema. This will help me understand the purpose of each field.  
* **Format Specification:** If any columns contain date or datetime data, please use the \`format\` field to specify the appropriate format (e.g., "date", "date-time"). This is crucial for accurate data parsing.  
* **Output Example:** Please provide the schema in the following format:

```
const schema = {
  description: "Description of the data",
  type: "array", // or "object", depending on the structure
  items: { // If type is array
    type: "object",
    properties: {
      propertyName: {
        type: "string", // or other appropriate type
        description: "Description of the property",
        format: "date", // or "date-time", if applicable
        nullable: false, // or true
      },
      // ... more properties
    },
    required: ["propertyName"], // If any properties are required
  },
  properties: { // If type is object
      propertyName: {
        type: "string", // or other appropriate type
        description: "Description of the property",
        format: "date", // or "date-time", if applicable
        nullable: false, // or true
      },
      // ... more properties
    },
    required: ["propertyName"], // If any properties are required
};
```

Limitations and future developments

While the beta Gmail integration in AppSheet marks a significant new feature, it’s important to note that a current limitation is the lack of support for processing email attachments. Currently, the integration focuses on metadata such as sender name, subject, and message body, but the AppSheet team have acknowledged attachment support will be added in the near future.

Looking ahead, at Google Cloud Next 2024 the AppSheet team announced an upcoming “Gemini Extract” feature, currently in private preview. This feature is intended to provide a native Gemini ‘controlled generation’ capability that lets app creators select the data fields they would like populated from sources including images and text. This should be a more intuitive approach to data extraction, directly integrating Gemini capabilities into AppSheet. The Next video includes a Google URL to sign up for the Gemini Extract preview: https://goo.gle/appsheet-previews?r=qr

Summary

The Invoice Tracker example hopefully highlights the opportunity for streamlined data extraction solutions with AppSheet’s Gmail integration, Gemini, and Apps Script. The GeminiApp library also simplifies the integration of Google’s Gemini AI models into Google Workspace, providing developers with tools to create sophisticated AI-powered applications.

Using structured JSON output with Controlled Generation helps AppSheet creators by making it easier to ensure the data comes back in a suitable format, including the type of data you need, such as dates. With the GeminiApp library, rapid prototyping is achievable: in the ‘Invoice Tracker’ example I was able to get a functional prototype up and running in under 30 minutes.

AppSheet’s Gmail integration, generally available to all AppSheet and Google Workspace Core users, can trigger automations directly from incoming emails without requiring app deployment. Combined with Apps Script functions, this opens the door to some powerful opportunities for AppSheet creators. Integrating Gemini-powered AI extraction with AppSheet and Apps Script provides an innovative solution for automating data extraction from emails, and by taking advantage of these capabilities, citizen developers can create efficient and user-friendly solutions.