Nine years ago, I wrote an article describing how to scrape data from the internet in about five minutes. It featured a Google Apps Script library that allowed you to specify what to download from a webpage by identifying the text surrounding your target information. This became one of my most-read articles, and the library itself saved me a significant amount of time.
With the advent of large language models like Gemini, this entire paradigm is shifting. Just as “vibe coding” describes a more intuitive approach to programming, I’d say we’re now entering an era of “vibe scraping.”
Using data from the web has long been a staple of working with spreadsheets. But what if you could just describe the information you want and have an AI fetch it for you?
Google Developer Expert Ivan Kutil introduces a modern approach he calls “vibe scraping.” In his latest post, he demonstrates a powerful method that combines Google Apps Script with the Gemini API to pull structured data from any webpage directly into a Google Sheet.
A Practical Example: From Movie Links to a Full Database
To understand the power of this technique, Ivan set out to solve an old problem. He had a Google Sheet with a list of movies, each with a URL pointing to its page on ČSFD (a popular Czech-Slovak film database). His goal was to automatically populate columns with details like genre and rating by extracting the data directly from those linked pages.
At the heart of his solution is a new, experimental Gemini API feature called the URL Context tool. As the documentation notes, this tool is surprisingly flexible: you give Gemini URLs as additional context for your prompt, and the model “can then retrieve content from the URLs and use that content to inform and shape its response.”
The tool can be used in two main ways:
URL Context “By Itself”: In this mode, the tool directly fetches and processes the content from the URL you provide. This is a straightforward way to analyze the live content of a single page.
URL Context with Google Search Grounding: In this mode, the tool can read content from a URL and then use Google Search for a more in-depth understanding.
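As a rough sketch of how the two modes differ at the request level, the function below builds a generateContent request body with the URL Context tool enabled and, optionally, Google Search alongside it. The `url_context` and `google_search` tool names follow the Gemini REST documentation as I understand it and may change while the feature is experimental. The function name `buildRequestBody`, the model name and the `apiKey` variable in the comment are assumptions for illustration; none of this is taken from Ivan's script.

```javascript
// Hypothetical helper: builds a Gemini generateContent request body.
// The URL to read is simply mentioned in the prompt text; enabling the
// url_context tool lets the model fetch it.
function buildRequestBody(prompt, useSearch) {
  const tools = [{ url_context: {} }];
  if (useSearch) {
    // Second mode: URL Context combined with Google Search grounding.
    tools.push({ google_search: {} });
  }
  return {
    contents: [{ parts: [{ text: prompt }] }],
    tools: tools,
  };
}

// In Apps Script the body could then be sent along these lines
// (assumed model name; apiKey is a placeholder for your own key):
//
// const response = UrlFetchApp.fetch(
//   'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent',
//   {
//     method: 'post',
//     contentType: 'application/json',
//     headers: { 'x-goog-api-key': apiKey },
//     payload: JSON.stringify(buildRequestBody(prompt, false)),
//   }
// );
```

Passing `false` as the second argument gives the “by itself” mode; passing `true` adds Google Search as a second tool in the same request.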
What makes Ivan’s script so elegant is how it combines this tool with the simplicity of Google Sheets. For each row, the script constructs a prompt for the Gemini API, asking it to analyse the content of the provided URL and find the specific information defined in the spreadsheet headers.
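To make the idea concrete, here is a minimal sketch of the per-row step, assuming the header names are used verbatim as field names and the model is asked to reply with JSON. Both helpers (`buildPrompt` and `toRowValues`) and the prompt wording are hypothetical, written for this illustration rather than copied from Ivan's code.

```javascript
// Hypothetical helper: turns the spreadsheet headers into a prompt for one URL.
function buildPrompt(headers, url) {
  const fields = headers.map((h) => '- ' + h).join('\n');
  return [
    'Read the page at ' + url + ' and find the following fields:',
    fields,
    'Reply with a single JSON object whose keys are exactly these field names.',
    'Use null for anything the page does not state.',
  ].join('\n');
}

// Hypothetical helper: orders the model's JSON reply to match the headers,
// so the values can be written back into the row.
function toRowValues(headers, replyText) {
  // The reply may contain text around the JSON, so take the outermost braces.
  const start = replyText.indexOf('{');
  const end = replyText.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in reply');
  }
  const data = JSON.parse(replyText.slice(start, end + 1));
  // Missing or null fields become empty cells.
  return headers.map((h) =>
    data[h] === undefined || data[h] === null ? '' : data[h]
  );
}
```

In a sheet, `headers` would come from the first row and `url` from the link column of the current row. Because the prompt is rebuilt from the headers every time, renaming a column changes what the model is asked to find.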
The result is a system where modifying your data query is as easy as renaming a column. This turns a potentially complex scraping task into a simple spreadsheet edit, allowing anyone to create powerful, custom data importers.
To get the full details and the code to build this yourself, check out Ivan’s original post.