Gemini in Google Sheets: Automating Messy Data Cleanup Without Formulas

May 19, 2026 8 min read 38 views
Flat illustration of a digital spreadsheet with messy data cells being organized into clean structured rows on a soft blue background

You open a spreadsheet someone exported from the CRM and it looks like a crime scene. Names in ALL CAPS, phone numbers in four different formats, addresses split across one column or crammed into three, and a "Status" column where someone typed "active", "Active", "ACTIVE", and "yes" interchangeably. Cleaning this by hand takes hours. Writing formulas for every edge case takes almost as long.

Gemini, built directly into Google Sheets, gives you a faster path. You describe what you want in plain English, and it generates the transformation β€” or in many cases applies it outright. Here is how to actually use it on real-world messy data.

What you'll learn

  • How to access and prompt Gemini from inside Google Sheets
  • How to standardize inconsistent text, dates, and phone numbers
  • How to extract structured values from freeform cells
  • How to bulk-apply AI-suggested formulas across thousands of rows
  • Common mistakes that make Gemini less useful and how to avoid them

Prerequisites

You need a Google account with access to Google Workspace (personal Google accounts with Gemini access also qualify). The Gemini sidebar in Sheets is available on Workspace plans that include Gemini, or through the Google One AI Premium plan. Make sure you are using Google Chrome or a Chromium-based browser for the best experience. No coding knowledge is required.

Opening Gemini in Google Sheets

Open any Google Sheet and look for the Ask Gemini button in the top-right corner of the toolbar β€” it looks like a small star or sparkle icon. Clicking it opens a sidebar panel where you can type natural-language instructions.

Alternatively, go to Extensions > Gemini if your Workspace admin has deployed it through that path. The sidebar stays open while you work, so you can keep asking follow-up questions as you refine the output.

One important thing to know upfront: Gemini can either generate a formula you paste into a column, or it can suggest an action it applies directly. Which mode it uses depends on how you phrase the request. Both are useful, and you will use both in a real cleanup workflow.

Standardizing Inconsistent Text Values

The classic problem: a "Status" column with a dozen variations of the same three values. You want every cell to read exactly "Active", "Inactive", or "Pending" β€” nothing else.

Select the column, open the Gemini sidebar, and type something like:

Column C contains status values entered inconsistently β€” things like "active", "ACTIVE", "yes", "inactive", "no", "pending", "pndng". Standardize everything to one of three values: Active, Inactive, or Pending. Map "yes" to Active and "no" to Inactive. Write a formula I can put in a helper column.

Gemini will produce a nested IF or IFS formula using LOWER() and string matching. You drop it in column D, verify a sample of rows, then paste-as-values back into column C and delete the helper column. The whole process takes under two minutes regardless of row count.

For name casing β€” all-caps or all-lowercase names β€” the prompt is even simpler:

Column A has names in inconsistent casing. Write a formula to convert them to proper case (first letter of each word capitalized).

Gemini will give you =PROPER(A2) along with a note about edge cases like "McDonald" or "O'Brien" where PROPER falls short. That caveat is genuinely useful β€” it tells you which rows to review manually.

Cleaning and Formatting Phone Numbers

Phone number columns are almost always a mess in exported data. You get entries like 555-123-4567, (555) 123 4567, 5551234567, +1-555-123-4567, and the occasional 555.123.4567. Your target format is (555) 123-4567.

Try this prompt:

Column B contains US phone numbers in mixed formats. Strip all non-digit characters, then reformat to (XXX) XXX-XXXX. Write a formula for a helper column.

Gemini typically returns something combining REGEXREPLACE to strip non-digits and then string slicing with MID, LEFT, and RIGHT to rebuild the format. Here is what the output often looks like:

=TEXT(REGEXREPLACE(B2,"[^0-9]",""),"(###) ###-####")

This works for 10-digit numbers. For numbers that come in with a leading country code (+1), Gemini will usually handle that too if you mention it explicitly in the prompt. Specificity in your prompts saves you a second round of fixes.

Extracting Values from Freeform Text

This is where Gemini earns its keep. Say you have an "Address" column with entries like 123 Main St, Springfield, IL 62701 and you need city and state in separate columns. Or a "Notes" field with sentences like "Follow up by March 15 β€” client interested in enterprise plan" and you need to pull out the date.

For the address split:

Column D contains full US addresses in the format "Street, City, State ZIP". Write formulas to extract just the city into column E and the two-letter state code into column F.

Gemini will usually reach for REGEXEXTRACT or a combination of SPLIT and INDEX. The formula it generates is often something a mid-level Sheets user would spend 20 minutes figuring out. You get it in seconds.

For pulling dates out of freeform notes:

Column G has free-text notes. Some cells contain a date written as a month name followed by a day and year, like "March 15" or "April 3, 2024". Write a formula to extract the date if one exists, or return blank if there is none.

Gemini may acknowledge this is tricky with formulas alone and suggest using Apps Script instead. That is a honest, useful answer. When the task genuinely needs scripting, it will tell you β€” and sometimes it will write the Apps Script too.

Using Apps Script for Bulk Transformations

Some cleanup tasks are too complex for a single formula. Gemini can write Google Apps Script that runs directly in your spreadsheet without you needing to know JavaScript deeply.

Open the Gemini sidebar and ask:

Write a Google Apps Script function that loops through column A from row 2 to the last row, trims whitespace from each cell, removes any trailing periods, and converts the text to title case. Add a menu item called "Clean Names" that runs this function.

Gemini produces a complete script. You copy it, open Extensions > Apps Script, paste it into the editor, and click Save. Reload the sheet and you will see a "Clean Names" menu item appear. Click it and it runs on the whole column in one pass.

function cleanNames() {
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  const lastRow = sheet.getLastRow();
  const range = sheet.getRange(2, 1, lastRow - 1, 1);
  const values = range.getValues();

  const cleaned = values.map(([name]) => {
    if (!name) return [""];
    return [
      name.trim()
        .replace(/\.$/, "")
        .toLowerCase()
        .replace(/\b\w/g, c => c.toUpperCase())
    ];
  });

  range.setValues(cleaned);
}

function onOpen() {
  SpreadsheetApp.getUi()
    .createMenu("Cleanup Tools")
    .addItem("Clean Names", "cleanNames")
    .addToUi();
}

Always review the script before running it. Check that it targets the correct column and sheet, and test it on a copy of your data first. Gemini-generated Apps Script is usually correct but not infallible.

Deduplicating and Flagging Duplicates

Removing duplicates is a built-in Sheets feature under Data > Data cleanup > Remove duplicates. But flagging potential duplicates β€” where names are slightly different spellings of the same person β€” is harder. Gemini can help you build a formula-based approach.

I have a list of company names in column A. Some entries are the same company but spelled slightly differently β€” like "Acme Corp" and "Acme Corporation". Write a formula that checks each name against all other names and flags it if a similar name exists elsewhere in the column.

Gemini will likely recommend a fuzzy-match approach using COUNTIF with wildcards for simple cases, and flag that true fuzzy matching in Sheets requires Apps Script or an external tool. It may suggest checking for partial substring matches as a first pass, which catches a large percentage of duplicates with minimal effort.

Common Pitfalls to Avoid

Vague prompts produce vague results

"Clean up column B" tells Gemini almost nothing. The more context you give β€” what the current format looks like, what the target format is, and any edge cases you already know about β€” the more accurate the output. Paste one or two example values directly in your prompt.

Applying output without checking it

Gemini's formula suggestions are a starting point, not a guaranteed solution. Always test on a small range before applying to the full dataset. Pay attention to rows with blank cells, unusual characters, or values that fall outside the patterns you described.

Overwriting your original data too early

Do the transformation in a helper column first. Once you have verified the output looks correct on a representative sample, paste-as-values back to the source column. Keep a backup tab of the original data until you are confident the cleanup is complete.

Ignoring the Apps Script option

Some people try to force a formula-only solution on a task that really needs a script. If Gemini suggests Apps Script, take that suggestion seriously. A 20-line script that runs in two seconds is more reliable than a 200-character nested formula that breaks on edge cases.

Not specifying locale

Date and number formats vary by locale. If your sheet has a mix of US and European date formats, tell Gemini that explicitly. Otherwise the formula it generates may interpret 04/05/2024 as April 5th when you need May 4th.

Wrapping Up

Gemini in Google Sheets is most useful as a pair-programmer for data transformation β€” it handles the formula-writing and scripting work so you can focus on verifying the output makes sense. Here are your next steps:

  • Pick one messy column in a real dataset and run a Gemini prompt on it today. Starting small builds intuition fast.
  • Save your best prompts in a notes doc. A prompt that worked for phone number standardization will work again next month on a different export.
  • Explore Apps Script output for any task Gemini says is too complex for a formula. Read through the generated code β€” even a basic understanding of what it does makes you more confident running it.
  • Keep a backup tab before any bulk transformation. Sheets version history helps, but an explicit copy costs nothing and saves frustration.
  • Combine Gemini with native Sheets features β€” use Data Cleanup to remove obvious duplicates first, then use Gemini for the nuanced transformations that built-in tools cannot handle.

πŸ“€ Share this article

Sign in to save

Comments (0)

No comments yet. Be the first!

Leave a Comment

Sign in to comment with your profile.

πŸ“¬ Weekly Newsletter

Stay ahead of the curve

Get the best programming tutorials, data analytics tips, and tool reviews delivered to your inbox every week.

No spam. Unsubscribe anytime.