Check whether your URLs are Indexed

Get the index status of a list of URLs. 600 requests per month, for free. πŸš€

Check out the GIF to see it in action πŸŽ₯


How to add the Script to Google Sheets

1. Sign up to RapidAPI

Sign up to RapidAPI, an API marketplace, at https://rapidapi.com/auth/sign-up (you can even use your Google Account).


2. Subscribe to the Google Search API

Once you’re all set up on RapidAPI head to https://rapidapi.com/apigeek/api/google-search3/pricing and subscribe to the Basic Plan (Free). This plan has a hard 600 requests per month limit.

For those interested, I tested a lot of proxy scrapers but most resulted in errors and they were painfully slow.

In Google Sheets we’re unfortunately limited by Google Apps Scripts 6 minute execution time, so a proxy crawler just didn’t make sense.

The team at BuiltVisible have built an index checker using a proxy crawler in this way but using Node.js. Their script also accommodates “non-clean” URLs, such as parameterised URLs, URLs with encoded characters (i.e. non-ASCII characters) and symbols, URLs with varied letter casing and URLs with unsafe characters. Mine does not.

I tip my hat to them 🎩 πŸ”₯


3. Get your API Key

The good thing about RapidAPI is that your API key can be used universally across any of the APIs in their marketplace. To copy your API key, go to https://rapidapi.com/developer/dashboard and head down to ‘My Apps’ and then ‘Security’. From there, you can copy your Application Key.


4. Copy the script below:

The script below also includes a cool script from https://script.gs/visually-display-status-when-looping-through-google-sheets-data/ that visually displays the status of each fetch as it’s called. πŸ™Œ

function onOpen() {
  const ui = SpreadsheetApp.getUi();
  ui.createMenu('Index Checker')
    .addItem('Check URLs', 'indexchecker')
    .addToUi();
}

function indexchecker() {

  // add API key from https://rapidapi.com/apigeek/api/google-search3
  const apiKey = "";

  // get active spreadsheet
  const ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();

  // function to get last row of single column
  const getLastRowCol = (range) => {
    let rowNum = 0;
    let blank = false;
    for (let row = 0; row < range.length; row++) {

      if (range[row][0] === "" && !blank) {
        rowNum = row;
        blank = true;

      } else if (range[row][0] !== "") {
        blank = false;
      };
    };
    return rowNum;
  }

  // range variables to make data can always be appended
  const indexRange = ss.getRange('B:B').getValues();
  const indexLastRow = getLastRowCol(indexRange) + 1;
  const lastRowDiff = ss.getLastRow() - indexLastRow + 1;
  const urlRange = ss.getRange(indexLastRow, 1, lastRowDiff).getValues();


  // function to fetch SERPs from rapidapi
  const fetchSerps = (url) => {
    try {

      const options = {
        'method': 'GET',
        'contentType': 'application/json',
        'headers': {
          'x-rapidapi-key': apiKey,
          'x-rapidapi-host': 'google-search3.p.rapidapi.com'
        }
      };

      const response = UrlFetchApp.fetch(`https://google-search3.p.rapidapi.com/api/v1/search/q=site:${url}`, options);
      const content = JSON.parse(response.getContentText());

      const organicResults = content.results;
      
      // if there are no results from site query, return not indexed
      return organicResults.length === 0 ? 'Not indexed' : 'Indexed';

    } catch (e) {
      return 'No response';
    }
  }

  // loop over remaining URLs and set values while the loop is running
  urlRange.forEach(function (row, i) {
    row.forEach(function (col) {

      // visually display a fetch status. Script inspired by https://script.gs/visually-display-status-when-looping-through-google-sheets-data/
      ss.getRange(indexLastRow + i, 3).setValue("Loading...");
      
      SpreadsheetApp.flush();

      const check = fetchSerps(col);

      ss.getRange(indexLastRow + i, 2).setValue(check);
      ss.getRange(indexLastRow + i, 3).setValue("Done");

      SpreadsheetApp.flush();

    });
  });

}

5. Head over to Google Sheets

Or if you’re really smart, create a new sheet by going to: https://sheets.new

Select Script editor from the Tools menu.

Paste the script and replace line 11 with your API key:

  const apiKey = 'addapikeyhere';

Save it.


6. Run the script by heading to the ‘Index Checker’ menu and selecting ‘Check URLs’

To run the script you’ll need to make sure:

1. You have three headings: URLs (A1), Index Check (B1) and Status (C1)

2. You add all of your URLs in Column A, beneath the URLs heading.

See the gif at the beginning of the guide for more detail.

The script checks the index status of URLs by querying Google with the site: operator. eg. site:example.com/subfolder/page

It’s certainly not perfect. This is how Google describes it:

TheΒ site: operator doesn’t necessarily return all the URLs that are indexed under the prefix specified in the query. Keep this in mind if you want to use theΒ site: operator for tasks like identifying how many URLs are indexed and serving under a prefix.

Source: https://developers.google.com/search/docs/advanced/debug/search-operators/all-search-site

If you do want to accurately define the index status, you’d need to query each URL manually using the URL Inspection Tool (but you know this already, right πŸ˜‰)


Thanks for stopping by πŸ‘‹

I’m Andrew Charlton, the Google Sheets nerd behind Keywords in Sheets. πŸ€“

Questions? Get in touch with me on social or comment below πŸ‘‡


More Scripts

Submit your response

Your email address will not be published. Required fields are marked *

2 Responses on this post

  1. Andrew, amazing solution. I’ve only one question: does this work well with the captcha at Google? I’ve tried running an indexer checker manually from Google Sheets and it reported almost always an error because of the captcha.

    1. No issue with captcha πŸ™‚ The API works fine because it likely uses revolving proxies and captcha management solutions.