If a page on your site isn't showing up in Google Search, the first question is whether Googlebot has even tried to crawl it. The second question is whether the crawl attempt succeeded. Google Search Console's Crawl Stats report answers both questions, at least at the aggregate level, and it's one of the most underused diagnostic tools in technical SEO.
This guide walks through how to navigate to the report, what each section shows, how to read the data to identify specific problems, and what to do when you find them.
Step 1: Navigate to the Crawl Stats Report
Open Google Search Console and select your property. In the left sidebar, go to Settings. Under "Crawling," you'll find the Crawl Stats report. Click "Open Report" to view the full data.
The report shows data for the last 90 days. It updates daily, so changes you make to your site (new robots.txt rules, sitemap updates, redirect fixes) will start appearing in the data within a few days, though their full effect may take weeks to show.
Step 2: Read the Crawl Requests Graph
The top of the report shows a graph of total crawl requests per day. This is the first number to understand: how many pages is Googlebot crawling on your site each day, and is that number changing over time?
For most sites, the trend should be relatively stable, with spikes around major content publishing events and gradual increases as the site grows. Patterns that indicate problems:
Sharp drops in crawl volume often mean Googlebot was blocked -- by a robots.txt change, a server configuration error, or a DNS issue. If you see a cliff in the graph, check your server logs and robots.txt for changes around that date.
Sustained high crawl volume relative to page count often means Googlebot is crawling a large number of low-value URLs. If your site has 10,000 pages but Googlebot is making 50,000 requests per day, it's spending most of its time on URL variants, redirects, or URLs that aren't in your sitemap.
A declining trend over time can indicate that Googlebot is finding fewer new or updated pages and is reducing its crawl frequency. This is sometimes appropriate (a stable site that doesn't publish often), but can also indicate that content isn't being discovered because it's buried in the site structure.
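If you want to confirm exactly when a cliff began, the raw server access log is the other half of the picture. Below is a minimal sketch, assuming a combined-format Nginx/Apache log at a hypothetical path, that counts Googlebot requests per day; user-agent matching alone can be spoofed, so verify hits with a reverse DNS lookup before drawing firm conclusions.

```python
# Minimal sketch: count daily Googlebot requests in an access log to spot the
# date a crawl cliff began. Assumes a combined-format log at a hypothetical
# path and trusts the user-agent string; verify hits via reverse DNS for rigor.
import re
from collections import Counter
from datetime import datetime

LOG_PATH = "/var/log/nginx/access.log"           # assumption: adjust to your server
DATE_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")  # matches e.g. [10/Oct/2024:13:55:36

daily_hits = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:              # crude filter on the user agent
            continue
        match = DATE_RE.search(line)
        if match:
            day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
            daily_hits[day] += 1

for day in sorted(daily_hits):
    print(f"{day}: {daily_hits[day]} Googlebot requests")
```

A sharp drop in this count on the same date the Crawl Stats graph falls off points you to the deploy, robots.txt change, or DNS event to investigate.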
Step 3: Analyze the Response Code Breakdown
Below the crawl volume graph, Crawl Stats shows a breakdown by response code. This is where the diagnostic detail lives.
2xx (Success): Pages that returned a successful response. The proportion of your total crawl requests that return 2xx should be high -- ideally above 90%. If it's significantly lower, other response codes are consuming crawl budget.
3xx (Redirects): Pages that returned a redirect. A small percentage of redirects is expected, but a high percentage indicates redirect chains. Every redirect response is a crawl request that didn't result in a content crawl. If 20% of your daily crawl requests return redirects, Googlebot is spending a fifth of its crawl budget just following pointers to other URLs.
4xx (Client Errors): Pages that returned 404 or similar errors. A small number of 404s is normal -- Googlebot follows external links to your site that may point to deleted pages. A large number suggests dead internal links or a sitemap that hasn't been cleaned up after page deletions.
5xx (Server Errors): Pages that returned server errors. Any sustained volume of 5xx responses is a problem. It means pages that should be accessible are failing, Googlebot is wasting crawl budget on them, and the errors may be reducing Googlebot's crawl rate for your domain.
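The same breakdown can be spot-checked outside Search Console. The sketch below, which assumes a hypothetical sample of URLs and the third-party `requests` package, counts every hop in a redirect chain as its own request, mirroring how each redirect response consumes a crawl request.

```python
# Minimal sketch: reproduce the response-code mix for a sample of URLs and
# flag redirect chains. The URL list is an assumption; in practice pull it
# from your sitemap or from the example URLs shown in the Crawl Stats report.
import requests
from collections import Counter

urls = [
    "https://example.com/",
    "https://example.com/old-page",
    "https://example.com/products?sort=price",
]

status_mix = Counter()
for url in urls:
    resp = requests.get(url, allow_redirects=True, timeout=10)
    # Count every hop: each redirect response is its own crawl request.
    for hop in list(resp.history) + [resp]:
        status_mix[f"{hop.status_code // 100}xx"] += 1
    if len(resp.history) > 1:
        print(f"Redirect chain ({len(resp.history)} hops): {url} -> {resp.url}")

total = sum(status_mix.values())
for bucket in sorted(status_mix):
    print(f"{bucket}: {status_mix[bucket]} requests ({status_mix[bucket] / total:.0%})")
```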
Step 4: Check the Response Time Data
The report also shows average response time for crawled pages. Faster response times allow Googlebot to crawl more pages per session. Google's crawl budget documentation specifically notes that slow response times reduce crawl rate.
Watch for response time spikes that correlate with drops in crawl volume -- this is a direct signal that server performance is affecting how aggressively Googlebot crawls your site. Sustained response times above 500ms are worth investigating; anything above 1000ms is likely affecting your crawl rate.
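To get a rough sense of whether your pages sit above those thresholds, you can time a handful of representative URLs yourself. This is a sketch under the assumption that a single client's measurement approximates what Googlebot sees; the URL list is illustrative.

```python
# Minimal sketch: time a sample of URLs and flag slow responses using the
# 500ms / 1000ms thresholds above. This measures server response time as seen
# by one client, not Googlebot's own figure, so treat it as a rough proxy.
import requests

urls = [
    "https://example.com/",
    "https://example.com/category/widgets",
    "https://example.com/blog/latest-post",
]

for url in urls:
    resp = requests.get(url, timeout=10)
    ms = resp.elapsed.total_seconds() * 1000   # time until the response arrived
    flag = "SLOW" if ms > 1000 else ("check" if ms > 500 else "ok")
    print(f"{ms:7.0f} ms  {flag:5}  {url}")
```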
Step 5: Review the File Type Breakdown
The file type section shows what types of resources Googlebot is crawling: HTML pages, images, JavaScript files, CSS files, and so on. For crawl budget purposes, you care most about HTML, but the breakdown tells you if Googlebot is spending requests on resource files.
If Googlebot is making thousands of requests for JavaScript or CSS files, this typically means those files aren't cached properly or are being changed frequently. Googlebot needs to periodically recrawl resource files to render JavaScript-heavy pages, so some of this is expected, but unusually high resource file crawl volume is worth checking.
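One quick check is whether your static resources send long-lived caching headers. The sketch below, with hypothetical resource URLs, prints the `Cache-Control`, `ETag`, and `Last-Modified` headers so you can see whether anything invites frequent refetching.

```python
# Minimal sketch: inspect caching headers on static resources that Googlebot
# is recrawling heavily. Resource URLs are assumptions; some servers do not
# answer HEAD requests, in which case switch to requests.get().
import requests

resources = [
    "https://example.com/static/app.js",
    "https://example.com/static/site.css",
]

for url in resources:
    resp = requests.head(url, timeout=10)
    print(url)
    print(f"  Cache-Control: {resp.headers.get('Cache-Control', '(none)')}")
    print(f"  ETag:          {resp.headers.get('ETag', '(none)')}")
    print(f"  Last-Modified: {resp.headers.get('Last-Modified', '(none)')}")
```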
Step 6: Connect Crawl Stats to Indexing Gaps
Crawl Stats tells you what Googlebot crawled. The Page indexing report (formerly called Coverage, under Indexing in the left sidebar of Google Search Console) tells you which pages are actually in Google's index.
Compare the two. If Crawl Stats shows high daily crawl volume but the indexing report shows fewer indexed pages than expected, one or more of these things is happening:
- Many of the crawled URLs are redirects, errors, or noindexed pages (which Googlebot crawls but doesn't index)
- Googlebot is crawling a large number of URL variants and only indexing the canonical version
- Pages are being crawled but failing Googlebot's quality threshold for indexing
Cross-reference the indexing report's not-indexed reasons (the "Excluded" section in older versions of the report) with Crawl Stats response codes. If you see high redirect volume in Crawl Stats and a large "Page with redirect" count in the indexing report, redirect chains are almost certainly the problem.
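One way to see the URL-variant pattern concretely is to diff the URLs in your sitemap against the URLs Googlebot actually requested. A rough sketch, assuming a single urlset sitemap (not a sitemap index) and the same hypothetical combined-format access log as earlier:

```python
# Minimal sketch: find paths Googlebot requested that you never asked it to
# crawl -- parameters, legacy URLs, pagination variants, and similar budget sinks.
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

import requests

SITEMAP_URL = "https://example.com/sitemap.xml"   # assumption: a single urlset sitemap
LOG_PATH = "/var/log/nginx/access.log"            # assumption: combined-format access log

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(requests.get(SITEMAP_URL, timeout=10).content)
sitemap_paths = {urlparse(loc.text).path for loc in root.findall(".//sm:loc", ns)}

request_re = re.compile(r'"GET (\S+) HTTP')
crawled_paths = set()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" in line and (m := request_re.search(line)):
            crawled_paths.add(m.group(1))

off_sitemap = crawled_paths - sitemap_paths
print(f"{len(off_sitemap)} crawled paths are not in the sitemap; first 20:")
for path in sorted(off_sitemap)[:20]:
    print(f"  {path}")
```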
Step 7: Look for Timing Patterns
Scroll down to the "How Googlebot crawled your site" section if available, or look at the daily data points in the main graph. For sites with regular content publishing schedules, there should be crawl spikes after publication days -- Googlebot notices new sitemaps and internal links and crawls more aggressively.
If you're publishing content but not seeing corresponding crawl spikes, it can mean Googlebot's crawl priority for your domain is low because of quality signals elsewhere on the site. A crawl history dominated by low-value URLs trains Googlebot to be less aggressive about discovering new pages, because previous sessions didn't yield much indexable content.
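If you keep a list of publish dates, you can check for those spikes directly in the access log. A minimal sketch, with hand-entered publish dates standing in for whatever your CMS or sitemap `<lastmod>` values provide, and the same assumed log format as the Step 2 sketch:

```python
# Minimal sketch: compare crawl volume in the days after publication against
# the overall daily baseline. Publish dates and log path are assumptions.
import re
from collections import Counter
from datetime import date, datetime, timedelta

LOG_PATH = "/var/log/nginx/access.log"                    # assumption
publish_dates = [date(2024, 10, 1), date(2024, 10, 8)]    # assumption: from your CMS or <lastmod>

date_re = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")
daily_hits = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" in line and (m := date_re.search(line)):
            daily_hits[datetime.strptime(m.group(1), "%d/%b/%Y").date()] += 1

baseline = sum(daily_hits.values()) / max(len(daily_hits), 1)
for pub_day in publish_dates:
    window = [daily_hits.get(pub_day + timedelta(days=n), 0) for n in range(3)]
    print(f"{pub_day}: crawls over next 3 days {window} (baseline ~{baseline:.0f}/day)")
```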
Step 8: Diagnose Specific Problems
Based on what the data shows, here's how to connect Crawl Stats patterns to specific fixes:
| Pattern | Likely Cause | Fix |
|---|---|---|
| High 3xx volume | Redirect chains, old URLs in sitemap | Update redirects to go direct; clean up sitemap |
| High 4xx volume | Dead internal links, deleted pages in sitemap | Audit internal links; remove 404s from sitemap |
| High crawl volume, low index count | Parameterized URLs, duplicate content | Add robots.txt rules; add canonical tags |
| Declining crawl volume | Crawl rate throttled due to slow server or low content quality | Improve response times; reduce low-value URL crawlable pool |
| Crawl spikes at wrong times | External links from high-traffic sources | Expected behavior, not a problem |
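After applying robots.txt fixes from the table, it's worth verifying that the new rules block the parameterized URLs you intended and nothing else. Here is a small sketch using only the standard-library robots.txt parser; the URLs and expected outcomes are illustrative.

```python
# Minimal sketch: confirm that parameterized URLs are disallowed for Googlebot
# while canonical pages stay crawlable, using the live robots.txt file.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

checks = {
    "https://example.com/products?sort=price&page=3": False,  # expect blocked
    "https://example.com/products": True,                     # expect crawlable
}
for url, expect_allowed in checks.items():
    allowed = rp.can_fetch("Googlebot", url)
    status = "OK" if allowed == expect_allowed else "MISMATCH"
    print(f"{status}: can_fetch={allowed} for {url}")
```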
Step 9: Track Changes Over Time
After making crawl budget fixes -- updating robots.txt, cleaning the sitemap, fixing redirect chains -- use Crawl Stats as your measurement tool. The changes won't be instant. Give it four to six weeks to see whether the response code ratios shift and whether total crawl volume adjusts.
The metric to focus on is crawl quality, not just crawl volume. Lower total crawl volume with a higher percentage of 2xx responses and faster response times means Googlebot is spending its budget more efficiently. That's the goal.
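If you also have server logs, you can track the same quality metric between Crawl Stats refreshes. Below is a sketch that computes the weekly share of 2xx responses among Googlebot requests, assuming the same combined log format as the earlier sketches; the Crawl Stats UI remains the authoritative source.

```python
# Minimal sketch: weekly 2xx share among Googlebot requests, as a crawl-quality
# trend line between Crawl Stats checks. Log path and format are assumptions.
import re
from collections import Counter, defaultdict
from datetime import datetime

LOG_PATH = "/var/log/nginx/access.log"   # assumption: combined-format access log
line_re = re.compile(r'\[(\d{2}/\w{3}/\d{4}):[^\]]*\] "[^"]*" (\d{3}) ')

weekly = defaultdict(Counter)
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:
            continue
        m = line_re.search(line)
        if not m:
            continue
        day = datetime.strptime(m.group(1), "%d/%b/%Y").date()
        year, week, _ = day.isocalendar()
        weekly[(year, week)][m.group(2)[0] + "xx"] += 1

for (year, week) in sorted(weekly):
    mix = weekly[(year, week)]
    total = sum(mix.values())
    print(f"{year}-W{week:02d}: {mix['2xx'] / total:.0%} 2xx across {total} Googlebot requests")
```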
A deeper explanation of what causes crawl budget waste across parameterized URLs, redirect chains, thin content, and sitemap configuration -- and how to fix each category -- is covered in the comprehensive guide to crawl budget issues for complex web applications. The technical SEO team at 137Foundry uses this data-first diagnostic approach on every site audit, because Crawl Stats patterns almost always point toward the right fix before you've spent time on any manual investigation.
Additional context on crawl behavior is available through Screaming Frog SEO Spider, which simulates the crawl at the structural level and confirms patterns that Crawl Stats identifies only at the aggregate level.