8 Things Ahrefs' Billion-Data-Point AI Search Study Really Tells Us

Contents

1. List-format content gets cited most, but “make more lists” is the wrong takeaway

2. “67% of citations are uninfluential”, read the denominator before you internalize this

3. The top-10/AIO citation overlap has fallen from 76% to 38%, and the direction matters more than the exact number

4. 85% of pages ChatGPT retrieves are never cited, and almost no one is measuring that gap

5. Schema didn’t move citations, but the study can only tell you so much

6. YouTube is the most-cited domain in Google AI Overviews, with a platform-specific asterisk

7. AI Overviews are URL-volatile, but the answers stay consistent, which changes what you should be measuring

8. ChatGPT has 12% of Google’s search volume, but sends 190x less traffic

The thread running through all of this

Key Takeaways

List-format content gets cited disproportionately in AI Overviews, but the data doesn’t tell you to start making lists. It tells you to become a brand that other people’s lists include.
67% of ChatGPT’s top citations are domains most brands can’t influence, but that stat is measuring the whole corpus, including every query where a brand was never the right answer. Read it carefully before you internalize it.
The top-10/AIO citation overlap has dropped from 76% to 38% in under a year. AI search and Google search are now running on partially separate discovery tracks, and your measurement model probably hasn’t caught up.
85% of the pages ChatGPT retrieves never appear in the final answer. Most GEO advice stops at retrieval. That’s the wrong place to stop.
Schema markup didn’t move citation numbers in Ahrefs’ controlled study, but the study only tested pages already being cited 100+ times. What it tells us about pages outside that set is limited.
YouTube is the most-cited domain in Google AI Overviews, with 20.9% citation share and 34% growth in six months. Quattr’s own AIO panel shows the same directional shift: video-format AIO query presence grew 44% between May 21 and May 31. But this is still largely a Google AIO story. Gemini and Copilot barely cite YouTube at all.
ChatGPT has 12% of Google’s search volume, but sends 190x less traffic to websites. Volume and traffic are not the same thing in AI search.

Ahrefs has published a lot of AI search research over the past six months. Taken together, the studies cover over a billion data points across AI Overviews, ChatGPT, AI Mode, Perplexity, and brand visibility signals. The CEO posted a ten-point summary thread that got widely shared, and most of the coverage that followed treated it as a clean playbook: do these things, get cited more.

I don’t think that’s the right read. Some of the findings are strong and backed up by other research. Some have limitations that the original posts acknowledge but the summaries drop. And at least one stat, the most-cited one, is almost certainly being misread by the people building strategies around it.

Here’s what I actually think the data says.

1. List-format content gets cited most, but “make more lists” is the wrong takeaway

Ahrefs found that “best X” list-format content is disproportionately cited in AI Overviews. This finding holds up. Independent benchmark data from Averi identifies “mentions in best listicles” as one of the top five consistent citation drivers across LLMs, so it’s not something only Ahrefs is seeing.

But there’s a difference between understanding why something gets cited and knowing what to do about it.

The brands appearing inside those lists aren’t the ones who wrote the lists. They’re the ones with sufficiently recognized authority, product presence, or third-party coverage to be included by list-makers. The citation went to the list. The underlying signal came from the brand being worth including.

Optimizing your own content format to look like a list is a second-order move. Earning the kind of presence that gets you named inside other people’s lists is the first-order one. The data support the pattern. What you do with it is a strategic judgment the data doesn’t make for you.

2. “67% of citations are uninfluential”, read the denominator before you internalize this

This is the finding that’s been referenced most, and the one most likely to lead teams in the wrong direction.

Ahrefs found that 67% of ChatGPT’s top 1,000 cited domains (Wikipedia, major news homepages, institutional sites) are effectively off-limits to most brands. The framing in most coverage: two-thirds of the AI citation landscape is already decided.

Anurag Singhal, Quattr’s founder, put the problem with this read directly: the study looks at the top 1,000 citations across all queries. That population naturally includes thousands of queries where a brand was never a plausible answer: navigational queries, informational queries, and queries with no commercial intent. Wikipedia and CNN are going to dominate that mix by construction.

The number that would actually inform a content strategy is: of the citations in your category, for queries where your brand is a reasonable answer, what percentage are genuinely contestable? That’s not a question this stat answers.

67% may be accurate as a corpus-level average. It’s not a ceiling for what your brand can achieve in the space where you actually compete. Don’t let it function as one.
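
To make the denominator point concrete, here is a minimal sketch with invented numbers. The query types, the "closed"/"open" labels, and every count are assumptions chosen only so that the corpus-level figure lands at 67% uninfluenceable; none of it comes from the Ahrefs dataset.

```python
from collections import Counter

# Hypothetical citation log: (query_type, domain_class).
# "closed" = domains most brands cannot influence (e.g. Wikipedia, major news).
# "open"   = domains where a brand could plausibly earn a mention.
citations = (
    [("navigational", "closed")] * 40
    + [("informational", "closed")] * 22
    + [("informational", "open")] * 8
    + [("commercial", "closed")] * 5
    + [("commercial", "open")] * 25
)

def contestable_share(rows):
    """Fraction of citations that go to 'open' domains."""
    counts = Counter(domain_class for _, domain_class in rows)
    return counts["open"] / len(rows)

# Same data, two denominators.
corpus_share = contestable_share(citations)
category_share = contestable_share(
    [row for row in citations if row[0] == "commercial"]
)

print(f"whole corpus: {corpus_share:.0%} contestable")
print(f"commercial queries only: {category_share:.0%} contestable")
```

In this toy data, 33% of citations are contestable across the whole corpus, but 83% are contestable once the denominator is limited to commercial queries. The real split in any given category is unknown until you measure it.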

3. The top-10/AIO citation overlap has fallen from 76% to 38%, and the direction matters more than the exact number

In mid-2025, Ahrefs found that 76% of pages cited in Google AI Overviews also ranked in the top 10 for the same query. By early 2026, that number had dropped to 38%. BrightEdge puts the overlap even lower, around 17%, depending on methodology.

Ahrefs is upfront about one thing worth knowing here: some of this drop reflects improved citation detection in their tooling, not purely a change in Google’s behavior. The two datasets aren’t directly comparable on a one-to-one basis.

Even accounting for that, the direction is unambiguous. AI Overview citations are increasingly pulling from pages outside the top 10, and in some cases from pages with no meaningful Google organic visibility at all. The Ahrefs data puts that last category at 28.3% of cited pages. Whether or not the precise number moves as measurement improves, the structural point stands: AI search is surfacing content that traditional ranking systems wouldn’t.

If your visibility measurement only tracks rankings, you have a blind spot that’s getting larger. That’s the practical implication, regardless of where the exact percentages settle.
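
As a rough illustration of the overlap metric itself, the sketch below computes, for one query, the share of AI-cited URLs that also sit in the organic top 10. The function name, URLs, and rankings are all hypothetical, and Ahrefs’ actual methodology may differ.

```python
def top10_overlap(cited_urls, ranked_urls):
    """Share of AI-cited URLs that also rank in the organic top 10.

    ranked_urls is assumed to be ordered by organic position, best first.
    """
    cited = set(cited_urls)
    if not cited:
        return 0.0
    top10 = set(ranked_urls[:10])
    return len(cited & top10) / len(cited)

# Hypothetical data for a single query.
ranked = [f"https://example.com/page-{i}" for i in range(1, 21)]
cited = [
    "https://example.com/page-2",    # ranks in the top 10
    "https://example.com/page-7",    # ranks in the top 10
    "https://example.com/page-14",   # ranks, but outside the top 10
    "https://example.org/unranked",  # no organic visibility for this query
]

overlap = top10_overlap(cited, ranked)
print(f"{overlap:.0%} of cited URLs rank in the top 10")
```

A rank tracker only ever sees the first two URLs in this example. The other two citations are the blind spot.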

4. 85% of pages ChatGPT retrieves are never cited, and almost no one is measuring that gap

This is the finding that got the least coverage and may matter most operationally.

Ahrefs showed that ChatGPT retrieves far more pages than it cites in the final answer. Independent research from AirOps, covered by Search Engine Land, puts a number on it: across 548,000+ pages retrieved, only 15% appeared in the final response.

Retrieval is one filter. Citation is a second, harder filter. Most current GEO advice (structured content, schema, crawlability, E-E-A-T signals) is about clearing the first filter. Very little of it addresses what determines whether a retrieved page survives the cut to citation.

The AirOps research points to a few signals that appear to influence the citation selection: title-to-query alignment, content positioning (44% of citations draw from the first 30% of a page), and freshness (content updated in the past three months is cited roughly twice as often). These come from a separate study, not the Ahrefs dataset, so they’re worth knowing about but not worth treating as settled rules.

The broader point holds regardless: if you don’t know how often your pages are retrieved versus cited, you don’t have a complete picture of where the leak is.
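
One way to frame that leak is a per-page citation rate: citations divided by retrievals. The sketch below assumes you can count both for each page, which depends entirely on your tooling. The `PageStats` class, the paths, and all numbers are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class PageStats:
    url: str
    retrieved: int  # times an AI system fetched the page (assumed measurable)
    cited: int      # times the page appeared as a citation in an answer

    @property
    def citation_rate(self) -> float:
        return self.cited / self.retrieved if self.retrieved else 0.0

# Invented numbers for illustration only.
pages = [
    PageStats("/pricing-guide", retrieved=400, cited=120),
    PageStats("/integration-docs", retrieved=900, cited=45),
    PageStats("/glossary", retrieved=50, cited=0),
]

total_retrieved = sum(p.retrieved for p in pages)
total_cited = sum(p.cited for p in pages)
print(f"overall: {total_cited / total_retrieved:.0%} of retrievals end in a citation")

# Lowest citation rate first: these pages clear retrieval but lose at citation.
for page in sorted(pages, key=lambda p: p.citation_rate):
    print(f"{page.url}: {page.citation_rate:.0%}")
```

In this example the most-retrieved page has one of the lowest citation rates, which is the pattern a retrieval-only view would hide.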

5. Schema didn’t move citations, but the study can only tell you so much

Ahrefs tracked 1,885 pages that added JSON-LD schema, compared them against 4,000 control pages, and found no meaningful citation increase across AI Overviews, AI Mode, or ChatGPT.

The finding is probably right for what it tested. But there’s an important limit to what it tells us: every page in the study was already receiving 100+ AI Overview citations before schema was added. These were already pages that AI systems had noticed and were regularly pulling from.

The study doesn’t tell us whether schema helps a page get noticed in the first place. It tells us that for pages that are already being cited heavily, adding schema doesn’t increase their citation frequency. Those are different questions.

Schema may still help in other ways: making it easier for AI systems to read a page accurately, improving how a page is represented when it is cited, and supporting rich results outside of AI. Whether it does those things is something this study wasn’t set up to measure.

6. YouTube is the most-cited domain in Google AI Overviews, with a platform-specific asterisk

YouTube now holds 20.9% of all Google AI Overview citations, according to Ahrefs’ most recent domain tracking, and that share has grown 34% over six months. It’s the most-cited domain in AIOs, ahead of Wikipedia and Reddit.

The Ahrefs Brand Radar research found that YouTube mentions, across titles, transcripts, and descriptions, are the strongest correlating factor with AI Overview brand visibility across 75,000 brands studied.

Two things worth knowing before you build a YouTube strategy around this:

First, this is primarily a Google AI Overviews story. Gemini and Copilot cite YouTube at a fraction of that rate. If your AI visibility priority is ChatGPT or Perplexity, the YouTube signal is considerably weaker.

Second, correlation isn’t prescription. The finding shows that brands with strong YouTube presence tend to show up more in AIOs. It doesn’t isolate YouTube as a causal driver versus a proxy for broader brand authority. Brands with a strong YouTube presence often have a strong presence elsewhere, too.

What’s defensible: YouTube is a real surface in the Google AIO citation ecosystem, and if your brand has no YouTube footprint, you’re absent from a channel Google is actively pulling from.
What’s less clear: whether creating YouTube content specifically for AI visibility, absent an audience-driven reason to do it, moves the needle.

Quattr’s cross-client SERP panel data supports the directional finding, and adds a timing dimension.

[Figure: AIO Query Presence by Content Format, May Core Update Window. Source: Quattr Cross-Client SERP Panel, USA, May 17–31, 2026. Number of queries (out of a fixed panel of 126K+) where each content format (Video, Article, Forum, Homepage, Social) appeared in an AI Overview, by day.]

Video AIO query presence grew +44% across the May Core Update window, from 11,760 queries on May 21 to 16,957 by May 31. The Day 4 surge (May 25) drove the sharpest jump: +3,559 queries in 24 hours. Article followed at +27%. Homepage and social were flat. Observed slot presence only; no CTR or traffic attribution implied.

The panel tracks AIO query presence (the number of queries where each content format appears in an AI Overview) across 126,000+ fixed US queries. Video AIO query presence stood at 11,760 on May 21, the day the May 2026 Core Update launched.

By May 31, it had reached 16,957. That’s a +44% gain in ten days, with the sharpest single-day jump occurring on Day 4 of the update (May 25), when video AIO query presence moved from 12,251 to 15,810 overnight. Article content followed a similar pattern, gaining +27% in AIO query presence across the same window.
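
The arithmetic behind those panel figures can be checked in a few lines. The inputs are the query counts quoted above. The variable names are mine, and the last line (the single-day jump as a share of the total gain) is a derived figure, not one Quattr reports.

```python
def pct_change(start, end):
    """Relative change from start to end, as a fraction."""
    return (end - start) / start

# Video AIO query counts quoted in the text.
video_may21, video_may31 = 11_760, 16_957
before_jump, after_jump = 12_251, 15_810

total_gain = video_may31 - video_may21
jump = after_jump - before_jump

print(f"window growth: {pct_change(video_may21, video_may31):.1%}")  # 44.2%
print(f"single-day jump: +{jump:,} queries")                         # +3,559
print(f"jump as share of total gain: {jump / total_gain:.0%}")       # 68%
```

So roughly two-thirds of the ten-day gain arrived in a single day, which is why the timing matters as much as the total.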

Video is the format whose AI Overview presence the update expanded most. Whether that’s YouTube specifically or video-format content more broadly is a question worth watching as the update fully resolves.

7. AI Overviews are URL-volatile, but the answers stay consistent, which changes what you should be measuring

Ahrefs tracked AI Overview responses over time and found that while the specific URLs cited rotate frequently, the semantic content of the answers remains largely stable. Different pages, same meaning.

This finding makes intuitive sense given how AI Overviews work: the system synthesizes an answer and selects supporting citations, rather than promoting specific pages to a fixed position. But I haven’t seen it backed up by other research yet, so I’d treat it as a directional observation rather than a settled fact.

What it suggests, if it holds: a binary “is my URL cited or not” metric misses the actual question, which is whether your brand’s perspective is represented in the answer. That’s harder to measure, but it may be the more meaningful metric. Whether you track it that way depends partly on your tooling and partly on how much weight you give a finding that has yet to be replicated.
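
To show the difference between URL volatility and answer stability, the sketch below compares two snapshots of the same query in two ways: overlap of cited URLs and overlap of answer wording. The snapshots are invented, and word-level Jaccard is only a crude lexical stand-in for semantic similarity. It is not how Ahrefs measured consistency.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical snapshots of one AI Overview on two different days.
day1_urls = ["a.com/x", "b.com/y", "c.com/z"]
day2_urls = ["a.com/x", "d.com/q", "e.com/r"]

day1_answer = "Schema markup helps search engines understand page content"
day2_answer = "Schema markup helps search engines interpret page content"

url_overlap = jaccard(day1_urls, day2_urls)
text_overlap = jaccard(day1_answer.lower().split(), day2_answer.lower().split())

print(f"URL overlap: {url_overlap:.2f}")
print(f"answer wording overlap: {text_overlap:.2f}")
```

Here the cited URLs barely overlap while the wording is nearly unchanged. A URL-only metric would report this as a large change, and an answer-level metric would report almost none.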

8. ChatGPT has 12% of Google’s search volume, but sends 190x less traffic

The volume comparison is real: Ahrefs calculated ChatGPT at roughly 12% of Google’s search volume for traditional search-like queries. Other estimates put it higher, around 17-18% of Google’s daily query count, depending on methodology.

What most summaries of this finding understate: Google sends 190 times more traffic to websites than ChatGPT, despite that volume gap being relatively modest. ChatGPT keeps users in the conversation. It answers the question rather than routing people to a page.
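
A back-of-the-envelope calculation shows how far apart those two figures are. It assumes the 12% volume share and the 190x traffic multiple measure comparable things (queries and referral visits), which the source does not confirm, so treat the result as an order-of-magnitude illustration.

```python
volume_share = 0.12     # ChatGPT query volume as a share of Google's (Ahrefs figure)
traffic_multiple = 190  # Google sends 190x the website traffic ChatGPT does

# If referrals scaled with query volume, Google would send about 8x more traffic.
expected_multiple = 1 / volume_share

# The remainder is the gap in referral traffic per query.
per_query_gap = traffic_multiple / expected_multiple

print(f"expected traffic multiple from volume alone: {expected_multiple:.1f}x")
print(f"implied per-query referral gap: {per_query_gap:.1f}x")
```

Under those assumptions, a ChatGPT query sends roughly 23 times less referral traffic than a Google query. The shortfall is not explained by volume.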

That doesn’t mean ChatGPT visibility is unimportant. Brand mentions inside ChatGPT answers, even without a click, are a form of visibility that influences perception and recall. The question is: what kind of return are you optimizing for? If you’re measuring AI search success in referral traffic, ChatGPT will disappoint you regardless of how often your brand appears. If you’re measuring brand presence in the answers people get, it looks different.

The more interesting data point may be the trajectory. ChatGPT’s share has actually declined relative to Gemini over the past year, with Gemini more than doubling its share between early 2025 and early 2026. The AI search landscape is less stable than the “12%” headline implies.

The thread running through all of this

The Ahrefs research is genuinely useful. But the most common failure mode in how it gets applied is treating descriptive findings as prescriptive ones: seeing “list content gets cited” and deciding to make lists, or seeing “YouTube correlates with AIO visibility” and deciding to start a channel.

The more useful read is structural. AI search systems retrieve a huge number of pages and cite very few. The selection criteria favor fresh content, recognized brands, pages that can be grounded accurately, and sources with presence across multiple surfaces. The teams that show up consistently aren’t gaming any one signal; they’re building the kind of underlying authority that scores well across all of them.

What that means for your specific content strategy depends on where your gaps are. The data points toward the gaps. Closing them is the work.

Want to know where your brand actually stands in AI search, not just Google rankings? Quattr’s AI Search Visibility platform tracks citation share, mention coverage, and brand visibility gaps across AI Overviews, ChatGPT, and Perplexity.


About the Author

Mahi Kothari

Mahi Kothari is a Senior Content Strategist at Quattr, an AI-powered SEO platform built for brands competing across both traditional search and AI-generated answers. She works at the intersection of content strategy, technical SEO, and AI visibility, and has spent 5+ years building the systems behind content programs that compound over time, not just the content itself. Her foundational belief: most content programs underperform not because of weak writing, but because the infrastructure behind the writing is treated as an afterthought, the internal linking logic, the refresh cycles, the schema implementation, the architecture decisions made alongside developers.

Track record

Before Quattr, Mahi led content and SEO at a B2B SaaS company where she built the program from the ground up. In two years:

∙ Organic traffic grew from ~2,000 to 53,000 monthly visits
∙ Keyword footprint expanded from ~4K to 32K
∙ Domain rating moved from 32 to 67
∙ 300+ content assets managed end-to-end, from brief to publish
∙ Team of 7 writers hired, briefed, and overseen across the full editorial pipeline
∙ Article and HowTo schema implemented across 200+ pages
∙ 100+ high-authority backlinks built through guest posts, with no paid placements
∙ Full site migration to WordPress executed in direct collaboration with developers, including crawl issue resolution and site architecture restructuring

What she focuses on at Quattr

At Quattr, Mahi covers the topics that sit at the frontier of how search is actually evolving: Answer Engine Optimization (AEO), Generative Engine Optimization (GEO), LLM SEO, and AI visibility, specifically what it takes for a brand to surface in responses from ChatGPT, Gemini, and Perplexity, not just rank in traditional SERPs. She builds the workflows she writes about, including automation pipelines in n8n and content structured deliberately around how large language models retrieve and interpret information.

Her writing spans the full funnel: foundational explainers on how AI search works, BOFU content that helps teams evaluate tools and make buying decisions, and operational content on internal linking at scale, content refresh frameworks, and AI visibility measurement.

Credentials

BBA degree. Pursuing an AI-Enabled Digital Marketing & MarTech certification from IIT Roorkee. HubSpot certified in Marketing Hub and AI for Marketers.

About Quattr

Quattr is an AI-native Search Visibility Platform founded in Palo Alto, California, built for mid-market and enterprise brands competing in the age of generative search. Recently recognized across G2's Spring 2026 reports with #1 rankings in AEO Results, Usability, and Relationship, Quattr helps brands win visibility across traditional search and AI-generated answer surfaces.

Quattr's AI agent, GIGA, evaluates content the way AI systems do, identifying gaps across structure, authority, internal linking, and discoverability to surface the highest-impact fixes. With capabilities like autonomous internal linking, E-E-A-T intelligence, and the new GIGA Landing Page Generator for keyword-matched, AI-search-ready pages, Quattr helps teams move from diagnosis to deployed changes without manual bottlenecks.