№ XIII · On Evidence · 02 July 2026

The citation economy.

AI answer engines do not return ten links. They return one paragraph and four citations. Commerce, quietly, has become a citation contest. The brand whose first-party language is most citation-shaped wins.

BetterReviews Editorial · Studio note

CONTENTS · 08

01 The Princeton paper, briefly, before we move on
02 What is citation-shaped
03 The mechanical reason your product page is not in the answer
04 The corpus you already own
05 What the moat used to be, and what it is now
06 Three corrections to common AI-search advice
07 The work, framed plainly
08 The closing turn

For two decades or so, the open web has run on a single shape of attention economy, and the shape was a ranking. The query went into the box, ten blue links came out of the box, the link at position one took roughly a third of the available clicks, position two took roughly half of that, position three half of that again, and so on down a curve that anyone who has done SEO can sketch from memory. The whole apparatus, the link-building, the keyword research, the technical audits, the content calendars, the agency relationships, sat on top of that one shape. Move up the list. That was the job.

This essay is about what happens when the list goes away.

The thing replacing it is a citation economy, and we mean the word literally. What comes back from a modern answer engine isn't a list at all. It's a paragraph (we've sketched the mechanics of what those engines actually read for in a separate essay on the new reading layer) followed by three or four footnoted sources beneath the paragraph. The user, mostly, reads the paragraph. They might glance at the footnotes; they almost never click one. The merchant whose URL sits in the footnote is acknowledged the way a journal article acknowledges a citation. The merchant whose URL doesn't, isn't.

We'd like to say the numbers under this are subtle. They aren't. Pew Research's 2025 browser study clocked the click-through on traditional results, when an AI summary sits on top of them, at around 8 percent (against 15 percent when the summary isn't present). Adobe Analytics, looking only at commerce, saw retail traffic from generative AI sources climb roughly 1,200 percent in twelve months. OpenAI's own most recent disclosure puts ChatGPT at around 900 million weekly active users, more than twice last year's figure. Perplexity is past a billion queries a month. AI Overviews now sit above some measurable fraction of commercial-intent searches on Google, and the fraction grows quarter on quarter. Mechanically, the buyer in your category is no longer being routed to your site to form their own opinion. The buyer is being handed an answer, with footnotes.

A ranking economy rewards the page that ranks. A citation economy rewards the sentence the answer engine is willing to quote. These are different problems, and almost no brand has noticed.

If you accept the premise that this shift has happened, the strategic question for any commerce brand becomes uncomfortably simple. What does my first-party data for AI search actually look like, and is any of it the right shape to be cited?

For almost every brand on the open web, the honest answer is no.

The Princeton paper, briefly, before we move on

[Figure, two panels. Panel one: "Ranking economy · click share by Google result position", click share (0% to 30%) against SERP position (1 to 10). Panel two: "Citation economy · footnotes under an AI answer", four sources marked "Cited". Panel note: "The paragraph is written. Four sources survive."]

The ranking economy paid a long curve of clicks. The citation economy pays four sources or none. Source: Standard SERP CTR decay · Pew Research, 2025

There is a paper from 2024 that anyone working on commerce in 2026 should have read at least once. The title is "GEO: Generative Engine Optimization". The authors are at Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi. It appeared at KDD 2024. It is the cleanest empirical study of what AI answer engines actually choose to cite when they generate a paragraph in response to a user query.

The paper's headline finding, on a benchmark of ten thousand queries against multiple generative engines, is that sources whose content includes citations, quotations, and statistics gain up to roughly 40% in visibility within generative answers over otherwise identical sources without them. The effect is not uniform. Authoritative language matters more for historical and factual content. Statistics matter more for law, government, and health. Citation density matters most when the question is contested or comparative, which describes most product queries.

The paper does several other useful things. It distinguishes between visibility and impression-weighted relevance. It tests the effect of different content modifications independently. It defines a benchmark, GEO-bench, that other researchers have since extended. What matters for our purposes is the underlying mechanism it describes. The engine is reading for shape. The shape is sourceable, specific, first-person, dated, statistically grounded language. Marketing copy is the wrong shape. Customer testimony is the right shape.

If you stop reading this essay and read the paper instead, that is a fair use of your time. We will be here when you come back.

What is citation-shaped

The single most useful operational concept to emerge from the last two years of commerce work is the idea that some sentences are citation-shaped and others are not. A citation-shaped sentence has certain visible properties. They can be summarised in four words.

First-person. Dated. Signed. Specific.

The first-person voice tells the engine there is a witness behind the claim. The date tells the engine when the claim was made and that it has not been silently re-dated to look fresh. The signature tells the engine the witness is identifiable, that they exist outside this single sentence. The specificity tells the engine the claim is bounded and falsifiable: a particular product, a particular condition, a particular time of use, a particular outcome. Each property is examined in turn in a companion piece.

Almost no brand-written sentence has any of these properties. Almost every verified-buyer review has all four.
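The four properties can be made concrete with a small check. What follows is a minimal sketch, not a description of how any answer engine actually scores text: the `Passage` type, the `citation_shape` function, and both regular expressions are our own hypothetical stand-ins, and the sample review is invented for illustration.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional

# Hypothetical heuristics, chosen only to illustrate the four properties.
FIRST_PERSON = re.compile(r"\b(I|my|me|we|our)\b", re.IGNORECASE)
SPECIFIC = re.compile(r"\d")  # any number: a dose, a size, a duration


@dataclass
class Passage:
    text: str
    author: Optional[str] = None      # a named, identifiable witness
    published: Optional[date] = None  # the original publication date


def citation_shape(p: Passage) -> Dict[str, bool]:
    """Report which of the four properties a passage visibly has."""
    return {
        "first_person": bool(FIRST_PERSON.search(p.text)),
        "dated": p.published is not None,
        "signed": bool(p.author and p.author.strip()),
        "specific": bool(SPECIFIC.search(p.text)),
    }


# Invented examples: one verified-buyer style review, one line of brand copy.
review = Passage(
    text="I used the serum twice a day for 6 weeks and my redness faded.",
    author="Dana K.",
    published=date(2026, 3, 14),
)
brand_copy = Passage(text="Radiance, reimagined for every skin type.")

print(citation_shape(review))
print(citation_shape(brand_copy))
```

Under these crude rules the review passes all four checks and the brand line passes none, which is the asymmetry in miniature. A real audit would need better tests for specificity than "contains a digit".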

This is the asymmetry the citation economy turns on. The brand has, on its own site, two bodies of text. The first is everything the brand wrote: homepage copy, product descriptions, blog posts, FAQ pages, about pages. The second is everything the brand's customers wrote: reviews, support replies, questions, post-purchase notes. The first body of text was paid for, produced to schedule, edited to brand. The second body of text was given for free, written in moments of irritation or enthusiasm, full of typos and unintended specifics.

The engines prefer the second body of text. They prefer it consistently. They prefer it whether or not the brand is paying attention.

A 2025 audit of 23,000 AI citations by Omniscient Digital looked specifically at queries that ask what people think of a brand. On those queries, the answer engines cite earned media (third-party reviews, forums, magazines) about 82% of the time. The brand's own pages get the remaining 18% between them. Inside that 18%, the cited pages are almost never the homepage and almost never the product page. They are review pages, where they exist as first-class content, and they are dated FAQ entries, where the brand has signed and dated an answer to a real question.

A separate study by Writesonic of 2.4 million domains found that the text actually pulled into generative answers has roughly three times the entity density of normal English. Three times. The cited sentences are dense with specifics: brand names, conditions, comparisons, dosages, dimensions, dates. Marketing prose has trained itself for fifteen years to be the opposite of this. Marketing prose has trained itself to be smooth, atmospheric, brand-safe. Each of those qualities is, in the citation economy, a downgrade.

The mechanical reason your product page is not in the answer

It is worth saying this in plain terms, because most brand marketers have not yet sat with it.

When ChatGPT or Perplexity or Google AI Overviews or Claude with web search produces an answer to a product question, the system does roughly four things. It retrieves a set of candidate documents. It identifies sentences within those documents that match the query. It selects the sentences whose attribution is cleanest and whose specificity is highest. Then it composes the paragraph and lists the sources.
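The steps above can be sketched as a toy pipeline. This is an illustration under loud assumptions: real engines do not publish their selection logic, and the keyword matching, the scoring weights, the `Sentence` type, and the example URLs here are all hypothetical. Retrieval is taken as already done, and composition is reduced to listing the surviving sources.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sentence:
    text: str
    url: str
    author: Optional[str] = None
    date: Optional[str] = None  # ISO date string, if the page exposes one


def matches(query: str, s: Sentence) -> bool:
    """Matching stand-in: crude keyword overlap between query and sentence."""
    keywords = {w for w in query.lower().split() if len(w) > 3}
    return any(w in s.text.lower() for w in keywords)


def score(s: Sentence) -> int:
    """Selection stand-in: clean attribution plus a rough specificity count."""
    attribution = int(s.author is not None) + int(s.date is not None)
    specificity = sum(ch.isdigit() for ch in s.text)
    return 2 * attribution + min(specificity, 3)  # weights are arbitrary


def answer_sources(query: str, candidates: List[Sentence], k: int = 4) -> List[str]:
    """Return the URLs of the k best-scoring sentences that match the query."""
    matched = [s for s in candidates if matches(query, s)]
    ranked = sorted(matched, key=score, reverse=True)
    return [s.url for s in ranked[:k]]


# Invented candidates: an unsigned product page and a signed, dated forum post.
candidates = [
    Sentence("Our moisturizer delivers lasting hydration for everyone.",
             "https://www.example.com/products/moisturizer"),
    Sentence("I have used this moisturizer for 3 months on dry skin.",
             "https://forum.example.org/t/123",
             author="user_dana", date="2026-01-09"),
]
print(answer_sources("does this moisturizer help dry skin", candidates, k=1))
```

Both sentences match the query, but only one carries an author, a date, and a number, so with a single slot available the forum post takes it.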

Your product page is rarely a candidate for one straightforward reason: the engine cannot find a sentence on it that is first-person, dated, signed, and specific. The product page is written in brand voice, undated, signed by nobody, and deliberately vague enough to apply to as many buyers as possible. It is, by design, the opposite of citation-shaped.

Your homepage is rarely a candidate for an even more straightforward reason: it does not contain any specific product claim at all. It contains a value proposition, a hero image, and some logos. There is nothing on it the engine could cite without lying.

Your review widget is rarely a candidate for a third reason: the AI crawlers (GPTBot, PerplexityBot, ClaudeBot, Google-Extended) often do not render the JavaScript that loads the reviews. The reviews are on your page, technically, but they are not in the crawler's view. They are paginated behind buttons the crawler does not click. They are rendered after a fetch the crawler does not wait for. They might as well not exist.
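One way to test this on your own pages is to look at the HTML as it arrives before any script runs, and ask whether the review text is in it. The sketch below uses only Python's standard `html.parser`; the function name `reviews_in_static_html` and the two sample pages are hypothetical, and it takes the raw HTML as a string so it runs offline. In practice you would pass in the body of a plain HTTP fetch made without a browser.

```python
from html.parser import HTMLParser
from typing import Dict, List


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def reviews_in_static_html(raw_html: str, snippets: List[str]) -> Dict[str, bool]:
    """For each review snippet, report whether it appears in the unrendered HTML."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    # Normalise whitespace and case so line breaks in the markup do not matter.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return {s: s.lower() in text for s in snippets}


# Invented pages: a script-loaded widget versus reviews present in the markup.
widget_page = '<div id="reviews"></div><script>loadReviews("sku-1")</script>'
rendered_page = '<div id="reviews"><p>Faded my redness in 6 weeks.</p></div>'

print(reviews_in_static_html(widget_page, ["faded my redness"]))
print(reviews_in_static_html(rendered_page, ["faded my redness"]))
```

If a snippet you can see in your browser comes back `False` here, a crawler that does not execute JavaScript is unlikely to see it either.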

So the engine reaches past your site, finds the same product mentioned on Reddit by a user with a posting history and a date, and quotes that instead. The Reddit user becomes your citation. The brand becomes the entity being talked about, not the entity doing the talking. A close reading of exactly which sentences ChatGPT reaches for, and why, sits in this earlier essay.

This is the citation economy as experienced by a Shopify founder in 2026, whether they have named it yet or not.

The corpus you already own

The thing that is supposed to be encouraging arrives now.

The body of text that an answer engine wants to cite already exists, in almost every direct-to-consumer brand on the open web, in volumes that would take years to recreate. It is the review corpus.

A skincare brand doing two million in annual revenue has, over the course of a single year, somewhere in the region of fifteen to twenty thousand sentences written by its own verified buyers, each one first-person, dated, signed, and specific to a product, a condition, a use-case, and a moment. A homewares brand has descriptions of how a lamp looks on a specific north-facing kitchen counter. A supplements brand has the only honest answers, written by the only people who tried, to the question of whether the thing works.

Each of those sentences was given for free. Each of those sentences is technically owned by the brand. Each of those sentences is, in the eyes of an answer engine, dramatically more valuable than any sentence the brand's marketing team has ever written.

And almost all of those sentences are, right now, invisible to the engines. They sit in a JavaScript-loaded widget. They are paginated. They are surfaced as a star count. They are tagged with sentiment labels and never read again. The asset is there. The asset is not deployed.

This is the central operational mismatch of commerce in 2026. The brand has the citation-shaped sentences. The brand does not present them in a form the citation economy can use. Fixing this is not a content marketing project. It is a publishing project. The brand stops treating its review corpus as decoration and starts treating it as the publication. The widget under the buy button is replaced by, or supplemented with, indexed, rendered, dated, signed pages that carry the verified-buyer language at first-class status.

When that happens, the same engines that currently reach past your site find, on your site, the language they were going to use anyway. They cite it. The citation puts your URL beneath the answer. The reader (not many, but enough) follows the citation. The buyer arrives at your site already believing the answer, because the answer was sourced from your customers.

The math of attribution here is unfamiliar to most marketers, because it is not the math of clicks. It is the math of mention. The citation is the conversion event. The click is an optional, decreasingly important second step. We will, within a year, be measuring brand health primarily by cited mentions per category-relevant query, not by organic traffic. The infrastructure for that measurement is already being built; Profound, Goodie, and several others ship dashboards for it now. The brands paying attention are wiring it up before the rest of the category notices.
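The metric itself is simple arithmetic. A minimal sketch, assuming you already have, for each tracked query, the list of URLs an answer engine cited: count the citations that point at your domain and divide by the number of queries. The function names, the input shape, and the sample data are hypothetical, and this is not how any particular dashboard computes its figures.

```python
from typing import Dict, List
from urllib.parse import urlparse


def is_brand_url(url: str, brand_domain: str) -> bool:
    """True when the URL's host is the brand domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == brand_domain or host.endswith("." + brand_domain)


def cited_mentions_per_query(answers: Dict[str, List[str]], brand_domain: str) -> float:
    """Total brand citations across all answers, divided by queries tracked."""
    if not answers:
        return 0.0
    mentions = sum(
        is_brand_url(url, brand_domain)
        for cited in answers.values()
        for url in cited
    )
    return mentions / len(answers)


# Invented observations: query -> URLs cited beneath the generated answer.
answers = {
    "best serum for redness": [
        "https://www.example.com/reviews/serum",
        "https://forum.example.org/t/123",
    ],
    "is example serum worth it": ["https://magazine.example.net/serum-test"],
    "example serum side effects": [
        "https://example.com/faq/side-effects",
        "https://example.com/reviews/serum",
    ],
    "serum for sensitive skin": [],
}
print(cited_mentions_per_query(answers, "example.com"))  # 3 citations / 4 queries = 0.75
```

Tracked month over month against a fixed query set, the number moves the way organic traffic used to.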

What the moat used to be, and what it is now

For the better part of two decades, the operational moat in commerce content was the SEO budget. A brand spent money on link-building, content production, keyword research, technical SEO audits, and the team to coordinate all of it. The bigger the spend, the bigger the moat. Yotpo, Trustpilot, Bazaarvoice all built blogs of impressive scale on this model. Their domain authority became a structural asset. They ranked on top of every long-tail query in the category and could not be dislodged.

In a citation economy, that moat is, if not dead, dramatically less valuable than it was in 2023. Domain authority is a ranking signal. The ranking is the part that is being replaced. Backlink graphs are a 2010-era proxy for credibility. The 2026 engines are reading the sentences directly. A site with low domain authority and a thousand first-person, dated, signed, specific paragraphs from real customers will, increasingly, be cited above a site with high domain authority and a thousand marketing-voice pillar pages refreshed every quarter.

The new moat is the corpus. Specifically, it is the corpus of sentences your customers wrote that are shaped like primary sources, presented on indexed pages, preserved with their original timestamp, and signed by their original author. This corpus cannot be bought. It cannot be matched by an agency. It cannot be generated by an LLM, because the moment an answer engine detects synthetic verified-buyer language at scale, the engine will downgrade the source, possibly permanently. The corpus must be earned, customer by customer, year after year.

This is, in plain terms, the only durable competitive position left in commerce content. The brand whose corpus is largest, oldest, most diverse, and most carefully published wins more citations than the brand whose corpus is smaller. The brand whose corpus is older has more dated entries, which the engines weight as continuity. The brand whose corpus is more diverse covers more long-tail use-cases, which is where most answer-engine queries actually live.

It also follows, uncomfortably, that any brand without a serious review acquisition program in 2026 is falling further behind every week. The corpus compounds. The lead is durable. There is no shortcut.

Three corrections to common AI-search advice

A lot of consultancy advice in 2025 and 2026 has converged on a few wrong-shape recommendations. It is worth correcting three of them directly.

The first is the suggestion that brands should add more schema markup. Article, FAQ, Product, HowTo, Organization, all stacked on top of each other on every page. This is a 2014 reflex applied to a 2026 problem. The engines that cite first-party data for AI search do read schema, but they read it as a confirming signal beneath the sentence, not as a substitute for the sentence. A page with rich schema and no first-person, dated, signed content will still be ignored. A page with the right kind of content and modest, accurate schema will be cited. Adding schema without changing the underlying text is theatre.
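For a sense of what "modest, accurate schema" might mean in practice, here is a sketch that emits schema.org `Review` markup for a single verified-buyer review. The property names (`itemReviewed`, `author`, `datePublished`, `reviewBody`, `reviewRating`) are standard schema.org vocabulary; the helper `review_jsonld`, the five-point scale, and the sample values are our assumptions. The point is that the markup repeats what is already visible in the sentence on the page and adds nothing to it.

```python
import json


def review_jsonld(product: str, author: str, date_published: str,
                  body: str, rating: int) -> str:
    """Build JSON-LD for one review, mirroring the text shown on the page."""
    data = {
        "@context": "https://schema.org",
        "@type": "Review",
        "itemReviewed": {"@type": "Product", "name": product},
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,  # the original date, never re-dated
        "reviewBody": body,               # the customer's own words, unedited
        "reviewRating": {"@type": "Rating", "ratingValue": rating, "bestRating": 5},
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


# Invented sample values for illustration.
markup = review_jsonld(
    product="Calming Serum",
    author="Dana K.",
    date_published="2026-03-14",
    body="I used the serum twice a day for 6 weeks and my redness faded.",
    rating=5,
)
print(markup)
```

The output would sit in a `<script type="application/ld+json">` element on the same page that displays the review text in plain HTML.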

The second is the suggestion that brands should write more long-form content optimised for AI: 3,000-word pillar pages, long-form guides, FAQ-stuffed product pages. This is also a 2014 reflex. The engines are not citing the longest page. They are citing the sentence with the best provenance. A 2,000-word guide written in brand voice will be ignored in favour of a 200-word verified-buyer paragraph with a date and a signature. Length is not the move. Provenance is.

The third is the suggestion that brands should generate AI-written customer testimonials. This is the worst piece of advice in the category. The 2024 FTC rule on fake reviews is the legal floor. The detection technology, particularly inside the engines themselves, has improved fast in 2025 and 2026. The most likely outcome of synthetic testimonials at scale is not invisibility; it is an actively negative signal applied to the brand's domain. The brands that have already been caught are now spending more on cleanup than they ever spent on acquisition. The shortcut is not just dishonest. It is materially bad business.

The work, framed plainly

What does it look like, in practice, for a Shopify founder in 2026 to participate in the citation economy?

It looks like asking, every month, whether the review corpus is growing in volume, in diversity, and in quality. Verified-buyer reviews are the asset; everything else is a metric. If review volume is flat or declining, the corpus is not compounding and the citation position will erode.

It looks like making sure the reviews are published on pages a crawler can render. If the only place a review appears is inside a JavaScript widget loaded asynchronously, the engine cannot cite it. The fix is technical and small: pre-render the review pages, or move the reviews into the page HTML at request time, or generate a sitemap of indexed review pages with proper canonical tags. Most review platforms in 2026 will offer this. The ones that do not are now actively expensive to be on.
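The second of those fixes, moving the reviews into the page HTML at request time, can be sketched in a few lines. This is an illustration only: the `render_reviews_html` function, the dictionary keys, and the markup structure are hypothetical, and a real storefront would do this inside its own templating layer. What matters is that the body, the author, and the original date all arrive as plain HTML.

```python
from html import escape
from typing import Dict, List


def render_reviews_html(reviews: List[Dict[str, str]]) -> str:
    """Render reviews as static HTML with the author and date kept visible."""
    parts = ['<section id="reviews">']
    for r in reviews:
        body = escape(r["body"])      # escape customer text, do not rewrite it
        author = escape(r["author"])
        day = escape(r["date"])       # ISO date, e.g. 2026-03-14
        parts.append(
            f'<article><p>{body}</p>'
            f'<footer>{author}, <time datetime="{day}">{day}</time></footer></article>'
        )
    parts.append("</section>")
    return "\n".join(parts)


# Invented sample review.
sample = [{
    "body": "Love it <3, faded my redness in 6 weeks.",
    "author": "Dana K.",
    "date": "2026-03-14",
}]
page_fragment = render_reviews_html(sample)
print(page_fragment)
```

The customer's wording passes through untouched apart from HTML escaping, which keeps the first-person sentence intact while making it safe to embed.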

It looks like signing the brand's replies. When the brand replies to a verified-buyer review in public, the reply should be signed by a real person at the brand, not by "Customer Support". The reply should carry a date. The reply should be in the brand's voice, but in the brand's actual voice, written by someone willing to put a name to it. These signed, dated replies are, increasingly, the brand's own first-person artefact on the open web.

It looks like resisting the urge to clean up the customer's language. The first person is the asset. Light copy-edits for typos are fine. Paraphrases into marketing voice destroy the citation property of the sentence. The whole point is that the customer said it, not the brand.

It looks like dating everything. Every essay, every FAQ entry, every product page edit, every reply. Dates are cheap to produce and they are the single most underused signal in commerce content. A page that has been visibly maintained at the same URL for three years, with original publication dates preserved, accumulates citation weight in a way that a refreshed-evergreen page does not.

It looks like writing fewer marketing words and publishing more customer words. This is the inversion of a fifteen-year pattern, and it will feel wrong to most marketing teams. The discipline is to do it anyway.

The closing turn

The citation economy is not a forecast. It is the current state of how commerce attention works, today, for buyers using ChatGPT, Perplexity, Google AI Overviews, and Claude. The shift has happened. The brands that will lead the next decade have already started behaving as though it has.

The asset that wins this economy was sitting on their site the whole time, in the form of language their own customers wrote, for free, while they were doing something else. The work is to stop ignoring it.

The work is also, partly, to stop writing as though the old economy still rewards the things it used to reward. The page that ranks does not always get the buyer. The sentence that gets cited does.

Build the corpus. Publish it carefully. Date everything. Sign the replies. Let the customer speak in the first person. The engine, six months from now, will quote you to the next buyer, and the buyer will arrive convinced.

That is what we are quietly building.

You will see it in the summer.

If any of this reads like something your store could use, write to us.

We will write back.

Corrections

corrections@better-reviews.com

Mistakes are listed at the foot of the page when found.