The raw HTML your review widget never delivers.
GPTBot does not run JavaScript. ClaudeBot does not run JavaScript. PerplexityBot does not run JavaScript. The reviews on the merchant’s product page are rendered by JavaScript. Do the substitution.
CONTENTS · 08
- 01 What the crawler asked for, and what it got
- 02 What a Yotpo product page looks like to GPTBot
- 03 What Googlebot sees, and why that does not save the merchant
- 04 The data attribute is not the review
- 05 A small thought experiment: the brand whose reviews are in the HTML
- 06 The retrofit, and why it does not save the widget
- 07 What the merchant should check this afternoon
- 08 The closing turn
Open a terminal. Type a curl command, pass GPTBot as the user agent, point it at any Shopify product page with a Yotpo widget on it, and press return. What comes back is a wall of HTML. It contains the product title, the price, the description the merchant typed into the Shopify admin, a row of variant pickers, and a buy button. It does not contain the reviews.
The reviews are on the same page. The buyer sees them when she scrolls past the buy button. There are four hundred and eighty-two of them. They have an average rating of 4.7. The merchant pays a hundred and ninety-nine dollars a month for the review platform that collected them. The reviews exist. They are paid for. They are visible to the human in the browser.
They are not visible to the crawler that just asked for the page.
This essay explains why, what it means, and what changes when the merchant understands the shape of the gap.
What the crawler asked for, and what it got
A crawler is an HTTP client. It opens a TCP connection to a web server, sends a GET request, and receives whatever the server chooses to return. It does not, by default, do anything else.
OpenAI's GPTBot, as documented in OpenAI's published crawler operator page (updated through 2025 and 2026), fetches HTML. It does not execute JavaScript. Anthropic's ClaudeBot, per Anthropic's own crawler documentation, fetches HTML and does not execute JavaScript. Perplexity's PerplexityBot fetches HTML and does not execute JavaScript. There is a separate Perplexity user agent, Perplexity-User, that fires on a user's live query and behaves more like a browser, but PerplexityBot, the indexing crawler, is not that. Google-Extended is not a crawler at all: it is a robots.txt token that controls whether fetched content may be used for Gemini and Vertex training. The actual fetcher is Googlebot, which runs headless Chrome and does execute JavaScript, with caveats we will return to.
The default behaviour of the AI crawlers responsible for ChatGPT, Claude, and Perplexity's index, in 2026, is: fetch the HTML, parse the HTML, walk the DOM, extract the text, store the text in a vector index, move on.
If the text is not in the HTML the server returned, the text is not in the index.
This is the entire story. Everything else here is the consequence.
What a Yotpo product page looks like to GPTBot
Take a Shopify product page with a Yotpo widget mounted. The merchant added the widget by pasting a small block of HTML and JavaScript into the theme. The HTML the server returns, where the reviews are supposed to be, is roughly six lines long.
It begins with an opening div tag. The div carries a class of "yotpo yotpo-main-widget". It carries a data attribute called data-product-id, set to the Yotpo internal identifier for the product. It carries a data-name attribute set to the product's name. It carries a data-url set to the canonical URL of the product page. It carries a data-image-url, a data-price, and perhaps a data-currency. The div closes immediately. There is nothing inside it. After the div, a single script tag points at a JavaScript file hosted on Yotpo's static CDN, marked async.
That is the entire review section. A div, five data attributes, and a script tag. The div is empty: no reviews inside it, no aggregated rating in plain text, no count, no quoted reviewer sentence. Only the shell.
The shell is meaningful only if the script runs. The script's job is to fetch the reviews from Yotpo's API, render them as HTML inside the div, and update the page in the browser after the initial HTML has loaded. For a buyer with a working browser, that is fine: the script runs, the reviews appear, the buyer reads them.
For GPTBot, the script does not run. The div remains empty. The page, as far as the crawler is concerned, has a product title, a description, a price, and a quiet placeholder where four hundred and eighty-two reviews should have been.
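The effect can be reproduced without a network connection. The sketch below feeds a fragment shaped like the shell described above to a parser that, like a non-rendering crawler, collects text nodes and never runs scripts. The price, the URLs, and the script location are placeholders invented for illustration; this is not Yotpo's actual markup, and real crawlers will use their own parsers.

```python
from html.parser import HTMLParser

# Placeholder page: product copy plus an empty widget shell.
# Every value and URL here is invented for illustration.
PAGE = """
<h1>The Daily Serum</h1>
<p class="price">$48.00</p>
<div class="yotpo yotpo-main-widget"
     data-product-id="7843921"
     data-name="The Daily Serum"
     data-url="https://shop.example.com/products/daily-serum"
     data-image-url="https://shop.example.com/img/daily-serum.jpg"
     data-price="48.00"></div>
<script async src="https://cdn.example.com/widget.js"></script>
"""


class VisibleText(HTMLParser):
    """Collects text nodes, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(markup):
    """Return the text a parser sees when no JavaScript is executed."""
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return parser.chunks


print(visible_text(PAGE))
```

The list that comes back holds the product name and the price. The widget div contributes nothing, because it has no text inside it and the script that would fill it is only referenced, never executed.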
This is not a Yotpo problem in particular. The same shape applies, with only superficial differences, to Okendo (a different attribute name, a different CDN, the same empty shell), to Junip (the same), to Loox (the same, plus an iframe for the photo carousel), to Stamped (the same), to Reviews.io (the same), to Judge.me (slightly better, since Judge.me ships a small server-side block of recent reviews into the initial HTML, but the bulk of the corpus is still injected by JavaScript). The pattern is the category, not any single vendor.
What Googlebot sees, and why that does not save the merchant
A reasonable objection: Googlebot runs JavaScript. Google has been rendering JavaScript reliably since 2015. If the merchant cares about being cited in Google's AI Overviews, the JavaScript widget is fine, because Googlebot will execute the script, the reviews will materialise in the DOM, and the rendered HTML will be indexed.
The objection has a kernel of truth and a larger envelope of mess.
Googlebot's renderer is real. It is also, as Google's own documentation and John Mueller's repeated public statements have noted through 2024 and 2025, on a delay. The crawler fetches the initial HTML, queues the page for rendering, and renders it later. The render queue can be hours, days, or in some cases weeks behind. For a product page that updates with a new review every twelve hours, the indexed version is, structurally, behind. The aggregate count Google shows in the search snippet is the count from the last render, not the current one.
That is the freshness problem. The cost problem is larger.
Google's published guidance (most recently in the 2024 Search Central documentation on JavaScript SEO) is explicit: server-side rendering is preferred. Client-side rendering is supported, with caveats. The caveats include: rendering may be deferred, rendering may fail silently if the script errors, rendering quotas may be hit on high-volume sites, and rendering may not pick up content lazy-loaded after a scroll event the headless browser does not perform. A merchant who ships their reviews via a JavaScript widget is, in effect, asking Google to do work for them. Google sometimes does the work, and sometimes it does not.
And that is Google. The crawlers for ChatGPT, Claude, and Perplexity do not have the same renderer, and have, in most cases, no renderer at all. The JavaScript widget is a bet on a renderer that does not exist for most of the AI surfaces the merchant now wants to be cited on.
We have written about the broader shape of this surface in "the engine the answer engine reads". What this essay adds is the specific mechanical fact: the engine reads HTML. The reviews are not in the HTML. The engine cannot read what is not there.
The data attribute is not the review
A useful exercise: stare at the data attributes on the empty Yotpo div and ask what they tell a crawler.
The data-product-id attribute tells the crawler that there is a product with an internal Yotpo identifier of, say, 7843921. The crawler does not have a Yotpo API key, so it cannot do anything with the identifier.
The data-name attribute tells the crawler the product is called The Daily Serum. The crawler already knows this; the product name is in the page's title tag and h1.
The data-url tells the crawler the canonical URL of the product page. The crawler is on that page; this is a circular signal.
The data-image-url and data-price are duplications of what the merchant has already declared elsewhere in product schema.
None of these attributes contain a review. None of them contain a paragraph of customer language. None of them contain a date, an author, or a star rating attached to a specific sentence. They are, in informational terms, metadata about reviews that exist somewhere else, owned by a different company, behind an API the crawler cannot read.
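The same exercise can be done mechanically. The sketch below lists every data attribute on a shell shaped like the one described earlier, which is all a parser can recover from it. The attribute values are placeholders, apart from the product name and identifier used as examples above.

```python
from html.parser import HTMLParser

# Placeholder shell; URLs and price are invented for illustration.
SHELL = (
    '<div class="yotpo yotpo-main-widget" '
    'data-product-id="7843921" '
    'data-name="The Daily Serum" '
    'data-url="https://shop.example.com/products/daily-serum" '
    'data-image-url="https://shop.example.com/img/daily-serum.jpg" '
    'data-price="48.00"></div>'
)


class DataAttributes(HTMLParser):
    """Records every data-* attribute found on any start tag."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name.startswith("data-"):
                self.found[name] = value


def data_attributes(markup):
    """Return a dict of data-* attribute names to their values."""
    parser = DataAttributes()
    parser.feed(markup)
    parser.close()
    return parser.found


for name, value in data_attributes(SHELL).items():
    print(f"{name} = {value}")
```

Five name and value pairs come out: an identifier, a name, two URLs, and a price. None of them is a sentence a customer wrote.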
The merchant looking at this from the dashboard sees the four hundred and eighty-two reviews. The merchant looking at this from the crawler's user agent sees five strings on an empty div.
This is the shape of the gap.
A small thought experiment: the brand whose reviews are in the HTML
Suppose, for a moment, that a different store on the same Shopify theme has done a different thing. Their reviews are not loaded by a JavaScript widget. They are written into the product page's HTML at render time. The reviews are paragraphs, dated and signed, sitting in the page's main flow, under the buy button, in plain prose.
When GPTBot fetches that page, it receives the reviews. When ClaudeBot fetches that page, it receives the reviews. When PerplexityBot fetches that page, it receives the reviews. When the next answer engine launches a year from now, it will receive the reviews too, because the reviews are in the HTML, and the HTML is the input layer to almost every system that consumes the open web.
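What "written into the HTML at render time" means can be sketched in a few lines. The example below is an illustration in Python, not a Shopify implementation (Shopify themes are written in Liquid), and the `Review` fields, the `render_reviews` helper, and the sample review are all hypothetical. The point is only that the server turns stored reviews into escaped, plain markup before the response is sent.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Review:
    author: str
    date: str    # ISO date, e.g. "2026-01-14"
    rating: int  # 1 to 5
    body: str


def render_reviews(reviews):
    """Return the reviews as plain HTML, ready to embed in the page body."""
    parts = ['<section id="reviews">', f"<h2>Reviews ({len(reviews)})</h2>"]
    for r in reviews:
        parts.append(
            "<article>"
            f"<p>{escape(r.body)}</p>"
            f'<footer>{escape(r.author)}, <time datetime="{escape(r.date)}">'
            f"{escape(r.date)}</time>, {r.rating} out of 5</footer>"
            "</article>"
        )
    parts.append("</section>")
    return "\n".join(parts)


# Hypothetical sample data.
sample = [Review("Maya R.", "2026-01-14", 5, "Gentle & effective")]
print(render_reviews(sample))
```

Because the output is ordinary markup in the document body, any client that can parse HTML receives the review text, the author, and the date without running anything.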
The store with the JavaScript widget pays a hundred and ninety-nine dollars a month to collect reviews that are invisible to those crawlers. The store with the HTML pays whatever it costs to publish the reviews as HTML. The difference, between those two stores, in the long-run citation share they accrue from answer engines, is not a small percentage. It is the difference between being readable and being a blank div.
We argued in "reviews are language not inventory" that the right way to think about a store's review corpus is not as a feature to be displayed, but as a body of language to be published. The mechanical reason that argument matters is what we have just walked through. The language is what the crawler reads. The widget is the container that prevents the crawler from reading it.
The retrofit, and why it does not save the widget
A reasonable response from a review platform: ship a server-side rendering option. Pre-render a few reviews into the initial HTML. Update them every hour. Solve the crawler problem.
Some platforms have shipped this. Judge.me ships a small server-rendered block of the three most recent reviews in the page's initial HTML, with a "load more" button that triggers the JavaScript widget. Bazaarvoice's enterprise customers can opt into a server-side rendering tier. Yotpo has, in pieces of its premium offering, a server-side snippet. The retrofit exists.
The retrofit is partial. The block in the initial HTML, in every implementation we have looked at in 2026, is small: three to six reviews, perhaps the aggregate rating, sometimes a count. Even at six, the remaining four hundred and seventy-six reviews are still in the JavaScript widget. The crawler sees the snippet, not the corpus.
The merchant collecting four hundred and eighty-two reviews, in the retrofit case, is publishing at most six of them to the answer engines. The other four hundred and seventy-six or more are still invisible. This is better than zero. It is not the thing.
The thing, as we set out in "the end of the review widget", is for the page itself to contain the reviews. Not a snippet of them. Not a placeholder for them. Them. In the HTML. As the page.
Every review the brand collects is invisible to the citation layer until the brand publishes it in server-rendered HTML. That sentence is the whole essay. The rest is the explanation of why the sentence is true.
What the merchant should check this afternoon
A short, concrete checklist. The merchant who reads this essay can do it in twenty minutes.
Open a terminal. Run a curl command with the user agent set to GPTBot, fetch the merchant's own bestselling product page, and pipe the output to a file. Open the file in a text editor. Search for the most distinctive phrase from the merchant's top-rated review. If the phrase is in the file, the reviews are server-rendered, and the merchant is in a small minority. If the phrase is not in the file, the reviews are JavaScript-rendered, and the AI crawlers are not reading them.
Do the same with ClaudeBot as the user agent. Do the same with PerplexityBot. Do the same with Googlebot for comparison, remembering that curl does not render JavaScript for any user agent, so this shows only what Googlebot receives before its render step.
If the page returned for any of those user agents contains the reviewer's distinctive phrase, the reviews are readable to that crawler. If none of the returned pages contains it, the reviews are, for the purposes of the citation economy, not on the internet.
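For a merchant who would rather run a script than four curl commands, the checklist can be sketched in Python using only the standard library. Several things here are assumptions: the short user agent tokens stand in for the longer strings real crawlers send, the function names are invented for this example, and setting a header only imitates a crawler, so a server that treats genuine crawler traffic differently may respond differently to this script.

```python
import urllib.request
from html import unescape

# Short tokens, as in the checklist. Real crawlers send longer
# User-Agent strings; these are a simplification.
USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot"]


def fetch_html(url, user_agent, timeout=20):
    """Fetch the raw HTML with the given User-Agent. No JavaScript is run."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def _normalise(text):
    # Decode entities such as &amp;, collapse whitespace, ignore case.
    return " ".join(unescape(text).split()).lower()


def phrase_visible(raw_html, phrase):
    """True if the phrase appears anywhere in the raw HTML."""
    return _normalise(phrase) in _normalise(raw_html)


def check_page(url, phrase, fetch=fetch_html):
    """Map each user agent to whether its raw HTML contained the phrase."""
    return {ua: phrase_visible(fetch(url, ua), phrase) for ua in USER_AGENTS}
```

Calling `check_page` with the store's own product URL and a distinctive phrase from a top-rated review returns one true or false per user agent. The `fetch` parameter exists so the logic can be exercised without a network connection. Because nothing here renders JavaScript, the Googlebot entry shows only the pre-render HTML, not what Google eventually indexes.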
This is not a strategic question; it is a measurement question, and the merchant can answer it in a tab.
The closing turn
The merchant has been told, for fifteen years, that the review widget is the way the merchant publishes their customers' writing. The widget collects the writing. The widget renders the writing. The widget displays the writing under the buy button.
The widget does not publish the writing.
Publishing is a server-side act. Publishing is rendering plain HTML at request time, with the content in the body of the document, on the merchant's own domain, addressable, indexable, readable. The widget is a UI affordance that asks for the writing in the browser, after the page has loaded, from a third party, in JavaScript. That is not publishing. That is decoration.
A store whose reviews are language has, in 2026, a different job than a store whose reviews are a widget. The job is to put the language on the page. The page is the corpus. The corpus is what the crawler reads. The crawler is what writes the answer the buyer never sees the source of.
The review the merchant collected this morning is in the dashboard. Whether it is on the internet is a different question. The honest answer, for most stores in 2026, is no.
If any of this reads like something your store could use, write to us.
We will write back.