What an editor would do with your review corpus.
A literary editor does not run sentiment scores on a manuscript. They read it. The same posture, applied to the writing your customers have already done for you, will produce work no dashboard can.
A literary editor sits down with a manuscript. The manuscript is, let us say, eighty thousand words. The editor does not begin by computing the average sentence length. She does not run a sentiment score on the dialogue. She does not generate a word cloud. She does not produce a chart of pacing.
She reads it.
The reading is purposive. It is looking for a small number of specific things. The sentences that already do the work the rest of the manuscript is reaching for. The contradictions between the narrator's claim and the surrounding evidence. The phrasings the author uses, often without noticing, that mark the voice apart from any other voice working in the same genre this year. The line that, in three days when she returns to the manuscript with her notes, will be the line she remembers without checking.
The editor does not produce, as her primary deliverable, a summary. She produces a list of about forty sentences, copied verbatim from the manuscript, annotated. Some are marked as the spine. Some are flagged as load-bearing for the next round. Some are circled as candidate pull-quotes. Some are highlighted as evidence of a habit the author needs to break.
This is review corpus analysis, although neither the editor nor the author would call it that. It is the closest available analogue, in any working profession, to what a merchant should be doing with their reviews.
An editor would not aggregate. She would read the corpus the way she reads a manuscript: looking for the sentences that already do the work, and the contradictions that need handling, and the lines that catch the whole in a turn of phrase.
This essay is about that posture. It is short by the standards of this Journal, because the argument is small. Stop running sentiment scores. Start reading.
What a review corpus has in common with a manuscript
A merchant's review corpus, if they have been operating for a year or more, is somewhere between twenty thousand and three hundred thousand words. By any honest measure, this is a manuscript. It is longer than most novels. It is shorter than the average literary biography. It is written, certainly, by more authors than a manuscript usually has. The number of authors changes the texture of the reading. It does not change the underlying activity, which is reading.
A manuscript has a thesis, sometimes badly stated, often unstated. The reader's job is to identify it. A review corpus has, in aggregate, an implicit thesis about what the product is for, who it works for, and where it falls short. That thesis is not in any one review. It accumulates across the body. The reader's job, again, is to identify it.
A manuscript has its best sentences scattered unevenly. Some are in the first chapter. Some are buried on page two hundred and forty. A literary editor does not assume the best sentences will be the most representative ones, or the most frequent ones. She looks for them where they are. A review corpus has the same property. The most useful customer sentence in your last two thousand reviews is probably not in the most recent five. It might be a paragraph from eight months ago that nobody noticed.
A manuscript has its contradictions, which are the parts that most need editorial work. A review corpus has its contradictions too, and they are, for the merchant, the most productive part of the entire archive. A contradiction between two reviews is a place where the product, or the marketing, or the customer's expectations, have not yet aligned. Naming the contradiction is the first step toward resolving it.
A manuscript has its voice. Each author has a habit of phrasing, a few words they reach for too often, a structure they prefer. A review corpus has a voice in this same sense, less individual, more collective. The recurring words customers use to describe their problem before they had your product. The phrases they reach for when describing what it did. The metaphors that recur without being prompted. These are the voice of your buyer, in the buyer's own words. Reading for this voice is, in some sense, the single most valuable thing a merchant can do with their corpus. It is also the thing the dashboard cannot do.
The implicit pull-quotes
If you read fifty reviews from a healthy product, with attention, you will encounter somewhere between three and seven sentences that you instinctively reach for a pen to underline. These are the implicit pull-quotes. They were not written to be quoted. They were written by a customer making a point, in their own time, in their own voice. They turn out, on a re-read, to do the work that an entire paragraph of marketing copy has been trying and failing to do.
A literary editor, reading a manuscript, marks these as candidate epigraphs, as section headings, as the lines to put on the back jacket. The publisher's marketing team then uses them. The author's voice carries into the cover quote.
The same activity, applied to a review corpus, produces the sentences that should sit at the top of the product page, in the next ad, in the body of the next email, in the answer the brand publishes to a frequently asked question. The customer wrote the sentence already. The merchant has only to surface it, with attribution, in a form the new search can read.
The merchants who do this well develop, over the course of a year, a small library of customer pull-quotes per product. Twenty for the bestseller. Twelve for the second-bestseller. Three or four for everything else. These quotes are dated, attributed, verifiable. They get rotated into the page. They get cited, externally, by ChatGPT and Perplexity, because they have the five properties the engines give weight to.
A six-figure annual customer-acquisition budget has been spent, by a great many brands, to produce a piece of marketing copy less effective than the pull-quote from the buyer who wrote four sentences in May.
The contradictions are the next product decision
The second thing an editor pulls from a manuscript, after the candidate pull-quotes, is the contradictions. The places where the author has said two incompatible things in two adjacent passages. The places where the narrator's reliability has slipped. The places where the evidence the chapter supplies is not the evidence the chapter claims.
A review corpus is full of these. They are almost never visible from the dashboard. They are visible only to someone reading.
A customer says the product works in winter. Another customer, three weeks later, says it does not work in winter. Both are five-star reviews. Both have similar verified-buyer markers. The dashboard averages them and reports a four-point-eight rating. The corpus, read carefully, tells you that the product probably works in winter for one skin type and not for another, and the marketing page is not specifying which. The fix is not a sentiment score. The fix is to write the page more specifically.
A customer praises the product for a use-case the brand has never promoted. Another customer complains that the product does not deliver on a use-case the brand has been promoting heavily. The dashboard, again, reports a tidy average. The corpus tells you that the brand has been marketing the wrong use-case for the customer they actually have. This is, on its own, a quarter's worth of marketing pivot, hiding in plain text on the site.
A customer reports a problem that the next twenty reviews do not mention. The dashboard files this as an outlier. The corpus, read carefully, sometimes tells you that the problem is real and that the next twenty customers did not notice yet, because the problem only emerges after a particular duration of use. Reading the corpus is, in this case, the first warning the merchant has of a quality issue.
None of these readings are produced by sentiment analysis. Sentiment analysis is, by design, a smoothing operation. It compresses. The contradictions live in the unsmoothed text, between the lines, in the places a chart cannot find.
Sentiment is what is left of a review after you have thrown away the part worth reading.
This is why the dashboard is not the work. The essay "The dashboard is not the work" makes the longer case. The shorter case is that the dashboard's confident summary statistic is, in the precise sense, the wrong instrument for the underlying material. A reading practice is the right instrument. It happens to be a practice merchants stopped having, because the tools stopped offering it.
The phrasings the buyer used before they had your product
There is one more thing the editor does, which is the most subtle and, in our reading, the most valuable.
She reads for the phrasings the author uses when describing the problem the book is trying to address, before the book has addressed it. She is looking for the buyer's language, in the buyer's own register, for the state they were in before the solution arrived.
A review corpus contains this in volume, and almost no merchant has read for it.
A skincare customer writes, in their second sentence, that their skin had been tight in the morning, under SPF, since the seasons changed in October. The customer is describing the condition they had before they bought the product. Those eighteen words, exactly as written, are the phrasing the next buyer is going to type into the search box. Tight in the morning under SPF after the seasons changed. The merchant who knows this can write a product page that answers that exact phrasing, in the buyer's exact words, with the customer's verbatim sentence pulled in as evidence.
The agency that writes the brand's content cannot produce this phrasing, because the agency is paid to write what the brand wants to sell, not what the buyer is feeling at three in the morning before they paid for anything. The phrasing is, structurally, the merchant's to claim, and they can only claim it by reading the people who have already written it.
This is review corpus analysis with the smoothing turned off. The buyer's pre-purchase voice is a body of marketing copy the brand could not pay for, and could not write, and could not commission. It already exists. It is in the corpus. The work is the reading.
The shape of a working session
If you are persuaded by the above and want, on a practical level, to try the posture for a week, this is the shape of the session we have arrived at.
You sit down with the last two hundred reviews of your bestselling product. You read them in date order, not in star order. You read them with a notebook open, by hand if you can manage it. You are looking for four things, in this order: candidate pull-quotes, contradictions, recurring buyer phrasings from before the purchase, and the small unrepeatable detail that marks a particular customer as real. You do not aggregate, you do not score, you do not summarise. You quote and you annotate.
At the end of the session, which will take about an hour, you have, on paper, a small editorial deliverable. Eight to twelve pull-quotes, attributed, dated. Two or three contradictions named, with citations from the corpus. A list of about twenty buyer phrasings from before the purchase, verbatim, in the order they appeared. A few customer specifics that struck you as particularly real. This is your editorial output for the week.
You take that output and you do something with it. The pull-quotes go onto the product page, into the next email, into the cited content the new search can read. The contradictions become the next round of marketing copy adjustments. The buyer phrasings become the long-tail content you publish, in the buyer's words, as answers to the questions they were already typing. The customer specifics become the texture that distinguishes your brand from the next brand in your category.
Repeated weekly, this is a working editorial practice for a small store. It produces more usable brand writing in a month than most agencies produce in a quarter, because the source material is already first-person, dated, signed, and credible.
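For a merchant who would rather keep the notebook in a file than on paper, the deliverable described above can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions, not a tool this essay prescribes: the names `Review`, `Annotation`, `Mark`, `reading_order` and `annotate` are hypothetical, and the sample review is invented for illustration. It does only what the session does. It puts reviews in date order and stores verbatim, attributed, dated quotes under one of the four marks. It does not score, aggregate, or summarise.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Mark(Enum):
    """The four things the session reads for (hypothetical labels)."""
    PULL_QUOTE = "candidate pull-quote"
    CONTRADICTION = "contradiction"
    BUYER_PHRASING = "buyer phrasing from before the purchase"
    SPECIFIC = "unrepeatable customer detail"


@dataclass(frozen=True)
class Review:
    author: str
    written: date
    stars: int
    text: str


@dataclass(frozen=True)
class Annotation:
    mark: Mark
    quote: str      # copied verbatim from the review, never paraphrased
    author: str     # attribution
    written: date   # date of the review the quote came from
    note: str = ""  # the reader's own annotation


def reading_order(reviews):
    """Return reviews oldest first: date order, not star order."""
    return sorted(reviews, key=lambda r: r.written)


def annotate(review, mark, quote, note=""):
    """Record a quote, refusing anything that is not verbatim."""
    if quote not in review.text:
        raise ValueError("quote must be copied verbatim from the review")
    return Annotation(mark, quote, review.author, review.written, note)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    reviews = [
        Review("R. K.", date(2025, 11, 3), 5,
               "My skin had been tight in the morning, under SPF, "
               "since the seasons changed in October."),
    ]
    notebook = []
    for review in reading_order(reviews):
        notebook.append(
            annotate(review, Mark.BUYER_PHRASING,
                     "tight in the morning, under SPF",
                     note="pre-purchase language; candidate page heading")
        )
    for entry in notebook:
        print(entry.written.isoformat(), entry.author, "|", entry.quote)
```

The one design choice worth noting is the verbatim check in `annotate`. A quote that has been tidied is no longer the customer's sentence, so the sketch rejects it rather than storing it.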
The essay "Reviews are language not inventory" makes the broader theoretical case for why this works. "First person dated signed" goes deeper on the specific properties of buyer language that become citable in AI search. This essay, in shorter form, is just a description of the posture. Sit down. Read. Mark. Use.
The closing turn
The last thing worth saying is small.
The merchants who do this well, in our experience, treat their corpus the way a magazine treats its archive. With respect. With patience. With the assumption that there is more in it than they have yet noticed. They do not run quarterly sentiment surveys. They do not export to CSV. They open the page and they read.
This is, in 2026, one of the unfashionable activities. It does not produce a chart. It does not feed a slide deck. It does not fit on a quarterly report.
It produces, instead, the line that gets cited by ChatGPT in March, the paragraph that converts the buyer in May, the warning sign about a formulation change in July, the small, accumulating evidence base that makes a brand specific against the smoothed sameness of its category.
An editor would do this with a manuscript without needing to be told.
The same posture, applied to the writing your customers have already done for you, is the work.
If any of this reads like something your store could use, write to us.
We will write back.