Klaviyo, and the 14-day window that became a four-week window.
The default review-request delay in most flows is a number from a 2018 study that no longer fits the products it gets applied to. The right window is no longer one number. It is a category.
Open a fresh Klaviyo account, install the review-request flow template, and the delay between order delivery and the request email is 14 days. Postscript ships a similar default. Yotpo's native flow uses 14. Okendo defaults to 14. Junip lets the operator pick and suggests 14 in the onboarding modal. The number is so consistent across the category that operators rarely interrogate it.
The number has a source. Klaviyo published a benchmark study in 2018 across a sample of consumer-goods brands, concluding that the response rate peaked between days 12 and 16 post-purchase. The team called the window "two weeks." The product team picked 14. Every subsequent platform copied the default, because the default of the leading platform becomes the default of the category.
The study was about consumer goods in 2018. The category in 2026 is not consumer goods in 2018. The fourteen-day default is now correct for almost nothing and approximately wrong for almost everything.
What changed under the default
In 2018, "consumer goods" in the Klaviyo sample meant a lot of apparel and accessories, with home goods close behind. The sample had very little skincare, very little supplements, very little electronics. The median product in the study was a thing the buyer used immediately and formed an opinion about in the first few days.
By 2026 the shape of DTC has shifted. Skincare and beauty alone account for a far larger share of Shopify GMV than they did in 2018. Supplements grew through the pandemic and did not retreat. Sleep and recovery categories that did not meaningfully exist as DTC verticals in 2018, alongside the broader longevity and wellness wave, now dominate the post-purchase email queue. These products do not deliver their value in 14 days. Some take 6 weeks. Some take 90 days.
The default was a number tuned to a sample dominated by fast-feedback products. It got copied into a category dominated by slow-feedback products. The mismatch is not a small thing. It is the difference between asking the buyer what the product did and asking the buyer to guess what the product is doing.
The skincare problem
A vitamin C serum delivers visible results in 8 to 12 weeks of consistent use. A retinoid delivers initial irritation in week one and visible results in 10 to 16 weeks. Every hair growth serum in the category says on the bottle "use consistently for 12 weeks before evaluating." The American Academy of Dermatology's published guidance on cosmeceutical product trials uses 12 weeks as the standard observation window.
A 14-day review request, in this category, captures a buyer at week two. The buyer has used the product four to seven times. The buyer does not know yet whether it works. The buyer either writes a hedge ("seems good so far, will update") or a complaint about a tertiary attribute (the dropper, the smell, the packaging). The complaint is the loudest signal in the dataset and ends up on the page.
The platform reports a healthy response rate. The brand's product page accumulates twenty paragraphs of "smells nice, will update." Six months later the brand wonders why no buyer talks about results. Nobody asked when the results would have been visible. The flow asked at week two and then never asked again. This is the dashboard-shaped version of timing: the open rate looked good, so the flow looked good, so the timing looked done.
The right window for a leave-on actives product is somewhere between 42 and 56 days. The wrong window is 14. Almost every Shopify skincare brand running the default is in the wrong window.
The apparel pattern
Apparel runs the other way. A buyer who orders a cardigan knows by the second time they wear it whether the fit and the fabric are right. The fit is decided in the dressing room or in front of a mirror. The fabric pills or it doesn't, and the pilling is visible within a few washes.
For apparel, the 14-day window is approximately correct on the slow end and too late on the fast end. A 7-to-10-day window often performs better because the item is still in the buyer's active wardrobe rotation. By week three the buyer has worn it and decided. They have either kept it or stuffed it in the return queue without telling the brand.
The interesting wrinkle is denim, outerwear, and footwear. Heavy items in these categories take longer to evaluate. Boots break in over weeks. Raw denim conforms over months. A jacket goes through one cold snap before the buyer knows whether the insulation holds. For these items, the default 14-day window can produce a review like "haven't worn it much yet, looks nice in the package." A 4-to-8-week window, depending on the item, produces a review that says how the thing held up.
The category called "apparel" is not one window. It is at least three. The flow has to know which.
Supplements and the 30-day wait
Supplements are explicit about timing. Magnesium for sleep, ashwagandha for stress, creatine for training: each comes with manufacturer guidance specifying a multi-week loading or observation period. Creatine's saturation phase is documented at 28 days at maintenance doses, per the International Society of Sports Nutrition's 2017 position stand (still the most-cited reference in the category).
A 14-day review request on a creatine purchase catches the buyer mid-loading-phase. The buyer has no answer to "did it work" because the protocol literally has not finished yet. The buyer answers anyway, because the email asked, and the answer is "too early to tell." Multiply this by a thousand purchases and the product page reads like a clinical trial that nobody completed.
The right window for supplements ranges from 30 days for fast-onset categories (sleep aids, pre-workout) to 90 days for slow categories (longevity stacks, hair). Most platforms cannot represent "90 days" as a default option in their flow builder without manual override. Some require operators to use a custom delay node that the platform's onboarding does not surface.
Electronics, returns, and the same-week ask
Electronics run the inverse problem. A buyer who ordered a pair of earbuds knows in the first listening session whether the fit and the sound are right. By day three the buyer has decided. By day fourteen the buyer has either integrated the product into a daily routine or already shipped it back.
For consumer electronics, the highest-quality review window is often the same-week ask, 4 to 7 days after delivery. Past day ten, the active impression has faded. The buyer will write something more abstract ("good earbuds") rather than something specific ("the noise canceling does not handle wind"). Specific beats abstract for citation, every time. The language a buyer writes is sharpest when the experience is still loud.
The 14-day default, for electronics, is past the peak. By the time the email arrives, the buyer who would have written the most specific review has either stopped paying attention or returned the product. The flow captures the survivors and misses the testers.
What the operator playbook looks like
Five product categories. Five different windows. It is no accident that none of them is 14 days; 14 was the number for a different sample.
A working operator playbook, in 2026, treats the review-request window as a per-category decision, not a platform default. These are the categories and the windows we keep arriving at, after watching a few hundred Shopify brands ship their own flows:
Apparel, fast-feedback (everyday tops, dresses, accessories): 7 to 10 days post-delivery. The buyer has worn the item, made the verdict, and the experience is still recent.
Apparel, slow-feedback (denim, outerwear, footwear): 28 to 56 days post-delivery. The item needs a season, a few washes, or a break-in period.
Skincare, leave-on actives (vitamin C, retinoids, peptides, growth serums): 42 to 56 days. Long enough for visible results, short enough that the buyer remembers what changed.
Skincare, rinse-off and immediate (cleansers, masks, body care): 14 to 21 days. Closer to the legacy default. Cleansers especially: the buyer notices within a week.
Supplements: 30 to 90 days. The upper end is for longevity, hair, joint, and cognitive categories. The lower end is for sleep, energy, and gut.
Consumer electronics and accessories: 4 to 7 days post-delivery. Capture the first-use specificity before it fades.
Home and durables (furniture, kitchen, bedding): 21 to 42 days. The product has been lived with long enough to test the claims and short enough that the unboxing impression has not displaced the use impression.
Food and beverage (perishable, repeat-purchase): 5 to 10 days. The buyer has consumed the product or decided not to. Asking later mostly asks about packaging.
These windows are not exact. They are the rough buckets that emerge from watching what buyers actually write at each delay across a corpus. The point is not the specific number. The point is that there is no single number.
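As a rough illustration, the buckets above can be written down as a lookup that a flow consults instead of a single hard-coded delay. This is a minimal sketch, not any platform's API: the category keys, the function names, and the choice to send at the start of each window are all assumptions made for the example. Only the day ranges come from the list above.

```python
from datetime import date, timedelta

# Hypothetical category keys. The (earliest, latest) day ranges are the
# rough buckets from the playbook list; they are not exact.
REVIEW_WINDOWS = {
    "apparel_fast": (7, 10),
    "apparel_slow": (28, 56),
    "skincare_leave_on": (42, 56),
    "skincare_rinse_off": (14, 21),
    "supplements": (30, 90),
    "electronics": (4, 7),
    "home_durables": (21, 42),
    "food_beverage": (5, 10),
}

# The legacy 14-day default, kept only as a fallback for unmapped products.
LEGACY_DEFAULT = (14, 14)


def review_window(category: str) -> tuple:
    """Return (earliest, latest) delay in days after delivery."""
    return REVIEW_WINDOWS.get(category, LEGACY_DEFAULT)


def request_date(delivered_on: date, category: str) -> date:
    """Schedule the request at the start of the category's window.

    Sending at the start of the window is an assumption of this sketch;
    an operator might pick the midpoint or test within the range.
    """
    earliest, _latest = review_window(category)
    return delivered_on + timedelta(days=earliest)


if __name__ == "__main__":
    delivered = date(2026, 1, 1)
    for category in ("electronics", "skincare_leave_on", "unmapped"):
        print(category, request_date(delivered, category))
```

The useful property is the fallback: a product with no category still gets asked at day 14, and every product the operator has classified gets asked when it has had time to do something.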
The follow-up window most flows skip
The default flow is single-shot. Email goes out at day 14, nudge goes out at day 21, the flow exits. The buyer who did not write a review in the first three weeks falls out of the queue.
The follow-up window is the second ask, set to fire at the point where the slow-feedback category actually delivers feedback. For a vitamin C serum, this is day 56. For a growth serum, this is day 84. For a longevity supplement, this is day 90. The follow-up is not "did you forget to leave a review?" It is "you've been using this for two months. What changed?"
The second email is, in our experience, the source of the citable paragraphs. The first email catches the small fraction of buyers who write reviews unprompted at week two. The second email catches the much larger fraction who needed the product to have actually done something before they had anything to say. The first email produces "smells nice, will update." The second produces "I have used this every morning for two months and the tone of my forehead has changed in a way I did not expect."
Klaviyo and Postscript will run a second flow. So will Sendlane and Yotpo. The operator has to build it. The platform does not ship it as a default.
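The two-ask shape is simple enough to sketch. The version below is an assumption-laden illustration, not a flow export: the product-type keys, the `Ask` record, and the function name are hypothetical, and the first ask is left at day 14 only because that is where the default flow already fires. The follow-up days are the ones named above.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Follow-up days from the examples above; the keys are hypothetical labels.
FOLLOW_UP_DAY = {
    "vitamin_c_serum": 56,
    "growth_serum": 84,
    "longevity_supplement": 90,
}


@dataclass(frozen=True)
class Ask:
    send_on: date
    kind: str  # "first" or "follow_up"


def schedule_asks(delivered_on: date, product_type: str, first_ask_day: int = 14) -> list:
    """Return the first ask plus a results-timed follow-up when one is defined."""
    asks = [Ask(delivered_on + timedelta(days=first_ask_day), "first")]
    follow_up_day = FOLLOW_UP_DAY.get(product_type)
    # Only add a second ask if it would land after the first one.
    if follow_up_day is not None and follow_up_day > first_ask_day:
        asks.append(Ask(delivered_on + timedelta(days=follow_up_day), "follow_up"))
    return asks


if __name__ == "__main__":
    for ask in schedule_asks(date(2026, 1, 1), "growth_serum"):
        print(ask.kind, ask.send_on)
```

What the sketch leaves out is the copy. The second ask earns its place only if the email asks what changed, not whether the buyer forgot.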
The post-return follow-up
One more window, which most flows actively block. A buyer who returns an item and then re-buys the corrected size, color, or variant has the most useful story to tell about the brand. The return-and-rebuy buyer has been through the failure mode and out the other side. Their review reads like "I returned the medium because it ran small and exchanged for large. Now it fits and I wear it constantly." This is the most citable paragraph a corpus can produce, because it answers the next buyer's pre-purchase question directly.
Most flows pause the request when a return is initiated. The pause is the right move at the moment of the return; nobody wants a review request landing while the buyer is filling out an RMA form. The wrong move is leaving the pause in place forever. The right move is firing the request 14 to 21 days after the replacement order arrives, with a subject line that acknowledges the exchange. The FTC's 2024 rule on consumer reviews and testimonials takes aim at review suppression, and a blanket policy that never asks returning buyers invites the same criticism. The window the platform pauses indefinitely is the window that most needs reopening.
The corpus is richer when these buyers get asked. The page is more honest. The flow is harder to build, because it requires the platform to know that an order has been returned, an exchange has been completed, and the new order has been delivered. Most platforms do this through manual flow nodes. Software that remembers would do it automatically.
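The three facts the flow needs (a return was initiated, an exchange was completed, the replacement was delivered) reduce to a small decision. The sketch below assumes a hypothetical `OrderState` record that the operator's own order data would populate; none of the names correspond to a real platform field. The 14-day post-exchange delay is the low end of the 14-to-21-day range given above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class OrderState:
    """Hypothetical snapshot of what the flow knows about one order."""
    delivered_on: date
    return_initiated: bool = False
    replacement_delivered_on: Optional[date] = None


def next_request_date(
    order: OrderState,
    normal_delay_days: int,
    post_exchange_delay_days: int = 14,
) -> Optional[date]:
    """Return when to send the review request, or None while paused."""
    if not order.return_initiated:
        # No return: use the category's normal window.
        return order.delivered_on + timedelta(days=normal_delay_days)
    if order.replacement_delivered_on is None:
        # Return in progress and no replacement delivered yet: stay paused.
        return None
    # Exchange completed: restart the clock from the replacement's delivery.
    return order.replacement_delivered_on + timedelta(days=post_exchange_delay_days)


if __name__ == "__main__":
    exchanged = OrderState(
        delivered_on=date(2026, 1, 1),
        return_initiated=True,
        replacement_delivered_on=date(2026, 1, 20),
    )
    print(next_request_date(exchanged, normal_delay_days=7))
```

A return with no replacement stays paused in this sketch. Whether a refunded buyer should ever be asked is a separate decision the operator has to make.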
The closing turn
The 14-day default is not wrong because it was wrong in 2018. It is wrong because the sample it was tuned on has been swapped underneath it. The brand that runs the default in 2026 is asking the wrong question at the wrong time across most of its catalogue and wondering why the answers are thin.
The fix is not a smarter default. The fix is treating the window as a question the operator answers per category, per product, per use case. A flow that fires at the moment the buyer has something specific to say produces a corpus that reads like testimony. A flow that fires on day 14 because day 14 was the number on a benchmark slide produces a corpus that reads like a survey. The work, as ever, is in the brief, not the timer.
If any of this reads like something your store could use, write to us.
We will write back.