Passage-Level Optimization: How to Win the AI Search Chunk Retrieval Layer in 2026
The page-versus-passage distinction has flipped over the last 18 months. Through 2024 the headline metric was page-level citation share. Through 2026 it is chunk-level retrieval share — because AI engines no longer cite pages, they cite the single 600–900 character passage inside a page that resolves the user’s question. The page is the container; the passage is the unit of retrieval. Brands optimizing at the page level are still leaving the highest-leverage citation lever untouched.

Through 2024 the AI-search citation conversation was about URLs. Through 2026 it is about chunks. Every major engine now runs a chunk-retrieval substrate that segments each crawled page into 6–18 passages of ~600–900 characters, embeds each chunk independently, and retrieves the single best-matching chunk per fan-out sub-query rather than the page as a whole. Roughly 84% of mid-2026 citations resolve to one specific in-page chunk; the page is the host, not the unit. The implication is that a page can win the URL-level click and still lose the passage-level retrieval — and most pages that under-cite do so for chunk-shaped reasons, not page-shaped reasons.
What Engines Actually Retrieve in Mid-2026
A passage embedding is the engine’s vector representation of a single chunk inside a page — not the page as a whole. Every major engine through mid-2026 has converged on a similar substrate shape:
- Perplexity. Splits each page into ~700-character passages with ~80-character overlap. Citations resolve to a specific in-page anchor; the numbered footnote points to the chunk, not the URL alone. The cleanest engine to confirm chunk-level behavior on, because the anchor is human-visible in the citation.
- Google AI Mode. Reuses passage-ranking infrastructure built for Featured Snippets in 2018, now extended into the chunk-retrieval substrate that fills AI Mode answers. Chunk size sits around 600–900 characters; overlap is roughly 100 characters. AI Mode citations link to a #:~:text= text fragment that highlights the retrieved chunk in the source page.
- ChatGPT Search. Splits at slightly larger boundaries (~900 characters) and is more tolerant of mid-chunk semantic shifts than Perplexity or Google AI Mode, but pages with clean heading boundaries still retrieve at materially higher rates.
- Microsoft Copilot. Inherits the Bing passage-ranking pipeline; chunk size and overlap match Google AI Mode closely. Anchor citations land on the same #:~:text= fragment shape.
- Amazon Rufus. Operates over a narrower substrate (PDP descriptions, A+ content, Q&A, reviews) but applies the same chunk-retrieval discipline inside each: a long PDP description is segmented into 3–6 chunks and each is ranked independently.
The right read is not ‘engines rank pages a little more granularly than before’. The right read is that the page is now the container for many independent retrieval units, and your citation share is the sum of each chunk’s independent retrievability. A 3,000-word page might present 12 chunks to the substrate — and only two of them might be retrieval-grade.
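To make the size-and-overlap figures above concrete, here is a minimal sketch of a fixed-window chunker. It is an illustration only: the function name `sliding_chunks` is hypothetical, the defaults borrow the ~700-character window and ~80-character overlap quoted for Perplexity, and no engine is known to segment exactly this way.

```python
def sliding_chunks(text: str, size: int = 700, overlap: int = 80) -> list[str]:
    """Split text into character windows of `size` that overlap by `overlap`.

    Assumption: a plain character window, ignoring headings and paragraphs.
    """
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap  # each new window starts this many characters later
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    page_text = "".join(str(i % 10) for i in range(1500))
    for index, chunk in enumerate(sliding_chunks(page_text)):
        print(index, len(chunk))
```

Counting how many windows a page produces is a quick way to estimate how many independent retrieval units it presents before any structural signals are taken into account.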
How Engines Segment a Page
Chunk boundaries are not arbitrary. The engines that publish their behavior (Google, Microsoft) and the engines that don’t (Perplexity, OpenAI) all converge on the same five-signal segmentation priority order:
- HTML heading boundaries. The single strongest signal. h2 and h3 break the chunk cleanly; h4 occasionally; h1 never (h1 marks the page, not a chunk). Pages with consistent h2/h3 discipline have chunk boundaries that align with the engine’s target chunk size; pages with one h2 followed by a wall of prose hand the engine the segmentation problem and it solves it sub-optimally.
- Paragraph breaks (closing </p>). When a heading-bounded section exceeds target chunk size, the substrate falls back to paragraph boundaries to split it further. Pages with long, comma-spliced paragraphs leave the engine to guess; pages with 200–400 character paragraphs land cleanly inside the target window.
- List boundaries (<li>). Bullet lists split into one chunk per cluster of 4–6 items (or one chunk per list if shorter). Lists that exceed ~8 items split mid-list and the second half loses its context anchor; lists of 4–6 items retrieve cleanly.
- Semantic stop points. When no structural boundary is available inside the target chunk size, the substrate breaks on sentence boundaries that present a semantic shift (topic change, comparison-to-next-idea, summary-of-prior-idea). The quality of this fallback varies by engine and is the single most operator-controllable lever — clean prose gives the substrate clean fallback points; meandering prose forces it to break mid-thought.
- Hard character cap. When all of the above fail to land inside the target chunk size, the substrate hard-caps the chunk at ~900–1,100 characters and breaks mid-sentence. This is the worst-case fallback and the most retrievability-destroying split shape — chunks that end mid-sentence retrieve at roughly 0.4× the rate of chunks ending on a clean structural boundary.
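The fallback order above can be sketched as a small segmenter. This is a simplified illustration under stated assumptions, not any engine's actual pipeline: the input is one heading-bounded section already split into paragraphs, list boundaries are skipped, "semantic stop points" are approximated by a naive sentence splitter, and the names `segment_section`, `pack`, `TARGET_MAX`, and `HARD_CAP` are hypothetical.

```python
import re

TARGET_MAX = 900   # upper end of the target window cited in the text
HARD_CAP = 1100    # upper end of the hard-cap range cited in the text


def split_sentences(paragraph: str) -> list[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def pack(units: list[str], limit: int, sep: str) -> list[str]:
    """Greedily merge consecutive units into chunks no longer than `limit`.

    A single unit longer than `limit` is kept whole for the caller to split.
    """
    chunks, current = [], ""
    for unit in units:
        candidate = unit if not current else current + sep + unit
        if len(candidate) <= limit or not current:
            current = candidate
        else:
            chunks.append(current)
            current = unit
    if current:
        chunks.append(current)
    return chunks


def segment_section(paragraphs: list[str]) -> list[str]:
    """Segment one heading-bounded section using the fallback order above."""
    if not paragraphs:
        return []
    section = "\n\n".join(paragraphs)
    if len(section) <= TARGET_MAX:
        return [section]                                  # heading boundary suffices
    chunks = []
    for chunk in pack(paragraphs, TARGET_MAX, "\n\n"):    # fall back to paragraph breaks
        if len(chunk) <= TARGET_MAX:
            chunks.append(chunk)
            continue
        for piece in pack(split_sentences(chunk), TARGET_MAX, " "):  # sentence boundaries
            while len(piece) > HARD_CAP:                  # hard cap, may cut mid-sentence
                chunks.append(piece[:HARD_CAP])
                piece = piece[HARD_CAP:]
            chunks.append(piece)
    return chunks
```

Running a draft section through a sketch like this shows which fallback it would trigger: a section that reaches the hard-cap branch is a candidate for an extra h3 or shorter paragraphs.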
What a Retrievable Chunk Looks Like
A retrievable chunk in the mid-2026 substrate fits five properties simultaneously. The properties are independently measurable; a page audit that scores each chunk against the five is the foundation of any passage-level optimization program.
- ~600–900 characters. Below 400, the substrate treats the chunk as a low-context fragment and down-weights it; above 1,000, the substrate splits mid-chunk and the second half loses its context anchor. The sweet spot lands within the same window the major engines target. Most under-citing pages either run 120-character bullet fragments (too short) or 1,500-character paragraph walls (too long).
- Bounded by a heading. The chunk should start within one sentence of an h2 or h3 and end before the next h2 or h3 begins. Pages whose heading cadence matches the substrate target chunk size have every chunk land on a heading-bounded passage; pages whose headings are 2,500 characters apart force the substrate into mid-section paragraph splits.
- One claim per chunk. Each chunk should present one main claim, supported by one or two quantified or named-entity evidence sentences, and one synthesis sentence. Chunks that try to present three comparable claims in parallel under-cite by 0.5–0.7× relative to chunks that present one claim cleanly.
- Self-anchoring opening sentence. The chunk’s first sentence should restate the page context the chunk operates inside; without it, the retrieved chunk arrives at the engine as a context-less fragment. ‘The mid-2026 inline image carousel surfaces on roughly 35% of commercial queries on Perplexity’ is self-anchoring; ‘It surfaces on roughly 35% of those’ is not. Self-anchoring chunks retrieve at 1.5–2.0× the rate of context-less chunks at the same chunk size.
- Closing-sentence finality. The last sentence in the chunk should land on a complete thought, not trail mid-paragraph into the next idea. Pages whose chunks always end on a finished thought retrieve materially better than pages whose paragraphs run continuously across heading boundaries, because run-on paragraphs force the substrate’s fallback boundary to break at a worse split point.
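The five properties above can be turned into a rough per-chunk check. The sketch below is a heuristic illustration, not a validated scorer: `check_chunk`, `ChunkCheck`, and the `CONTEXT_FREE_OPENERS` list are hypothetical names, the heading-bounded and one-claim properties are passed in as editor judgments because they depend on page structure and meaning, and the self-anchoring test simply looks for the page-level entity in the first sentence.

```python
import re
from dataclasses import dataclass

# Assumption: openers that usually signal a chunk leaning on the prior chunk.
CONTEXT_FREE_OPENERS = ("it ", "this ", "that ", "these ", "those ", "they ",
                        "another ", "on top of that", "also ")


@dataclass
class ChunkCheck:
    length_ok: bool
    heading_bounded: bool
    one_claim: bool
    self_anchoring: bool
    clean_ending: bool

    @property
    def passes_all(self) -> bool:
        return all(vars(self).values())


def check_chunk(text: str, page_entity: str,
                heading_bounded: bool, one_claim: bool) -> ChunkCheck:
    """Score one chunk against the five properties using simple heuristics."""
    stripped = text.strip()
    first_sentence = re.split(r"(?<=[.!?])\s+", stripped, maxsplit=1)[0].lower()
    self_anchoring = (page_entity.lower() in first_sentence
                      and not first_sentence.startswith(CONTEXT_FREE_OPENERS))
    return ChunkCheck(
        length_ok=600 <= len(stripped) <= 900,           # target window from the text
        heading_bounded=heading_bounded,                 # editor-supplied judgment
        one_claim=one_claim,                             # editor-supplied judgment
        self_anchoring=self_anchoring,
        clean_ending=stripped.endswith((".", "!", "?")),  # ends on a full sentence
    )
```

A chunk that fails only `self_anchoring` is usually the cheapest to fix, since it needs one rewritten opening sentence rather than a restructure.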
The Chunk Audit Table
The right artifact for tracking passage-level optimization is a chunk audit table — the page-level analog of the rationale snippet audit and the multimodal capture table. Seven columns:
- Page slug. One row per page on the priority set the visibility dashboard tracks against.
- Chunk index (0..n). Number each chunk as the engine would segment it — by h2/h3 first, paragraph second, list third. A 3,000-word page with clean heading discipline runs 8–14 chunks; the same page with one h2 and paragraph walls runs 3–4 oversized chunks the substrate then splits arbitrarily.
- Char count. Verbatim character length of the chunk as written. Anything under 400 or over 1,000 should flag.
- Heading-bounded? (Y/N). Does the chunk start within one sentence of an h2 or h3 and end before the next?
- Self-anchoring opening? (Y/N). Does the first sentence restate the page context without requiring the prior chunk?
- One claim? (Y/N). Or does the chunk try to compare three things in parallel?
- Cited? (engine + week). Was this specific chunk surfaced as the retrieved passage on any priority query this week, on any of Perplexity, Google AI Mode, ChatGPT Search, or Copilot? The text-fragment anchor in the citation tells you which chunk landed.
Two optional columns lift the table: the entity density per chunk (named entities — brand names, product names, numbers, dates — per 100 characters), which positively correlates with retrieval rate; and the chunk’s proximity to a heading-bounded sibling chunk that did retrieve, which surfaces clusters of retrievable passages on adjacent headings.
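One way to hold the audit table is a small record type that serializes to CSV. The sketch below is an assumption-laden illustration: the field names mirror the seven columns plus the optional entity-density column, `entity_density` uses a crude regex proxy (capitalized words and numeric tokens, which also counts sentence-initial words) rather than real named-entity recognition, and the 400/1,000 flag thresholds come from the char-count column above.

```python
import csv
import io
import re
from dataclasses import asdict, dataclass, fields


@dataclass
class ChunkAuditRow:
    page_slug: str
    chunk_index: int
    char_count: int
    heading_bounded: bool
    self_anchoring: bool
    one_claim: bool
    cited: str             # e.g. "perplexity 2026-W23"; empty if not cited
    entity_density: float  # optional column


def entity_density(text: str) -> float:
    """Rough proxy: capitalized words and numeric tokens per 100 characters."""
    if not text:
        return 0.0
    tokens = re.findall(r"\b(?:[A-Z][\w-]*|\d[\d,.%]*)", text)
    return round(100 * len(tokens) / len(text), 2)


def needs_flag(row: ChunkAuditRow) -> bool:
    # Thresholds taken from the char-count column description.
    return row.char_count < 400 or row.char_count > 1000


def rows_to_csv(rows: list[ChunkAuditRow]) -> str:
    """Serialize audit rows to CSV text with one header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ChunkAuditRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()
```

Keeping the table as CSV means the weekly capture can append to the same file the dashboard already reads, without a separate schema.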
The Five Failure Modes That Repeat
On any chunk audit of a real page set, the same five chunk-shape failure modes show up in roughly the same proportions:
- Wall-of-prose chunks. A 2,400-character paragraph that the substrate splits into two arbitrary ~1,200-character segments; both segments retrieve poorly because neither one self-anchors. Roughly 35% of mid-market pages have at least one wall-of-prose chunk on their highest-traffic page. Fix: break into three paragraphs separated by an h3.
- Bullet-fragment chunks. A list of 12 bullets, each 80 characters long, that the substrate treats as 12 separate low-context fragments. Each fragment is too short to anchor; none retrieve. Fix: group the bullets into 3 clusters of 4 with an introductory sentence per cluster.
- Context-stripped openings. Chunks starting with ‘It also helps’, ‘Another reason is’, ‘On top of that’: the substrate retrieves the chunk and the user reads it with no idea what ‘it’ refers to. Fix: rewrite the opening sentence to restate the page-level entity.
- Mid-thought endings. Chunks ending mid-paragraph, with the supporting evidence in the next chunk. The retrieved fragment presents the claim without the proof. Fix: keep claim and evidence inside one heading-bounded section.
- Multi-claim parallel chunks. A chunk that compares three competitors, presents three pricing tiers, or lists three different use cases inside a single 800-character paragraph. The substrate struggles to attribute the retrieved passage to a single claim and down-weights all three. Fix: split into three separate heading-bounded chunks.
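The bullet-fragment fix above (12 bullets regrouped into 3 clusters of 4, each with its own introductory sentence) is mechanical enough to sketch. The helper name `cluster_bullets` is hypothetical, and the introductory sentences still have to be written by an editor; the code only does the regrouping.

```python
def cluster_bullets(bullets: list[str], intros: list[str],
                    cluster_size: int = 4) -> list[str]:
    """Regroup a flat bullet list into clusters, each led by an intro sentence."""
    clusters = [bullets[i:i + cluster_size]
                for i in range(0, len(bullets), cluster_size)]
    if len(intros) != len(clusters):
        raise ValueError(f"expected {len(clusters)} intro sentences, got {len(intros)}")
    blocks = []
    for intro, group in zip(intros, clusters):
        lines = [intro] + [f"- {item}" for item in group]
        blocks.append("\n".join(lines))
    return blocks


if __name__ == "__main__":
    demo = cluster_bullets(
        [f"Feature {n}" for n in range(1, 13)],
        ["Setup features:", "Reporting features:", "Billing features:"],
    )
    print("\n\n".join(demo))
```

Each returned block carries its own context sentence, so it can stand alone if it is retrieved without the rest of the list.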
The Six-Week Passage Rewrite Program
For a brand standing up passage-level optimization from scratch, the cleanest sequencing in mid-2026:
- Week 1: Lock the priority page set (the same 40–120 priority pages the rest of the AI-search stack scores against) and run the chunk audit table on each. Baseline is two numbers: average chunks per priority page, and percentage of chunks that pass all five retrievability properties.
- Week 2: Rewrite the top 10 highest-traffic priority pages into heading-bounded chunks. Most pages move from 3–4 oversized sections to 8–14 retrievable chunks; word count usually does not change, only structure.
- Week 3: Rewrite chunk openings on the same 10 pages to self-anchor. Every chunk’s opening sentence should restate the page-level entity and the chunk’s specific claim, without requiring the prior chunk for context. This is the single highest-leverage rewrite layer.
- Week 4: Audit the next 20 priority pages with the same chunk audit table; rewrite the worst 10. By the end of week 4 the program has covered the top 20 pages on both structure and opening self-anchoring.
- Week 5: First weekly retrieval capture. For each priority query, capture which chunk on which page surfaced as the retrieved passage on Perplexity, Google AI Mode, ChatGPT Search, and Copilot. The text-fragment anchor identifies the chunk. Tag whether the cited chunk is one of the rewritten 20.
- Week 6: First measurable lift. Most programs report a 20–35% increase in passage retrievals on the rewritten pages by week 6, faster than the 8–11 week curve on full-page rewrites because passage-level changes propagate through the substrate’s next embedding refresh, a 3–5 week cycle on the major engines.
The wider gain — ‘our priority page set retrieves consistently at the passage level across all four major engines’ — usually lands at week 10–14 once the next 40–60 pages have been rewritten on the same chunk discipline.
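The two week-1 baseline numbers and the week-6 lift are simple arithmetic, sketched below. The function names `baseline` and `lift_pct` and the input shape (page slug mapped to one pass/fail flag per chunk) are assumptions made for illustration; the figures in the usage lines are invented sample inputs, not reported results.

```python
def baseline(pages: dict[str, list[bool]]) -> tuple[float, float]:
    """Return (average chunks per page, percent of chunks passing all five).

    `pages` maps a page slug to one flag per chunk: True if the chunk passes
    all five retrievability properties.
    """
    total_chunks = sum(len(flags) for flags in pages.values())
    if not pages or total_chunks == 0:
        return 0.0, 0.0
    passing = sum(sum(flags) for flags in pages.values())
    return total_chunks / len(pages), 100 * passing / total_chunks


def lift_pct(before: int, after: int) -> float:
    """Percent change in passage retrievals between two capture weeks."""
    if before <= 0:
        raise ValueError("baseline retrieval count must be positive")
    return 100 * (after - before) / before


if __name__ == "__main__":
    sample = {"pricing": [True, False, False, False],
              "features": [True, True] + [False] * 6}
    print(baseline(sample))   # average chunks per page, percent passing
    print(lift_pct(40, 50))   # sample counts only
```

Recomputing the same two baseline numbers after each rewrite week gives a structural progress measure that does not depend on waiting for the engines' next embedding refresh.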
Why the Page-Versus-Chunk Distinction Matters Operationally
The first failure mode in a page-level optimization program is investing in new URL-level pages when the existing pages are under-citing for chunk-shape reasons. Adding a new page when an existing page has six unretrievable chunks doubles the substrate’s crawl load without lifting citation share. The right sequencing is to fix the chunks on the existing priority pages first, then expand the URL-level coverage once the chunk discipline is baked into the editorial pattern.
The second failure mode is editorial guidelines that optimize for human readability of the full page rather than chunk-level retrievability. The two goals are compatible — heading-dense, self-anchoring, one-claim-per-section prose reads cleanly for humans too — but most editorial style guides through 2024 were written before the chunk substrate existed and reward long, flowing sections that read like a magazine. The rewrite is not against readability; it is against the legacy magazine-style cadence.
Where Passage-Level Optimization Sits in the Stack
Passage-level optimization is the chunk-side discipline of the 2026 AI-search content operations stack. It composes with the other artifacts already shipped:
- The AI visibility dashboard locks the priority query set the chunks are scored against — without that lock, the chunk audit drifts and the rewrite scope inflates.
- The rationale snippet audit identifies the rationale density inside the retrieved chunk. Passage-level optimization makes the chunk retrievable; the rationale audit makes the chunk citable once retrieved.
- The brand entity graph audit fixes the entity disambiguation the chunk’s self-anchoring opening sentence relies on. Without the brand entity layer, the substrate cannot resolve the chunk to the brand reliably across queries.
- The multimodal answer optimization playbook fills the carousel slot beside the retrieved chunk. The text chunk and the image chunk are scored independently; both winning is what unlocks the full citation slot.
- The llms.txt implementation maps the engines to the priority pages; passage-level optimization is what makes the chunks inside those pages actually retrieve once the engine has routed there.
Together the six artifacts form the operations stack a mid-2026 AI-search program runs on. Passage-level optimization is the artifact that lifts citation share on the existing page set faster than any other artifact — most rewrites land their first measurable lift at week 6, half the time of any URL-level expansion.
Frequently Asked Questions
What does chunk-level retrieval actually mean?
Every major AI engine through mid-2026 segments each crawled page into 6–18 passages of ~600–900 characters, embeds each chunk independently, and retrieves the single best-matching chunk per fan-out sub-query rather than the page as a whole. Roughly 84% of citations resolve to one specific chunk inside a longer page — the page is the host, not the unit of retrieval. A page can win the URL-level click and still lose the passage-level retrieval if the chunks inside that page do not segment cleanly.
What is the target chunk size the substrate retrieves at?
600–900 characters is the sweet spot across Perplexity, Google AI Mode, ChatGPT Search, and Copilot. Below 400, the substrate treats the chunk as a low-context fragment and down-weights it; above 1,000, the substrate forces a mid-paragraph split and both halves lose their context anchor. h2 or h3 cadence every ~600–900 characters lands every chunk on a heading-bounded passage — the cleanest possible split shape.
What makes a chunk retrievable?
Five properties simultaneously: ~600–900 characters; bounded by an h2 or h3; one main claim per chunk; self-anchoring opening sentence that restates the page-level context; and a closing sentence that lands on a complete thought. Pages whose chunks pass all five retrieve at materially higher rates than pages whose chunks pass three or four.
How do I find which chunk on a page was actually cited?
The text-fragment anchor in the citation URL tells you. Google AI Mode, Microsoft Copilot, and ~80% of Perplexity citations append a #:~:text= fragment that highlights the exact passage that was retrieved. Visit the citation URL with the fragment intact — the browser scrolls to and highlights the cited chunk.
Should I rewrite my existing pages or publish new ones first?
Rewrite first. Adding new URLs while existing priority pages have under-cited chunks doubles the substrate’s crawl load without lifting citation share. The rewrite-first cohort lifts citation share 2–3 weeks ahead of the new-URLs-first cohort on the same content budget.
How long until I see passage-level lift after rewriting?
Most programs report the first measurable lift in passage retrievals on the rewritten pages at week 5–7, faster than the 8–11 week curve on full-page rewrites because passage-level changes propagate through the substrate’s next embedding refresh — a 3–5 week cycle on the major engines.
Ship the chunked, passage-retrievable visual + textual stack the substrate now rewards
ppl.studio is the production layer most performance teams now use to fill the persona-locked image carousel that pairs with the retrieved passage — same persona, same product framing, refreshed inside the freshness window the chunk-retrieval substrate scans against.
Founder of ppl.studio. Building AI tools for product marketing teams who need visual content at scale without the production overhead.