What is a retrievable chunk?
A retrievable chunk is a passage that meets all five mid-2026 retrievability properties at once: it runs roughly 600–900 characters; it is bounded by an h2 or h3 heading; it makes one main claim rather than three in parallel; it opens with a self-anchoring sentence that restates the page-level context, so it reads correctly without the prior chunk; and it closes on a complete thought rather than trailing into the next idea. Pages whose chunks pass all five retrieve at materially higher rates than pages whose chunks pass only three or four. In other words, the ranking benefit is non-linear across the property set: passing the last one or two properties matters more than a proportional share would suggest. The chunk audit table scores every chunk against the five properties. Rewrite priority goes to the chunks that fail the most properties on the highest-traffic pages, in that order.
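The audit and the rewrite ordering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `ChunkAudit` class, its field names, and the `page_visits` traffic figure are hypothetical, the 600–900 range is treated as a hard bound even though the text calls it approximate, and the three editorial properties (single claim, self-anchoring opening, complete closing) are recorded as reviewer judgments rather than detected automatically. "In that order" is read here as: sort by number of failed properties first, then by page traffic.

```python
from dataclasses import dataclass

# Assumption: "~600-900 characters" is applied as a strict inclusive range.
MIN_CHARS, MAX_CHARS = 600, 900


@dataclass
class ChunkAudit:
    """One row of a hypothetical chunk audit table."""
    page: str
    heading_level: int            # level of the heading that opens the chunk; 0 if none
    text: str
    single_claim: bool            # reviewer judgment: one main claim
    self_anchoring_opening: bool  # reviewer judgment: opening restates page context
    complete_closing: bool        # reviewer judgment: ends on a complete thought
    page_visits: int              # hypothetical traffic figure for the page

    def results(self) -> dict:
        """Pass/fail for each of the five properties."""
        return {
            "size": MIN_CHARS <= len(self.text) <= MAX_CHARS,
            "heading_bounded": self.heading_level in (2, 3),
            "single_claim": self.single_claim,
            "self_anchoring_opening": self.self_anchoring_opening,
            "complete_closing": self.complete_closing,
        }

    def failures(self) -> list:
        """Names of the properties this chunk fails."""
        return [name for name, ok in self.results().items() if not ok]


def rewrite_queue(audits):
    """Order chunks by most failed properties, then by highest page traffic."""
    return sorted(audits, key=lambda a: (-len(a.failures()), -a.page_visits))
```

Calling `rewrite_queue(rows)` on a list of audited chunks returns them in rewrite order; `row.failures()` shows which properties each chunk still needs to fix.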
How it relates to AI UGC
Retrievable text chunks pair best with retrievable image chunks: the persona-locked AI UGC inside the heading-bounded section is what fills the carousel slot beside the retrieved passage. The text-retrievability properties and the image-retrievability properties (persona stability, ImageObject schema, freshness window) are complementary, and pages that satisfy both sets win the full citation slot. ppl.studio supplies the image layer that the chunk audit makes legible.
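For the ImageObject schema property, a minimal sketch of the JSON-LD markup might look like the following. The helper name `image_object_jsonld`, the example URL, the caption, and the creator name are all hypothetical. The fields used (`contentUrl`, `caption`, `datePublished`, `creator`) are standard schema.org properties, but treating `datePublished` as the freshness signal is an assumption, and persona stability has no schema field, so it is not represented here.

```python
import json


def image_object_jsonld(content_url, caption, date_published, creator_name):
    """Build a schema.org ImageObject as a JSON-LD-ready dict (illustrative only)."""
    return {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "contentUrl": content_url,
        "caption": caption,
        # Assumption: the publication date stands in for the freshness window.
        "datePublished": date_published,
        "creator": {"@type": "Organization", "name": creator_name},
    }


# Hypothetical values; embed the output in a <script type="application/ld+json"> tag.
markup = image_object_jsonld(
    "https://example.com/images/persona-01.jpg",
    "Persona image illustrating the section above",
    "2026-05-01",
    "Example Studio",
)
print(json.dumps(markup, indent=2))
```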
Key statistics
- Pages whose chunks pass all five retrievability properties out-cite pages whose chunks pass three or four by 1.8–2.5× on the same priority query set (retrievability cohort, 2026).
- Most under-citing mid-2026 priority pages fail on two specific properties — wall-of-prose chunk size and context-stripped opening sentences — both of which are mechanical to fix in a chunk-rewrite sprint (failure-mode audits, 2026).
- The five-property chunk audit table identifies the highest-leverage rewrites in roughly the same proportion across categories: 35% wall-of-prose, 28% context-stripped openings, 18% multi-claim parallel chunks, 12% mid-thought endings, 7% bullet-fragment chunks (chunk-failure distributions, 2026).
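A failure-mode distribution like the one in the last bullet can be tallied from audit results with a short helper. This is a sketch under assumptions: the function name `failure_distribution` and the failure-mode labels are hypothetical, the input is one list of failed-property names per chunk, and percentages are shares of all recorded failures (not of chunks). It does not reproduce the cited 2026 figures.

```python
from collections import Counter


def failure_distribution(failures_per_chunk):
    """Return each failure mode's share of all recorded failures, in percent."""
    counts = Counter(
        name for failures in failures_per_chunk for name in failures
    )
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing failed, so there is no distribution to report
    # most_common() orders the modes from most to least frequent
    return {name: round(100 * n / total, 1) for name, n in counts.most_common()}


# Hypothetical audit output: one list of failed properties per chunk.
sample = [
    ["wall_of_prose", "context_stripped_opening"],
    ["wall_of_prose"],
    [],
]
shares = failure_distribution(sample)
```

Sorting by frequency puts the most common failure mode first, which matches how a rewrite sprint would be scoped.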