:::contentbit


Why LLMs write broken Markdown (and how to make it impossible)

LLMs write Markdown that looks right and breaks in production. Prompting alone never fixes it. A validation contract does.



Ask a model for an article and you get beautiful Markdown back: headings in the right order, lists that parse, everything formatted with total confidence. That confidence is the problem. The output looks publishable, so it ships, and the failures only surface in production, usually as a component that does not exist or a table row that quietly eats the layout.

The failure modes are always the same

Run an LLM content pipeline for a week and you will meet all of these. The model invents components: tell it to "use our Callout component" and sooner or later you get <Warning>, <Note>, or <Callout2>. Props drift: the component takes type="warning" and the model writes variant="warn". Nothing crashes; the styling just silently disappears. Tables come back with two cells in one row and three in the next, and Markdown shrugs and renders them anyway.

The slowest poison is the fourth one. Someone refactors a component, nobody updates the system prompt, and every article generated after that day is subtly wrong.

These are interface failures more than model failures. Free-form markup is an interface with no contract, and the model is just filling the silence.
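The ragged-table failure is the easiest of these to catch mechanically. Here is a minimal TypeScript sketch, assuming GitHub-style pipe tables. `findRaggedRows` and `countCells` are hypothetical helpers written for this example, not part of Content Blocks, and escaped pipes are ignored for brevity.

```typescript
// Hypothetical helpers for illustration; not part of any Content Blocks API.
interface RaggedRow {
  line: number;     // 1-based line number within the table text
  found: number;    // cells found in this row
  expected: number; // cells in the header row
}

// Count the cells in one pipe-table row. Escaped pipes (\|) are not handled.
function countCells(row: string): number {
  const inner = row.trim().replace(/^\|/, "").replace(/\|$/, "");
  return inner.split("|").length;
}

// Treat the first line as the header and report every later row
// whose cell count differs from it.
function findRaggedRows(table: string): RaggedRow[] {
  const [header = "", ...rows] = table.split("\n");
  const expected = countCells(header);
  const problems: RaggedRow[] = [];
  rows.forEach((text, index) => {
    if (text.trim() === "") return;
    const found = countCells(text);
    if (found !== expected) {
      // +2: one for the header line, one for 1-based numbering.
      problems.push({ line: index + 2, found, expected });
    }
  });
  return problems;
}

const sampleTable = [
  "| Dough | Hydration | Proof |",
  "| --- | --- | --- |",
  "| Neapolitan | 60% |",
  "| Roman | 75% | 48h |",
].join("\n");

const raggedRows = findRaggedRows(sampleTable);
```

Here `raggedRows` holds a single entry for line 3 (found 2, expected 3). A plain Markdown renderer would have rendered that row without complaint.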

Hope is not a pipeline

| | Prompt and hope | Validated contract |
| --- | --- | --- |
| Output format | Whatever the model felt like | Constrained block syntax |
| Bad output | Ships, breaks in production | Rejected before render |
| Error feedback | A user screenshot | file:line:col plus a fix hint |
| Prompt accuracy | Drifts from the code | Generated from the schema |

The right column is what Content Blocks does. Authors, human or model, write plain Markdown with directive blocks:

:::callout{type="warning" title="The rim is sacred"}
Never flatten the outer 2cm of the dough.
:::

Every block has a schema. Validation runs before anything renders, and a violation produces a diagnostic a machine can act on:

article.md:12:1 error CB_PROPS_INVALID
:::callout props invalid: type must be one of note|tip|warning|important|tldr.
hint: Did you mean type="warning"?
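To show roughly how a schema check can produce a diagnostic like that one, here is a small TypeScript sketch. It is not the Content Blocks implementation: the `Diagnostic` shape, `validateCalloutType`, and the prefix-based hint heuristic are all assumptions made for illustration. Only the allowed values and the message text come from the diagnostic above.

```typescript
// Illustrative only: these names are assumptions, not the Content Blocks API.
const CALLOUT_TYPES = ["note", "tip", "warning", "important", "tldr"] as const;

interface Diagnostic {
  file: string;
  line: number;
  col: number;
  code: string;
  message: string;
  hint: string | null;
}

// Length of the shared prefix of two strings.
function commonPrefixLength(a: string, b: string): number {
  let n = 0;
  while (n < a.length && n < b.length && a.charAt(n) === b.charAt(n)) n++;
  return n;
}

// Naive hint heuristic: suggest the allowed value with the longest shared prefix.
function suggestType(value: string): string | null {
  let best: string | null = null;
  let bestScore = 0;
  for (const candidate of CALLOUT_TYPES) {
    const score = commonPrefixLength(value, candidate);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}

// Return null when the value is allowed, otherwise a structured diagnostic.
function validateCalloutType(value: string, file: string, line: number, col: number): Diagnostic | null {
  if (CALLOUT_TYPES.some((allowed) => allowed === value)) return null;
  const guess = suggestType(value);
  return {
    file,
    line,
    col,
    code: "CB_PROPS_INVALID",
    message: `:::callout props invalid: type must be one of ${CALLOUT_TYPES.join("|")}.`,
    hint: guess === null ? null : `Did you mean type="${guess}"?`,
  };
}

// Render a diagnostic in the file:line:col style shown above.
function formatDiagnostic(d: Diagnostic): string {
  const lines = [`${d.file}:${d.line}:${d.col} error ${d.code}`, d.message];
  if (d.hint !== null) lines.push(`hint: ${d.hint}`);
  return lines.join("\n");
}

const diagnostic = validateCalloutType("warn", "article.md", 12, 1);
const report = diagnostic === null ? "" : formatDiagnostic(diagnostic);
```

The point of the structured shape is that the location, the error code, and the hint stay separate fields until the last moment, so a retry loop can read them as easily as a human can.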

Close the loop

Here is the part that matters for generated content: the registry that validates the output also writes the model's instructions. Schema, docs, and prompt cannot drift apart, because they are one artifact.
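The "one artifact" idea can be sketched in a few lines of TypeScript. The registry shape, `buildInstructions`, and `isValidValue` below are hypothetical and do not reproduce the real output of contentbit instructions. Treating `title` as optional is also an assumption. What the sketch shows is a single data structure feeding both the prompt text and the validator.

```typescript
// Hypothetical registry shape, for illustration only.
interface PropSpec {
  name: string;
  required: boolean;
  allowed: readonly string[]; // empty means any string is accepted
}

interface BlockSpec {
  name: string;
  description: string;
  props: readonly PropSpec[];
}

const registry: readonly BlockSpec[] = [
  {
    name: "callout",
    description: "Highlighted aside.",
    props: [
      { name: "type", required: true, allowed: ["note", "tip", "warning", "important", "tldr"] },
      { name: "title", required: false, allowed: [] }, // optionality assumed
    ],
  },
];

function describeProp(prop: PropSpec): string {
  const kind = prop.required ? "required" : "optional";
  const values = prop.allowed.length > 0 ? `, one of ${prop.allowed.join("|")}` : "";
  return `  - ${prop.name} (${kind}${values})`;
}

// Prompt text derived from the registry.
function buildInstructions(blocks: readonly BlockSpec[]): string {
  const sections = blocks.map((block) =>
    [`:::${block.name} - ${block.description}`, ...block.props.map(describeProp)].join("\n"),
  );
  return ["Use only these blocks. No JSX, no HTML.", ...sections].join("\n\n");
}

// Validation derived from the same registry, so the two cannot disagree.
function isValidValue(blocks: readonly BlockSpec[], blockName: string, propName: string, value: string): boolean {
  for (const block of blocks) {
    if (block.name !== blockName) continue;
    for (const prop of block.props) {
      if (prop.name === propName) {
        return prop.allowed.length === 0 || prop.allowed.indexOf(value) !== -1;
      }
    }
  }
  return false; // unknown block or unknown prop
}

const systemPrompt = buildInstructions(registry);
```

Renaming a prop or adding an allowed value in `registry` changes both `systemPrompt` and the result of `isValidValue` in the same commit, which removes the stale-prompt failure by construction.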

1. Generate the authoring guide from your registry with contentbit instructions --audience llm and put it in the system prompt.
2. Let the model write plain Markdown with blocks. No JSX, no HTML, nothing executable.
3. Validate the output. contentbit validate exits 1 with file:line:col diagnostics.
4. Feed the diagnostics back to the model and let it repair its own output. Loop until clean.
5. Render anywhere: React, static HTML, or plain Markdown for email and search indexes.

Step four carries more weight than it looks. A diagnostic like CB_ROW_COLUMNS ... Found 2, expected 3 with a hint attached is something a model fixes correctly on the first retry, most of the time. You stop reviewing generated markup and start reviewing what the content actually says.
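The generate, validate, repair cycle can be sketched as a bounded loop. This is a TypeScript sketch under stated assumptions: `Generate` and `Validate` are hypothetical function types standing in for a model call and for the validation step, and the stand-ins at the bottom are fakes that make the example runnable. A real pipeline would most likely be asynchronous.

```typescript
// Hypothetical function types: a real pipeline would call a model API and
// the validation step here, most likely asynchronously.
type Generate = (feedback: readonly string[]) => string;
type Validate = (markdown: string) => string[]; // diagnostics; empty means clean

interface LoopResult {
  markdown: string;
  attempts: number;
  clean: boolean;
}

// Generate, validate, and feed diagnostics back until the output is clean
// or the attempt budget runs out.
function generateUntilClean(generate: Generate, validate: Validate, maxAttempts: number): LoopResult {
  let feedback: string[] = [];
  let markdown = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    markdown = generate(feedback);
    feedback = validate(markdown);
    if (feedback.length === 0) {
      return { markdown, attempts: attempt, clean: true };
    }
  }
  return { markdown, attempts: maxAttempts, clean: false };
}

// Fake model: gets the prop wrong first, fixes it once it sees any feedback.
const fakeGenerate: Generate = (feedback) =>
  feedback.length === 0
    ? ':::callout{type="warn"}\nNever flatten the rim.\n:::'
    : ':::callout{type="warning"}\nNever flatten the rim.\n:::';

// Fake validator: flags only the one mistake the fake model makes.
const fakeValidate: Validate = (markdown) =>
  markdown.indexOf('type="warn"') !== -1
    ? ['article.md:1:1 error CB_PROPS_INVALID hint: Did you mean type="warning"?']
    : [];

const loopResult = generateUntilClean(fakeGenerate, fakeValidate, 3);
```

With these stand-ins the loop finishes clean on the second attempt. The attempt cap matters in practice: a model that cannot fix a diagnostic should fail loudly after a few retries instead of looping forever.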

Questions we keep getting

This article is itself a Content Blocks document. Hit Source above to see the Markdown it was written in, or Plain Markdown to see the fallback rendering. The playground validates as you type if you want to break things yourself.