Automating Knowledge Graphs from MDX
The goal of the Knowledge Graph system for campa.dev wasn't to generate JSON-LD. It was to prevent content and structured data from diverging. The data already existed in the frontmatter, validated by Zod 4; the problem was connecting it to Schema.org without duplicating what was already there.
A transformer solved it.

import type { CollectionEntry } from "astro:content";

// Maps validated frontmatter to a Schema.org TechArticle object.
export function generateArticleSchema(entry: CollectionEntry<"blog">) {
  return {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    headline: entry.data.title,
    // Prefer the AI summary when present; fall back to the description.
    description: entry.data.aiSummary ?? entry.data.description,
    datePublished: entry.data.pubDate.toISOString(),
    // An entry that was never updated reuses its publication date.
    dateModified:
      entry.data.updatedDate?.toISOString() ?? entry.data.pubDate.toISOString(),
    author: { "@type": "Person", name: entry.data.author },
    // Optional field: a missing geoEntities yields an empty array.
    mentions:
      entry.data.geoEntities?.map((name) => ({
        "@type": "Thing",
        name,
      })) ?? [],
    keywords: entry.data.tags.join(", "),
  };
}

geoEntities[] from the frontmatter becomes mentions: [{ @type: Thing, name }]. mentions gives explicit semantic signals to crawlers and entity-based retrieval systems. The transformer doesn't validate anything; Zod already did that at build time. If the schema passes, the JSON-LD is structurally correct.
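
To make the mapping concrete, here is a minimal sketch that runs outside Astro. The BlogData interface is a hypothetical stand-in for the real collection type (only the field names come from the transformer above), and every sample value is invented for illustration.

```typescript
// Hypothetical stand-in for CollectionEntry<"blog">["data"].
// Field names mirror the transformer; the real Zod schema may differ.
interface BlogData {
  title: string;
  description: string;
  aiSummary?: string;
  pubDate: Date;
  updatedDate?: Date;
  author: string;
  geoEntities?: string[];
  tags: string[];
}

function generateArticleSchema(entry: { data: BlogData }) {
  const { data } = entry;
  return {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    headline: data.title,
    description: data.aiSummary ?? data.description,
    datePublished: data.pubDate.toISOString(),
    dateModified: (data.updatedDate ?? data.pubDate).toISOString(),
    author: { "@type": "Person", name: data.author },
    mentions:
      data.geoEntities?.map((name) => ({ "@type": "Thing", name })) ?? [],
    keywords: data.tags.join(", "),
  };
}

// Invented sample entry: no aiSummary and no updatedDate, so both fallbacks apply.
const sample: { data: BlogData } = {
  data: {
    title: "Sample post",
    description: "A placeholder description.",
    pubDate: new Date("2025-01-15T00:00:00.000Z"),
    author: "Sample Author",
    geoEntities: ["Astro", "Zod"],
    tags: ["json-ld", "mdx"],
  },
};

const schema = generateArticleSchema(sample);
console.log(schema.mentions);
// [ { '@type': 'Thing', name: 'Astro' }, { '@type': 'Thing', name: 'Zod' } ]
console.log(schema.dateModified === schema.datePublished); // true
```

Each frontmatter string in geoEntities turns into one Thing, and the missing optional fields fall back without any extra branching.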
To inject it in the layout:

---
const schema = generateArticleSchema(entry);
---
<script type="application/ld+json" set:html={JSON.stringify(schema)} />

Result: zero desynchronizations between frontmatter and JSON-LD. Building the transformer on day one of campa.dev is exactly why structured data has never needed "fixing": the build generates it from frontmatter that Zod has already validated, so the two can't diverge.
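
One detail of the injection step may deserve a guard. set:html writes the string without escaping it, and JSON.stringify leaves "<" untouched, so a title containing "</script>" could close the tag early. The sketch below is an assumption, not part of the campa.dev layout: serializeJsonLd is a hypothetical helper name, and the sample headline is invented.

```typescript
// Hypothetical helper: serializes a JSON-LD object for embedding in a <script> tag.
function serializeJsonLd(schema: unknown): string {
  // In JSON.stringify output, "<" can only appear inside string values,
  // so replacing it with the \u003c escape keeps the JSON equivalent.
  return JSON.stringify(schema).replace(/</g, "\\u003c");
}

// Invented example of a field value that would otherwise break the markup.
const risky = { headline: "Closing </script> tags in MDX" };
const out = serializeJsonLd(risky);

console.log(out.includes("</script>")); // false
console.log(JSON.parse(out).headline === risky.headline); // true
```

If adopted, the layout would pass serializeJsonLd(schema) to set:html in place of JSON.stringify(schema); parsers still read the same values.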
Trade-off: every optional field in the Zod schema (like geoEntities) needs the ?. operator in the transformer; otherwise the build breaks on an undefined value. Not complex, but a maintenance point that grows with the schema.
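
A small sketch of that failure mode, under assumptions: Frontmatter is a hypothetical one-field type, and the non-null assertion stands in for a forgotten ?. so the example compiles. With strict TypeScript settings, omitting ?. would more likely surface as a compile error before the runtime exception shown here.

```typescript
// Hypothetical frontmatter shape with one optional field, as in the post.
interface Frontmatter {
  geoEntities?: string[];
}

// Simulates the forgotten ?. operator: the ! only silences the compiler.
function mentionsUnsafe(data: Frontmatter) {
  return data.geoEntities!.map((name) => ({ "@type": "Thing", name }));
}

// The pattern the transformer uses: ?. plus ?? collapses a missing field to [].
function mentionsSafe(data: Frontmatter) {
  return data.geoEntities?.map((name) => ({ "@type": "Thing", name })) ?? [];
}

const withoutEntities: Frontmatter = {};

let threw = false;
try {
  mentionsUnsafe(withoutEntities);
} catch (err) {
  // Calling .map on undefined raises a TypeError.
  threw = err instanceof TypeError;
}

console.log(threw); // true
console.log(mentionsSafe(withoutEntities)); // []
```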
When the Knowledge Graph derives from a typed schema, structured data stops being maintenance and becomes compilation.