Automating Knowledge Graphs from MDX

The goal of the Knowledge Graph system for campa.dev wasn’t to generate JSON-LD. It was to prevent content and structured data from diverging. The data already existed in the frontmatter, validated by Zod 4; the problem was connecting it to Schema.org without duplicating it.

A transformer solved it.

src/utils/schema.ts

import type { CollectionEntry } from "astro:content";

export function generateArticleSchema(entry: CollectionEntry<"blog">) {
  return {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    headline: entry.data.title,
    description: entry.data.aiSummary ?? entry.data.description,
    datePublished: entry.data.pubDate.toISOString(),
    dateModified:
      entry.data.updatedDate?.toISOString() ?? entry.data.pubDate.toISOString(),
    author: { "@type": "Person", name: entry.data.author },
    mentions:
      entry.data.geoEntities?.map((name) => ({
        "@type": "Thing",
        name,
      })) ?? [],
    keywords: entry.data.tags.join(", "),
  };
}

geoEntities[] from the frontmatter becomes mentions: [{ "@type": "Thing", name }]. The mentions property gives explicit semantic signals to crawlers and entity-based retrieval systems. The transformer doesn’t validate anything; Zod already did that at build time. If the frontmatter passes the Zod schema, the JSON-LD is structurally correct.
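To make the mapping concrete, here is a minimal sketch that isolates the geoEntities-to-mentions step. The helper name toMentions, the ThingNode type, and the sample entity names are assumptions made for illustration; the transformer above does the same work inline.

```typescript
// Stand-in for the JSON-LD node emitted for each entity name.
type ThingNode = { "@type": "Thing"; name: string };

// Hypothetical helper: isolates the geoEntities -> mentions mapping.
function toMentions(geoEntities?: string[]): ThingNode[] {
  // Optional chaining covers posts without geoEntities; ?? supplies the empty list.
  return geoEntities?.map((name): ThingNode => ({ "@type": "Thing", name })) ?? [];
}

// Sample frontmatter values, invented for illustration.
const withEntities = toMentions(["Astro", "Schema.org"]);
const withoutEntities = toMentions(undefined);

console.log(JSON.stringify(withEntities));
// [{"@type":"Thing","name":"Astro"},{"@type":"Thing","name":"Schema.org"}]
console.log(JSON.stringify(withoutEntities));
// []
```

A post that declares no geoEntities still produces a valid mentions array, just an empty one.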

To inject it in the layout:

src/layouts/BlogPost.astro

---
const schema = generateArticleSchema(entry);
---
<script type="application/ld+json" set:html={JSON.stringify(schema)} />
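One detail that may be worth handling at this step: a frontmatter string containing a literal closing script tag would end the inline block early. The sketch below shows one common way to guard against that. The helper name serializeJsonLd and the sample headline are assumptions for illustration and are not part of the campa.dev code.

```typescript
// Hypothetical helper: serialize JSON-LD for embedding in an inline <script>.
function serializeJsonLd(schema: unknown): string {
  // "<" is written as the JSON escape \u003c, which parsers decode back to "<",
  // so a "</script>" inside a field cannot close the tag early.
  return JSON.stringify(schema).replace(/</g, "\\u003c");
}

// Sample data, invented for illustration.
const sample = { "@type": "TechArticle", headline: "Why </script> in a title is risky" };
const safe = serializeJsonLd(sample);

console.log(safe.includes("</script>")); // false
console.log(JSON.parse(safe).headline === sample.headline); // true
```

The escaped string parses back to the same data, so consumers of the JSON-LD see no difference.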

Result: zero desynchronizations between frontmatter and JSON-LD. Building the transformer on day one of campa.dev is exactly why structured data has never needed “fixing”: Zod validates the frontmatter, the build generates the JSON-LD from it, and the two can’t diverge.

Trade-off: every optional field in the Zod schema (like geoEntities) needs optional chaining (?.) in the transformer; otherwise the build breaks with a TypeError as soon as an entry omits that field. Not complex, but a maintenance point that grows with the schema.
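One way to keep that maintenance point small is to route every optional field through a single helper. The sketch below is an alternative, not what the transformer above does: mapOptional, buildPartialSchema, and the Frontmatter type are hypothetical names, and unlike the original it omits missing fields instead of falling back to pubDate or an empty array. It relies on JSON.stringify dropping properties whose value is undefined.

```typescript
// Hypothetical helper: apply fn only when the optional field is present.
function mapOptional<T, U>(value: T | undefined, fn: (v: T) => U): U | undefined {
  return value === undefined ? undefined : fn(value);
}

// Invented frontmatter shape with one required and two optional fields.
type Frontmatter = { pubDate: Date; updatedDate?: Date; geoEntities?: string[] };

function buildPartialSchema(data: Frontmatter) {
  return {
    datePublished: data.pubDate.toISOString(),
    // Left undefined when updatedDate is missing; no fallback is applied here.
    dateModified: mapOptional(data.updatedDate, (d) => d.toISOString()),
    mentions: mapOptional(data.geoEntities, (names) =>
      names.map((name) => ({ "@type": "Thing", name })),
    ),
  };
}

const minimal = buildPartialSchema({ pubDate: new Date("2025-01-15T00:00:00Z") });
// JSON.stringify omits properties whose value is undefined.
const serialized = JSON.stringify(minimal);
console.log(serialized); // {"datePublished":"2025-01-15T00:00:00.000Z"}
```

Whether omitting a field or defaulting it is preferable depends on what consumes the JSON-LD; the explicit fallbacks in the original transformer keep dateModified and mentions always present.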

When the Knowledge Graph derives from a typed schema, structured data stops being maintenance and becomes compilation.
