Features
SchemaX is the whole pipeline. Scan, propose, review, deploy, verify. Built so a single product person can ship, and a hundred-person team can govern.
Site scanner
Crawl what an agent can read today: pages, entities, existing schema, and the gaps between them.
Visual editor for every field of every entity. Diff before you ship.
Suggestions carry the page snippet they're drawn from. Grounded facts publish automatically; uncertain ones wait for you.
A single script tag publishes approved schema without a rebuild or CMS migration.
Published schema is checked against the live page so stale or unsupported fields are flagged.
When content changes and schema no longer matches, you find out before the mismatch compounds.
Every change is a version. Roll back in one click. Audit trail forever.
See page status, approvals, published versions, and drift alerts from one operational view.
Use the managed injector, GTM export, or developer-ready schema output when your stack needs it.
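The verification and drift checks in the list above come down to comparing published field values against what the live page currently says. The sketch below is a hypothetical illustration, not SchemaX's implementation: the `find_drift` name, the flat field dictionary, and the case-and-whitespace normalization are all assumptions made for the example.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so cosmetic changes do not count as drift."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_drift(published: dict[str, str], page_text: str) -> list[str]:
    """Return the names of published fields whose value no longer appears on the page.

    Hypothetical helper: a real checker would need type-aware matching
    (numbers, dates, units) rather than a plain substring test.
    """
    haystack = normalize(page_text)
    return [
        name
        for name, value in published.items()
        if normalize(value) not in haystack
    ]


# Illustrative data only: the page now shows a different price than the schema.
published = {"name": "Wireless Lamp", "price": "49.00"}
live_page = "Wireless   Lamp\nNow only 39.00 while stocks last"

print(find_drift(published, live_page))  # ['price']
```

A substring test is deliberately conservative: it can only confirm values that appear verbatim, which matches the idea of flagging anything the page no longer supports.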
Why grounded schema
These examples show the gap between page content and machine-readable markup: values stated incorrectly, visible facts omitted, and the grounded schema SchemaX is built to publish instead. No invented prices. No borrowed facts.
BBC Good Food
bbcgoodfood.com/recipes/easy-pancakes
The markup snapshot restates salt as sodium, a different nutrient, with a 1000x unit error. The visible rating is not declared.
{
"@type": "Recipe",
"name": "Easy pancakes",
"nutrition": {
"@type": "NutritionInformation",
"sodiumContent": "0.1 milligram of sodium"
}
// no aggregateRating
// no fiberContent / servingSize
}
We keep the salt figure verbatim, drop the fabricated sodium, and add the rating that was visible all along.
{
"@type": "Recipe",
"name": "Easy pancakes",
"aggregateRating": {
"@type": "AggregateRating",
"ratingValue": "4.1",
"ratingCount": "876"
},
"nutrition": {
"@type": "NutritionInformation",
"fiberContent": "0 g",
"servingSize": "per pancake"
}
}
What you’d gain
Recover eligibility signals for the rating that was on the page but absent from the machine layer.
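The salt-versus-sodium problem in this example is easy to check with arithmetic. The sketch below assumes the common labelling convention that salt is sodium multiplied by 2.5, and it uses a hypothetical on-page salt figure of 0.1 g, since the page's actual value is not quoted here. Only the declared "0.1 milligram of sodium" comes from the markup snapshot.

```python
# Assumption: labelling convention where salt = sodium x 2.5.
SALT_TO_SODIUM = 1 / 2.5
MG_PER_G = 1000


def salt_g_to_sodium_mg(salt_g: float) -> float:
    """Convert grams of salt to the approximate milligrams of sodium it contains."""
    return salt_g * SALT_TO_SODIUM * MG_PER_G


salt_on_page_g = 0.1      # hypothetical visible figure, not quoted from the recipe page
declared_sodium_mg = 0.1  # "0.1 milligram of sodium" from the markup snapshot

implied_sodium_mg = salt_g_to_sodium_mg(salt_on_page_g)
print(f"implied sodium:  {implied_sodium_mg:.0f} mg")
print(f"declared sodium: {declared_sodium_mg} mg")

# Copying the salt number and relabelling grams as milligrams shrinks it 1000x,
# before even accounting for salt and sodium being different quantities.
unit_error = salt_on_page_g * MG_PER_G / declared_sodium_mg
print(f"unit error alone: {unit_error:.0f}x")
```

Under these assumptions, the grounded options are to publish the salt figure as stated or to publish nothing, rather than converting it silently.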
Open Food Facts / Nutella
world.openfoodfacts.org/product/.../nutella-ferrero
The JSON-LD snapshot describes the website itself. Nothing tells an agent what the product is, despite a full fact sheet on screen.
{
"@type": "WebSite",
"name": "Open Food Facts",
"url": "https://world.openfoodfacts.org"
// describes the site, not Nutella
// no Product node at all
// no gtin / brand / nutrition
}
We build the Product that was missing, with every field grounded against the visible fact sheet.
{
"@type": "Product",
"name": "Nutella",
"gtin13": "3017620422003",
"brand": { "@type": "Brand", "name": "Nutella" },
"manufacturer": "Ferrero France Commerciale",
"additionalProperty": [
{ "@type": "PropertyValue", "name": "Nutri-Score", "value": "E" },
{ "@type": "PropertyValue", "name": "NOVA group", "value": "4" }
]
}
What you’d gain
Make the product itself machine-readable, not just the website around it.
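A gap like this one can be spotted by listing every `@type` the page's JSON-LD declares and checking whether the expected entity is among them. The sketch below is a minimal illustration: `collect_types` is a hypothetical helper, and the snapshot is the WebSite node shown above with a `@context` added so that it parses as ordinary JSON-LD.

```python
import json


def collect_types(node) -> set[str]:
    """Recursively gather every @type value in a parsed JSON-LD document."""
    found: set[str] = set()
    if isinstance(node, dict):
        declared = node.get("@type", [])
        # @type may be a single string or a list of strings.
        found.update([declared] if isinstance(declared, str) else declared)
        for value in node.values():
            found |= collect_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_types(item)
    return found


snapshot = json.loads("""
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Open Food Facts",
  "url": "https://world.openfoodfacts.org"
}
""")

types = collect_types(snapshot)
print(sorted(types))       # ['WebSite']
print("Product" in types)  # False
```

Because the walk is recursive, it also covers nodes nested under `@graph` or inside other properties.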
The Guardian
theguardian.com/.../brexit-bus-economy-vote
The Guardian's markup is accurate, but it types an opinion piece as a generic NewsArticle and omits the description, section, and author's role, all of which are visible on the page.
{
"@type": "NewsArticle",
"headline": "The Brexit bus...",
"author": { "@type": "Person", "name": "..." },
"datePublished": "2024-..."
// no description (standfirst)
// no articleSection: Opinion
// no author jobTitle / inLanguage
}
We promote it to the specific type, fold the standfirst into a description, and add the section, language, and the author's stated role.
{
"@type": "OpinionNewsArticle",
"headline": "The Brexit bus...",
"description": "<standfirst, verbatim>",
"articleSection": "Opinion",
"inLanguage": "en-GB",
"author": {
"@type": "Person", "name": "...",
"jobTitle": "Guardian columnist"
}
}
What you’d gain
Let an agent read the page as opinion, in English, by a named columnist, not as anonymous generic news.
Cloudflare CDN
cloudflare.com/products/cdn/
The markup snapshot contains Organization, WebSite, and WebPage nodes. Nothing describes the CDN service. A naive generator could invent plan tiers that do not appear on this page.
{
"@type": "WebPage",
"name": "Cloudflare CDN",
"isPartOf": { "@type": "WebSite", "...": "..." }
// no Service node for the CDN itself
// no serviceType / provider / areaServed
// tempting autofill: hasOfferCatalog
// [Free, Pro, Business, Enterprise] - not on page
}
We emit the Service that anchors the page, keep only grounded properties, and refuse the price tiers that are not on it.
{
"@type": "Service",
"serviceType": "Content Delivery Network",
"provider": { "@type": "Organization", "name": "Cloudflare" },
"areaServed": "Worldwide",
"offers": {
"@type": "Offer", "price": "0",
"url": "https://dash.cloudflare.com/sign-up"
}
// no invented tier catalog
}
What you’d gain
Make the service machine-readable without publishing a price list the page never claimed.
Policy-first
Generative tools that just push raw schema make a mess. SchemaX runs deterministic extraction first, auto-publishes only what it can ground, and surfaces the uncertain or sensitive claims for you. AI commits the safe 80%; you decide the rest, and you can pause everything.
{
"@type": "Product",
"name": "Wireless Lamp",
"offers": {
"@type": "Offer",
- "price": "$49",
+ "price": "49.00",
+ "priceCurrency": "USD",
+ "availability": "https://schema.org/InStock"
}
}
// grounded in /products/lamp · no invented values
Set once, approved by you. Every agent on every platform reads the same verified facts: in your pages, in your /.well-known/ucp manifest, and through your website's agent app.
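The policy-first split between auto-published and held-back claims can be pictured as a small triage rule. The sketch below is hypothetical and not SchemaX's actual logic: the `Suggestion` shape, the `SENSITIVE_FIELDS` set, and the `auto_publish_enabled` pause switch are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    name: str     # schema property being suggested
    value: str    # proposed value
    snippet: str  # page text the value was drawn from; empty if none was found


# Assumption: which fields count as sensitive is a per-site policy choice.
SENSITIVE_FIELDS = {"price", "availability"}


def triage(suggestion: Suggestion, auto_publish_enabled: bool = True) -> str:
    """Return 'publish' or 'review' for one suggested field."""
    if not auto_publish_enabled:  # the "pause everything" switch
        return "review"
    # Grounded means the value appears verbatim in the snippet it cites.
    grounded = bool(suggestion.snippet) and suggestion.value in suggestion.snippet
    if not grounded or suggestion.name in SENSITIVE_FIELDS:
        return "review"
    return "publish"


# Illustrative suggestions only; none of these values come from a real page.
suggestions = [
    Suggestion("name", "Wireless Lamp", "Wireless Lamp, rechargeable"),
    Suggestion("price", "49.00", "Price: 49.00 USD"),
    Suggestion("color", "Blue", ""),
]

for s in suggestions:
    print(s.name, "->", triage(s))
```

Here the name is grounded and low-risk, so it publishes; the price is grounded but sensitive, and the color has no supporting snippet, so both wait for a person.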