ORIGINAL RESEARCH
The HTML Myth in AI Visibility
I had a clean theory: enterprise sites that score well on AI visibility must have cleaner HTML than the ones that don't. So I built a scanner and ran it across 492 enterprise sites I'd already benchmarked. The correlation was -0.007.
492
enterprise sites tested
-0.007
correlation (Pearson r)
0.1pt
top vs bottom A11y delta
77.0
average A11y floor
THE HEADLINE
Clean HTML does not predict AI citation.
Top quartile by AI visibility score (n=123): accessibility score 77.3.
Bottom quartile by AI visibility score (n=123): accessibility score 77.1.
A 0.1-point gap. Out of 100. Across 246 sites. The Pearson correlation across the full 492-site sample was −0.007 — statistician for “you imagined it.”
Klaviyo, Monday, Braze, Zendesk, Yotpo all sit between 73 and 91 on accessibility. Some are heavily AI-cited. Some are completely invisible. The HTML doesn't predict it either way.
The lazy AI-SEO advice currently doing the rounds — “fix your semantic HTML and the LLMs will find you” — does not survive contact with 492 enterprise sites.
THE DATA
Top quartile vs bottom quartile
- Top quartile by AI score (n=123): accessibility 77.3
- Bottom quartile by AI score (n=123): accessibility 77.1
What this means:
AI visibility varies wildly across enterprise sites — from 2 to 97 out of 100. Accessibility does not. It clusters tightly around 77/100. Almost every modern enterprise site already has competent semantic HTML. The variance in AI citation is being driven by something the front-end markup cannot see.
BOTH SIDES OF THE LINE
Sites with great accessibility live on both sides of the AI line
If clean HTML predicted AI citation, the high-accessibility names would cluster on one side. They don't. Pick any 5 from the top end of the A11y scale and you find some heavily cited and others completely invisible.
AI-CITED · HIGH A11Y
- Zapier · A11y 100
- ZoomInfo · A11y 99
- Forsters · A11y 96
- Adyen · A11y 95
- Avanade · A11y 95
AI-INVISIBLE · HIGH A11Y
- Klaviyo · A11y 78 · AI 2/100
- Monday.com · A11y 73 · AI 2/100
- Braze · A11y 78 · AI 2/100
- Segment · A11y 87 · AI 2/100
- Zendesk · A11y 91 · AI 2/100
PER-AXIS AVERAGES
Where enterprise sites still fall short
These are real accessibility findings. None of them predicts AI citation, but several represent failures of basic inclusive design across hundreds of enterprise sites.
Skip-to-content links: 1.6/5
Most enterprise sites are missing them entirely. This is the clearest accessibility failure across the dataset — and the one with the lowest fix cost. Worth noting: it still does not change your AI visibility.
KEY INSIGHTS
What the data tells us
The accessibility floor is high — and uniform
Across 492 enterprise sites the standard deviation on accessibility was just 13.2 points (mean 77.0). Almost every modern enterprise site has competent markup. Most of the differentiation people obsess over does not exist at the homepage HTML level.
AI visibility variance is enormous — and unrelated
AI visibility ranged from 2 to 97 out of 100 (sd 28.4). The two metrics move independently. Whatever drives citation is not visible in the front-end source code.
Named cases on both sides break the narrative
Klaviyo, Monday, Braze, Segment and Zendesk all sit between 73 and 91 on accessibility while scoring 2/100 on AI visibility. Forsters and Adyen sit at 95-96 on accessibility AND score 96-97 on AI. Same markup quality, opposite citation outcomes.
Per-axis weaknesses are real but unrelated to AI
Skip-to-content links (1.6/5), alt text (12.6/20), and landmark structure (10.7/15) are genuinely weak across the enterprise sample. Worth fixing for human users. Not worth doing in the name of AI visibility.
If it isn't the HTML, what is it?
The next study tests four candidate predictors that operate outside the markup: Wikipedia presence, third-party press mentions, schema.org coverage, and domain age. Working hypothesis: external citation graph and entity authority dominate. Findings to follow after the May benchmark.
CITE THIS RESEARCH
Stats you can use
All stats from this study. Link to this page as your source.
−0.007
Pearson correlation between AI visibility and accessibility across 492 enterprise sites
0.1pt
accessibility gap between top-quartile and bottom-quartile AI-cited sites (n=246)
77.3
top-quartile AI sites' accessibility score (out of 100)
77.1
bottom-quartile AI sites' accessibility score (out of 100)
77.0
average accessibility score across all 492 enterprise sites — the floor
1.6/5
average skip-to-content link score — the weakest accessibility axis
12.6/20
average alt-text coverage — consistent gap across all sectors
492 / 524
valid scans from 524 unique domains in the registry
METHODOLOGY
How we ran this study
Sample
524 unique enterprise domains drawn from the GTM Signal Studio AI Visibility Registry — a unified canonical list pulled from three editions of the Enterprise Benchmark (March, April, May 2026) plus the UK Law Firms 2026 spinoff. 492 returned valid HTML and were included in the correlation. 32 returned errors or empty bodies (typically heavy SPAs).
Accessibility scan
Custom Python scanner running BeautifulSoup over the homepage HTML. Scored 9 axes for a total of 100: lang attribute (5), document title (5), heading hierarchy (20), alt text coverage (20), ARIA landmarks (15), link text quality (15), form labels (10), skip-to-content link (5), and button accessible name (5). Scoring is consistent with how axe and Lighthouse evaluate these signals.
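For concreteness, here is a minimal sketch of the kind of check the scanner runs, using requests plus BeautifulSoup. The axis weights match the rubric above; the pass/fail heuristics inside each check are simplified assumptions rather than the production scoring logic, and four of the nine axes are omitted for brevity.

```python
# Sketch of a homepage accessibility scan in the spirit of the scanner
# described above. Axis weights follow the stated rubric; the heuristics
# are simplified assumptions, not the production logic.
import requests
from bs4 import BeautifulSoup

def scan_homepage(url: str) -> dict:
    html = requests.get(url, timeout=15).text
    soup = BeautifulSoup(html, "html.parser")
    scores = {}

    # lang attribute (5 pts): <html lang="..."> present and non-empty
    scores["lang"] = 5 if soup.html is not None and soup.html.get("lang") else 0

    # document title (5 pts): non-empty <title>
    scores["title"] = 5 if soup.title and soup.title.get_text(strip=True) else 0

    # alt text coverage (20 pts): share of <img> tags carrying an alt attribute
    imgs = soup.find_all("img")
    with_alt = sum(1 for img in imgs if img.get("alt") is not None)
    scores["alt_text"] = round(20 * with_alt / len(imgs)) if imgs else 20

    # ARIA landmarks (15 pts): simple presence check for the big four
    landmarks = ("main", "nav", "header", "footer")
    scores["landmarks"] = round(15 * sum(1 for t in landmarks if soup.find(t)) / 4)

    # skip-to-content link (5 pts): an in-page anchor whose text mentions "skip"
    scores["skip_link"] = 5 if any(
        a.get("href", "").startswith("#") and "skip" in a.get_text(strip=True).lower()
        for a in soup.find_all("a")
    ) else 0

    # heading hierarchy, link text, form labels, and button names omitted
    scores["total"] = sum(scores.values())
    return scores
```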
AI visibility score
Pulled from the existing benchmark dataset. Each site was already scored 0-100 across Citation Presence, Entity Recognition, Content Structure, and Citation Breadth using Scanner v2.0 (multi-API: OpenAI, Gemini, Brave, Tavily). Most recent scan per domain was used.
Correlation
Pearson r between AI total score and accessibility total score across 492 paired observations. Computed in pure Python — no statistical package, no preprocessing, no outlier removal. Quartile means computed by sorting on AI score and taking the top and bottom 25%.
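The computation is simple enough to reproduce. A sketch of the pure-Python version, where `pairs` is assumed to be a list of (ai_score, a11y_score) tuples, one per valid scan:

```python
# Pure-Python Pearson r, population standard deviation, and quartile
# means. No statistical package, matching the description above.
import math

def mean(xs):
    return sum(xs) / len(xs)

def stdev(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def quartile_a11y_means(pairs):
    # Sort on AI score, then average the accessibility scores of the
    # bottom and top 25% of sites.
    ranked = sorted(pairs, key=lambda p: p[0])
    q = len(ranked) // 4
    bottom = mean([a11y for _, a11y in ranked[:q]])
    top = mean([a11y for _, a11y in ranked[-q:]])
    return top, bottom
```

With 492 pairs, `len(ranked) // 4` gives exactly the 123-site quartiles quoted above.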
Limitations
Homepage HTML only: JS-rendered SPAs may score artificially low on accessibility for the wrong reason. The scanner does not run Playwright, so we are measuring static markup quality, not the full rendered DOM. The correlation conclusion is robust to this caveat because the effect we're testing is so close to zero (−0.007) that no plausible JS-rendering correction could move it into significance.
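For anyone who wants to close that gap, the rendered-DOM rerun is a small change: fetch each page through a headless browser and feed the rendered HTML to the same scorer. A sketch using Playwright's sync API, which, to be clear, this study did not run:

```python
# Hypothetical rendered-DOM variant (not run in this study): render the
# page in headless Chromium, then score the rendered HTML instead of the
# static source. Assumes the scorer accepts a raw HTML string.
from playwright.sync_api import sync_playwright

def fetch_rendered_html(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")  # let the SPA hydrate
        html = page.content()  # full rendered DOM, not the static source
        browser.close()
    return html
```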
If it isn't the HTML, what is it?
The AI Visibility Audit tests the things that actually move the needle — citation presence, entity recognition, third-party authority — not a markup checklist anyone could run themselves.