GEO Score Specification v1.0 — Open Standard for AI Visibility Scoring

Section 1

Overview

1.1 What GEO Score Measures

The GEO (Generative Engine Optimization) Score quantifies a domain's AI citation probability — the likelihood that the domain's content will appear as a source, reference, or citation in a response generated by a major AI engine.

A high GEO Score does not guarantee citations. It reflects the degree to which a domain's content is structured, dense, and formatted in ways that AI engines demonstrably prefer when constructing factual responses.

GEO Score is distinct from traditional SEO metrics. A domain may rank #1 on Google while scoring poorly on GEO, and vice versa. The signals are complementary but different in nature: SEO rewards link authority; GEO rewards answer-ability.

1.2 Score Range & Calibration

Score Range	Tier	Interpretation
85 – 100	Platinum	High citation probability — content is deeply structured, factual, and authority-dense
75 – 84	Gold	Strong AI visibility — most signals present; may lack depth in one or two dimensions
60 – 74	Silver	Moderate visibility — notable gaps in FAQ structure, factual density, or authority signals
0 – 59	Bronze / Unrated	Low citation probability — content insufficiently structured for AI engine preference
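
The tier boundaries in the table can be expressed as a small lookup. This is a minimal sketch: `tierFor` is a hypothetical helper name, and treating fractional scores by simple threshold comparison is an assumption, since the table only lists whole-number ranges.

```javascript
// Map a composite GEO Score (0–100) to the tier names used in the table.
// `tierFor` is a hypothetical helper, not part of the reference implementation.
function tierFor(score) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 100) {
    throw new RangeError('score must be a number between 0 and 100');
  }
  if (score >= 85) return 'Platinum';
  if (score >= 75) return 'Gold';
  if (score >= 60) return 'Silver';
  return 'Bronze / Unrated';
}
```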

Scores are computed from multi-page content crawls. Up to 6 pages per domain are fetched and scored independently; the final domain GEO Score is the weighted mean of per-page scores, with the homepage weighted 2× relative to inner pages.
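
A sketch of the domain-level aggregation described above. The input shape (`score`, `isHomepage`) and the function name are assumptions made for illustration; the sketch also assumes the first six pages in the input are the ones that were crawled, and it leaves the result unrounded.

```javascript
// Weighted mean of per-page scores: the homepage counts twice, inner pages once.
// `pages` is a hypothetical input shape: [{ score: number, isHomepage?: boolean }].
function domainScore(pages, maxPages = 6) {
  const sample = pages.slice(0, maxPages); // at most 6 pages per domain
  if (sample.length === 0) return null;    // nothing crawled, nothing to score

  let weightedSum = 0;
  let totalWeight = 0;
  for (const page of sample) {
    const weight = page.isHomepage ? 2 : 1;
    weightedSum += page.score * weight;
    totalWeight += weight;
  }
  return weightedSum / totalWeight;
}
```

With a homepage at 80 and two inner pages at 60, the homepage contributes half of the total weight (2 of 4), so the result sits midway between the two values.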

Calibration target: a score of 75 should correspond to a domain that appears in roughly one-quarter of relevant AI engine responses, based on monthly sampling across a panel of 500+ seed queries.

1.3 Covered AI Engines

Version 1.0 covers the following AI engines. Each receives a derived per-engine proxy score in addition to the composite GEO Score.

Engine	Description	Citation Profile
ChatGPT	OpenAI GPT-4 / GPT-4o family. Favors FAQ structure, definitional clarity, and authority cues.	FAQ-first
Perplexity	Citation-native engine. Strongly rewards statistical density and factual specificity.	Citation-native
Claude	Anthropic. Prefers depth, readability, and well-structured long-form content.	Depth-first
Gemini	Google DeepMind. Strong preference for structured data (JSON-LD), entities, and schema markup.	Schema-aware
Grok	xAI. Rewards timeliness signals, statistics, and trending-angle content.	Recency-weighted

Section 2

Scoring Signals

The composite GEO Score is the sum of nine independent signal scores, each contributing a defined maximum number of points. The total maximum is 100 points.

2.1 Signal Weights Summary

#	Signal	Max Points	Weight
S1	FAQ / Q&A Structure	20	20%
S2	Numbered Lists & Steps	15	15%
S3	Statistics & Numbers	15	15%
S4	Header Structure	10	10%
S5	Definition Language	10	10%
S6	Content Depth (Word Count)	10	10%
S7	Keyword Signal (neutral)	10	10%
S8	Authoritative Language	10	10%
S9	Factual Density	10	10%

2.2 S1 — FAQ / Q&A Structure (0–20)

AI engines are optimized to answer questions. Content that explicitly contains question–answer pairs is more likely to be retrieved and cited.

Signal	Points
Each question sentence (contains `?`)	+2 per occurrence (max 20)
Explicit FAQ section heading (`faq`, `frequently asked`, `questions and answers`)	+5 per match
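
A minimal sketch of the S1 rules, not the reference implementation. The function name and the regular expressions are assumptions; in particular, a run of question marks is counted as one question sentence, and the overall cap of 20 follows the `min(20, …)` form used in the worked example in Section 6.

```javascript
// S1 sketch: +2 per question sentence, +5 per FAQ heading phrase, capped at 20.
// `scoreFaq` and its patterns are illustrative assumptions.
function scoreFaq(text) {
  // A run of "?" closes one question sentence, so "Really??" counts once.
  const questions = (text.match(/\?+/g) || []).length;
  const faqPhrases = (text.match(/\bfaqs?\b|frequently asked|questions and answers/gi) || []).length;
  return Math.min(20, questions * 2 + faqPhrases * 5);
}
```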

2.3 S2 — Numbered Lists & Steps (0–15)

Step-by-step instructional content is highly citeable — engines return it verbatim for "how to" queries.

Signal	Points
HTML ordered list tag (`<ol>`, `<li>`)	+1 per occurrence
Markdown-style numbered list (`1.`, `2.`…)	+2 per list item
How-to phrasing (`how to`, `step 1`, `step-by-step`)	+3 per match
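
The S2 table could be implemented along these lines. This is a sketch under stated assumptions: the function name and patterns are hypothetical, only opening `<ol>` and `<li>` tags are counted, and a Markdown list item is taken to be a line that starts with digits, a period, and whitespace.

```javascript
// S2 sketch: +1 per <ol>/<li> opening tag, +2 per Markdown numbered item,
// +3 per how-to phrase, capped at 15. Names and patterns are assumptions.
function scoreSteps(content) {
  const htmlTags = (content.match(/<(?:ol|li)\b/gi) || []).length;
  const markdownItems = (content.match(/^[ \t]*\d+\.[ \t]+/gm) || []).length;
  const howToPhrases = (content.match(/how to|step 1|step-by-step/gi) || []).length;
  return Math.min(15, htmlTags + markdownItems * 2 + howToPhrases * 3);
}
```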

2.4 S3 — Statistics & Numbers (0–15)

Quantified claims are the primary signal for Perplexity citations and a strong secondary signal for all engines.

Signal	Points
Percentage figures (`42%`, `0.5%`)	+3 per match
Year references (`2020`–`2029`, `199x`)	+1 per match
Large numbers (4+ digits: `1000`, `50000`)	+1 per match
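
A sketch of the S3 counting rules. The specification does not say whether a year also counts as a large number, so this version assumes it does not: any 4+ digit number that matches the year pattern is scored as a year only. Numbers written with thousands separators (such as `5,000`) are not handled here. All names are hypothetical.

```javascript
// S3 sketch: +3 per percentage, +1 per year (199x, 2020–2029),
// +1 per other number with 4+ digits, capped at 15.
function scoreStatistics(text) {
  const percentages = (text.match(/\d+(?:\.\d+)?%/g) || []).length;
  const longNumbers = text.match(/\b\d{4,}\b/g) || [];
  const isYear = (digits) => /^(?:199\d|202\d)$/.test(digits);
  const years = longNumbers.filter(isYear).length;
  const largeNumbers = longNumbers.length - years; // assumption: years are not double-counted
  return Math.min(15, percentages * 3 + years + largeNumbers);
}
```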

2.5 S4 — Header Structure (0–10)

Well-organized content with clear H2/H3 hierarchy signals that the page is navigable and scannable — a proxy for structured knowledge.

Signal	Points
H2 headings (`##` or `<h2>`)	+2 per heading
H3 headings (`###` or `<h3>`)	+1 per heading
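
One way to count headings for S4, offered as a sketch rather than the reference behavior. The function name and patterns are assumptions; Markdown headings are allowed leading spaces or tabs, and deeper levels (`####` and beyond) are ignored.

```javascript
// S4 sketch: +2 per H2, +1 per H3, capped at 10.
// The negative lookahead keeps "##" from matching the start of "###".
function scoreHeaders(content) {
  const h2 = (content.match(/^[ \t]*##(?!#)|<h2\b/gim) || []).length;
  const h3 = (content.match(/^[ \t]*###(?!#)|<h3\b/gim) || []).length;
  return Math.min(10, h2 * 2 + h3);
}
```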

2.6 S5 — Definition Language (0–10)

Explicit definitional constructs signal that the content is explaining concepts — a format AI engines actively retrieve for "what is" queries.

Trigger Phrase	Points
`is a` / `is an`	+2 per match
`refers to`	+2 per match
`defined as`	+2 per match
`means that`	+2 per match
`is defined`	+2 per match
`also known as`	+2 per match
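
A sketch of the S5 trigger-phrase count. Each phrase is matched separately, so a construction like "is defined as" counts for both `is defined` and `defined as`; whether the reference implementation does the same is not stated, so treat that as an assumption. Names are hypothetical.

```javascript
// S5 sketch: +2 per trigger-phrase match, capped at 10.
const DEFINITION_PATTERNS = [
  /\bis an?\b/gi,
  /\brefers to\b/gi,
  /\bdefined as\b/gi,
  /\bmeans that\b/gi,
  /\bis defined\b/gi,
  /\balso known as\b/gi,
];

function scoreDefinitions(text) {
  const matches = DEFINITION_PATTERNS.reduce(
    (count, pattern) => count + (text.match(pattern) || []).length,
    0,
  );
  return Math.min(10, matches * 2);
}
```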

2.7 S6 — Content Depth / Word Count (0–10)

Shallow content rarely surfaces in AI responses. Depth is a necessary (not sufficient) condition for citation.

Word Count	Score
≥ 2,000 words	10
1,500–1,999	8
1,200–1,499	6
800–1,199	4
500–799	2
< 500	0
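
The word-count tiers translate directly into a threshold ladder. In this sketch, words are counted by splitting on whitespace, which is an assumption; the function name is hypothetical.

```javascript
// S6 sketch: map a whitespace-delimited word count to the tier scores above.
function scoreWordCount(text) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  if (words >= 2000) return 10;
  if (words >= 1500) return 8;
  if (words >= 1200) return 6;
  if (words >= 800) return 4;
  if (words >= 500) return 2;
  return 0;
}
```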

2.8 S8 — Authoritative Language (0–10)

References to research, expert consensus, and institutional sources increase citation probability by signaling verifiability.

Signal Phrase	Points
`research shows`, `studies show`	+2 each
`according to`, `experts say`, `data shows`	+2 each
`survey found`, `published in`	+2 each
Named institutions (`Harvard`, `Stanford`, `MIT`)	+2 each
`peer-reviewed`	+2
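
A sketch of the S8 phrase matching. The institution list contains only the three names given as examples in the table, and they are matched case-sensitively so that `MIT` does not fire on words like "commit"; both choices are assumptions, as are the names used here.

```javascript
// S8 sketch: +2 per authority phrase or named institution, capped at 10.
const AUTHORITY_PATTERNS = [
  /\bresearch shows\b/gi,
  /\bstudies show\b/gi,
  /\baccording to\b/gi,
  /\bexperts say\b/gi,
  /\bdata shows\b/gi,
  /\bsurvey found\b/gi,
  /\bpublished in\b/gi,
  /\bpeer-reviewed\b/gi,
  /\b(?:Harvard|Stanford|MIT)\b/g, // case-sensitive on purpose
];

function scoreAuthority(text) {
  const matches = AUTHORITY_PATTERNS.reduce(
    (count, pattern) => count + (text.match(pattern) || []).length,
    0,
  );
  return Math.min(10, matches * 2);
}
```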

2.9 S9 — Factual Density (0–10)

Factual density counts the sentences that contain citable claims: percentages, monetary figures, large numbers, comparative statements, or attributed findings. Because the count is divided by 5, the score saturates at five citable sentences.

Citable Sentence Patterns	Examples
Contains a percentage	"Adoption grew 43% year over year"
Contains a monetary figure	"Average deal size is $12,400"
Contains a large number (4+ digits)	"Over 5,000 customers"
Comparative language	"faster than legacy approaches"
Attributed finding	"According to Gartner…"

Score = min(10, round(citableSentences / 5 × 10))
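
The formula above could be applied as follows. This is a sketch, not the reference implementation: the sentence splitter is deliberately naive, and the patterns chosen for each citable-sentence type (especially comparative language) are assumptions.

```javascript
// S9 sketch: count sentences matching at least one citable pattern,
// then apply min(10, round(citableSentences / 5 × 10)).
const CITABLE_PATTERNS = [
  /\d%/,                                   // percentage
  /[$€£]\s?\d/,                            // monetary figure
  /\b\d{1,3}(?:,\d{3})+\b|\b\d{4,}\b/,     // large number, with or without separators
  /\b\w+er than\b|\b(?:more|less) than\b/i, // comparative language
  /\baccording to\b/i,                     // attributed finding
];

function scoreFactualDensity(text) {
  // Naive split: whitespace that follows ".", "!" or "?".
  const sentences = text.split(/(?<=[.!?])\s+/).filter((s) => s.trim().length > 0);
  const citable = sentences.filter((sentence) =>
    CITABLE_PATTERNS.some((pattern) => pattern.test(sentence)),
  ).length;
  return Math.min(10, Math.round((citable / 5) * 10));
}
```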

Section 3

Per-Engine Methodology

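In addition to the composite GEO Score, per-engine proxy scores are computed from the same signal breakdown using engine-specific weightings. Each formula below yields a raw weighted sum; dividing that sum by its maximum attainable value and multiplying by 100 places every per-engine score on the 0–100 scale, as in the worked example in Section 6.2.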

3.1 ChatGPT Score

Citation profile: FAQ-first, definition-heavy, authority-conscious.

Formula
chatgpt = (S1_faq × 0.35) + (S5_definitions × 0.25) + (S8_authority × 0.25) + (S4_headers × 0.15)

Known biases: ChatGPT strongly favors domains that explicitly answer questions. A page without any FAQ structure or definition language will likely score below 30 on the ChatGPT proxy regardless of overall content quality.

3.2 Perplexity Score

Citation profile: Citation-native — requires hard numbers, statistics, and verifiable claims.

Formula
perplexity = (S3_statistics × 0.40) + (factualDensityNorm × 0.30) + (S2_steps × 0.15) + (S6_wordCount × 0.15)

Known biases: Perplexity disproportionately favors recent content. A page with 2023 statistics will often outrank a qualitatively stronger page with no date signals. Content freshness decay is tracked but not yet incorporated in v1.0 (planned for v1.1).

3.3 Claude Score

Citation profile: Depth-first — prefers comprehensive, well-structured long-form content.

Formula
claude = (S6_wordCount × 0.30) + (readabilityNorm × 0.30) + (S4_headers × 0.20) + (S5_definitions × 0.20)

Readability is approximated using a Flesch Reading Ease analog: sentence length variance, average syllable count proxy (avg word length), and paragraph density. The readability component is normalized to 0–30 before weighting.

Known biases: Claude's training emphasizes nuanced, cited reasoning. Pages that assert claims without explanation or context score poorly. Depth is the single strongest lever for improving Claude proxy scores.

3.4 Gemini Score

Citation profile: Schema-aware — structured data (JSON-LD) provides a significant boost.

Formula
gemini = (jsonLdBonus × 0.30) + (S3_statistics × 0.25) + (S4_headers × 0.25) + (S8_authority × 0.20)
  jsonLdBonus = 20 if page contains valid JSON-LD, else 0

Known biases: Gemini's integration with Google Search means it heavily weights Knowledge Graph signals and structured entity data. Domains without any JSON-LD schema markup forfeit the entire jsonLdBonus term: 6 weighted points (20 × 0.30) out of a raw maximum of 14.25, before any other signal is counted.
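
A sketch of how `jsonLdBonus` might be detected. The specification does not define "valid JSON-LD", so this version assumes it means a `<script type="application/ld+json">` block whose body parses as a JSON object or array; it does not check `@context` or any schema.org vocabulary. The function name mirrors the formula, and the regular expression is an illustrative assumption.

```javascript
// jsonLdBonus sketch: 20 if any ld+json script block parses as JSON, else 0.
function jsonLdBonus(html) {
  const scriptPattern =
    /<script\b[^>]*\btype\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;

  for (const match of html.matchAll(scriptPattern)) {
    try {
      const data = JSON.parse(match[1]);
      if (data !== null && typeof data === 'object') return 20;
    } catch {
      // Body is not parseable JSON: keep checking any remaining script blocks.
    }
  }
  return 0;
}
```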

3.5 Grok Score

Citation profile: Recency-weighted, statistics-driven, trending-angle content preferred.

Formula
grok = (S3_statistics × 0.40) + (S8_authority × 0.30) + (S1_faq × 0.30)

Known biases: Grok is trained with real-time X/Twitter data. Content that references trending discussions, recent events, or contains explicit date signals scores significantly higher. This bias is the primary factor not yet fully modeled in v1.0.

Confidence intervals: Per-engine scores are proxy estimates derived from content analysis. They are not based on direct query sampling of each AI engine. Correlation against sampled ground-truth citations is 0.73 (Pearson r) based on a validation set of 1,000 domains scored in Q1 2026.
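
The ChatGPT, Gemini, and Grok formulas above depend only on signals with a published maximum, so they can be sketched in one generic function. The weights are taken from Sections 3.1, 3.4, and 3.5 and the maxima from Section 2.1; the normalization (raw sum divided by the maximum attainable sum, times 100) follows the worked example in Section 6.2. The breakdown key names, the function name, and the final rounding are assumptions. Perplexity and Claude are left out because `factualDensityNorm` and `readabilityNorm` are not fully specified.

```javascript
// Maximum points per input signal (Section 2.1, plus the 20-point jsonLdBonus).
const MAX_POINTS = { faq: 20, statistics: 15, headers: 10, definitions: 10, authority: 10, jsonLd: 20 };

// Engine weightings copied from the formulas in Section 3.
const ENGINE_WEIGHTS = {
  chatgpt: { faq: 0.35, definitions: 0.25, authority: 0.25, headers: 0.15 },
  gemini: { jsonLd: 0.30, statistics: 0.25, headers: 0.25, authority: 0.20 },
  grok: { statistics: 0.40, authority: 0.30, faq: 0.30 },
};

// Weighted sum of the signals, normalized to 0–100 by the best possible sum.
function engineScore(engine, signals) {
  const weights = ENGINE_WEIGHTS[engine];
  if (!weights) throw new Error(`unknown engine: ${engine}`);

  let raw = 0;
  let max = 0;
  for (const [signal, weight] of Object.entries(weights)) {
    raw += (signals[signal] || 0) * weight; // missing signals count as 0
    max += MAX_POINTS[signal] * weight;
  }
  return Math.round((raw / max) * 100);
}
```

Keeping the weights in a table makes it straightforward to audit them against the published formulas or to adjust them when a later version of the specification changes an engine profile.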

Section 4

Vertical Classification

Every domain in the GEO Index is assigned a vertical. Verticals serve two purposes: (1) enable vertical-relative scoring and ranking, and (2) allow per-query category filtering in the public leaderboard.

4.1 Vertical Taxonomy

Version 1.0 recognizes 10 verticals:

Cybersecurity, AI Tools, SaaS, FinTech, DevTools, LegalTech, HealthTech, Ecommerce, MarTech, HRTech

Domains that do not clearly match any vertical are assigned to SaaS as the default.

4.2 Classification Algorithm

Vertical classification uses a multi-signal scoring approach across six evidence layers:

Evidence Layer	Points per Match	Notes
Known domain list	+30 (one-time)	Pre-seeded list of 20–30 known domains per vertical
Page title match (primary keyword)	+10 to +15	Longer, more specific keywords score higher
Meta description match	+6 to +10	Second-highest weight among on-page text signals; confirms homepage intent
Meta keywords tag	+4 to +8	Optional field; used when present
Body text (first 8,000 chars)	+2 to +5	Baseline signal; easily gamed but hard to fake at scale
TLD bonus	+5 to +25	e.g. `.ai` → +15 AITools, `.security` → +25 Cybersecurity

The vertical with the highest accumulated score wins. In the event of a tie, the TLD is used as a tiebreaker, followed by domain name substring matching.
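
The accumulate-and-pick step, with its tiebreakers, might look like the sketch below. Several parts are assumptions: the evidence input shape, treating "no evidence at all" as the case that falls back to SaaS, and implementing domain name substring matching as "the vertical's name appears in the domain". If a tie survives both tiebreakers the specification does not say what happens, so this sketch simply returns the first leader.

```javascript
const DEFAULT_VERTICAL = 'SaaS';

// `evidence` is a hypothetical shape: [{ vertical: 'FinTech', points: 10 }, ...].
function classifyVertical(evidence, { tldVertical = null, domain = '' } = {}) {
  const totals = new Map();
  for (const { vertical, points } of evidence) {
    totals.set(vertical, (totals.get(vertical) || 0) + points);
  }
  if (totals.size === 0) return DEFAULT_VERTICAL;

  const best = Math.max(...totals.values());
  const leaders = [...totals.keys()].filter((v) => totals.get(v) === best);
  if (leaders.length === 1) return leaders[0];

  // Tiebreaker 1: the vertical suggested by the TLD, if it is among the leaders.
  if (tldVertical !== null && leaders.includes(tldVertical)) return tldVertical;

  // Tiebreaker 2 (assumed form): a leader whose name appears in the domain name.
  const name = domain.toLowerCase();
  const bySubstring = leaders.find((v) => name.includes(v.toLowerCase()));
  return bySubstring !== undefined ? bySubstring : leaders[0];
}
```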

4.3 Vertical-Relative Scoring

In addition to the absolute GEO Score, each domain receives a vertical rank and a global rank. Vertical rank reflects standing among peers in the same industry — a GEO Score of 65 may represent the top 10% within HRTech but only the 60th percentile within AITools.
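
A sketch of how global and vertical ranks could be derived from the same sorted list. The input shape and function name are hypothetical, and ties are given consecutive ranks in sort order, which is an assumption since the specification does not describe tie handling.

```javascript
// Assign a global rank and a within-vertical rank to each scored domain.
// `domains` is a hypothetical shape: [{ domain, vertical, score }].
function rankDomains(domains) {
  const sorted = [...domains].sort((a, b) => b.score - a.score);
  const seenPerVertical = new Map();

  return sorted.map((entry, index) => {
    const verticalRank = (seenPerVertical.get(entry.vertical) || 0) + 1;
    seenPerVertical.set(entry.vertical, verticalRank);
    return { ...entry, globalRank: index + 1, verticalRank };
  });
}
```

A domain can therefore sit low in the global list while still leading its own vertical, which is the situation the HRTech example describes.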

Section 5

API Reference

5.1 Version Endpoint

Third-party implementations can check whether they are running the latest specification version:

GET https://traffi.app/api/geo-standard/version

Example Response
{
  "version": "1.0.0",
  "published": "2026-04-07",
  "status": "active",
  "changelog_url": "https://traffi.app/geo-standard#changelog",
  "spec_url": "https://traffi.app/geo-standard",
  "signals_count": 9,
  "engines_covered": ["ChatGPT", "Perplexity", "Claude", "Gemini", "Grok"],
  "verticals_count": 10,
  "license": "CC BY 4.0"
}
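
Once the response has been fetched and parsed, an implementation can compare its own version against the `version` field. The sketch below covers only that comparison; the function names are hypothetical, and it assumes plain `major.minor.patch` strings with no pre-release suffixes.

```javascript
// Compare two "major.minor.patch" strings numerically.
// Returns -1 if a < b, 1 if a > b, 0 if they are equal.
function compareVersions(a, b) {
  const left = a.split('.').map(Number);
  const right = b.split('.').map(Number);
  for (let i = 0; i < 3; i += 1) {
    const l = left[i] || 0;
    const r = right[i] || 0;
    if (l !== r) return l < r ? -1 : 1;
  }
  return 0;
}

// `versionResponse` is the parsed JSON body of the version endpoint.
function isOutdated(implementedVersion, versionResponse) {
  return compareVersions(implementedVersion, versionResponse.version) < 0;
}
```

Comparing the parts as numbers rather than strings matters once a component reaches two digits, for example `1.10.0` against `1.9.0`.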

5.2 Domain Score Endpoint

Check the current GEO Score for any domain in the public index:

GET https://traffi.app/api/geo-index/score/:domain

5.3 Implementation Attribution

If you implement this standard in your own tool or product, you must include attribution per the CC BY 4.0 license. Use the following attribution text:

Required Attribution
GEO Score methodology by Traffi (traffi.app/geo-standard), CC BY 4.0

Section 6

End-to-End Scoring Example

Consider a fictional domain acmecrm.com with a homepage containing the following content characteristics:

Signal	Detected	Calculation	Score
S1 — FAQ	3 questions + 1 FAQ heading	`min(20, 3×2 + 1×5)`	11
S2 — Steps	4 numbered items + 1 "how to" phrase	`min(15, 4×2 + 1×3)`	11
S3 — Statistics	2 percentages + 3 years + 1 large number	`min(15, 2×3 + 3 + 1)`	10
S4 — Headers	3 H2 + 2 H3	`min(10, 3×2 + 2)`	8
S5 — Definitions	2 definitional phrases	`min(10, 2×2)`	4
S6 — Word Count	1,600 words	Tier: 1,500–1,999	8
S7 — Keyword	Topical presence	Domain-specific	6
S8 — Authority	2 authority phrases	`min(10, 2×2)`	4
S9 — Factual Density	3 citable sentences	`min(10, round(3/5 × 10))`	6

6.1 Composite Score

Calculation
GEO = 11 + 11 + 10 + 8 + 4 + 8 + 6 + 4 + 6 = 68 (Silver tier)
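
The composite calculation above can be reproduced with a short sum over the signal breakdown. The breakdown key names and the function name are hypothetical. Note that the per-signal maxima listed in Section 2.1 add up to 110 while the stated total maximum is 100, so the clamp at 100 in this sketch is an assumption about how that cap is applied.

```javascript
// Composite GEO Score: sum of the nine signal scores.
// Clamping at 100 is an assumption (see the note on maxima above).
function compositeScore(breakdown) {
  const total = Object.values(breakdown).reduce((sum, points) => sum + points, 0);
  return Math.min(100, total);
}

// Signal scores from the acmecrm.com example table.
const acmeBreakdown = {
  faq: 11,
  steps: 11,
  statistics: 10,
  headers: 8,
  definitions: 4,
  wordCount: 8,
  keyword: 6,
  authority: 4,
  factualDensity: 6,
};

console.log(compositeScore(acmeBreakdown)); // 68
```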

6.2 Per-Engine Proxy Scores

Engine	Formula Result	Score	Insight
ChatGPT	`(11×0.35 + 4×0.25 + 4×0.25 + 8×0.15) / 13.5 × 100`	52	Low FAQ/definition density hurts
Perplexity	`(10×0.40 + 9×0.30 + 11×0.15 + 8×0.15) / 13.25 × 100`	72	Stats carry this engine
Claude	`(8×0.30 + readability + 8×0.20 + 4×0.20) / 16 × 100`	58	Decent depth, needs definitions
Gemini	`(0×0.30 + 10×0.25 + 8×0.25 + 4×0.20) / 14.25 × 100`	37	Missing JSON-LD is costly
Grok	`(10×0.40 + 4×0.30 + 11×0.30) / 15 × 100`	57	Statistics + FAQ help

Actionable improvements: Add FAQ schema markup, include JSON-LD structured data, strengthen definition language ("CRM is defined as..."), add more citable statistics.

Section 7

Frequently Asked Questions

Why a structural proxy instead of live citations?

Live citation measurement requires querying every AI engine with thousands of prompts daily, observing which domains are cited, and maintaining ground-truth datasets. This costs approximately $5,000–15,000/month in API fees for a 10,000-domain index and introduces 24–48 hour latency.

The structural proxy approach analyzes content characteristics that correlate with citation probability (r = 0.73). It can score 10,000+ domains per day at a fraction of the cost, with results available in minutes.

The tradeoff is reduced precision on individual domains — the proxy is better at ranking (ordering domains by citation likelihood) than at predicting exact citation rates. We publish correlation data quarterly and track drift between proxy scores and sampled ground truth. If the correlation drops below 0.65, we will release a methodology update.

Why not include backlink authority?

Traditional link-based authority (Domain Authority, Domain Rating) measures web graph centrality — a signal that Google's search algorithm rewards heavily. AI engines, however, do not follow the same logic. They prefer content that answers the question well, not necessarily content from highly-linked domains.

Our validation data shows that backlink authority adds only 0.03 to prediction accuracy (Pearson r) when included alongside the 9 content signals. The marginal improvement doesn't justify the dependency on third-party link databases.

How often are scores updated?

Domains in the GEO Index are rescored on a rolling basis. Most domains are refreshed weekly. High-traffic domains in the leaderboard top 100 are refreshed daily.

Can I compute my own GEO Score?

Yes. The reference implementation contains the exact scoring functions. You can run them against any HTML or Markdown content to compute a GEO Score. See the Open Source section below.

Section 8

Open Source

The GEO Score Standard is published as an open specification. All files are available for download, forking, and implementation.

8.1 Specification Files

File	Description	Link
SPECIFICATION.md	Full methodology in Markdown — forkable, implementable	View
CHANGELOG.md	Version history and planned changes	View
CONTRIBUTING.md	How to propose changes, join the Working Group	View
LICENSE	CC BY 4.0 license text	View
reference/scoring.js	Reference implementation (Node.js) — all 9 signal scoring functions	View

8.2 Using the Reference Implementation

Node.js Example
const { scoreContent, computeEngineScores } = require('./scoring');

const result = scoreContent(`
  ## What Is GEO Optimization?
  GEO is defined as the practice of structuring content
  for AI citation probability. According to research,
  43% of information queries now trigger AI answers.
`);

console.log(result.score);       // composite GEO Score, 0–100
console.log(result.breakdown);   // per-signal points: { faq: ..., steps: ..., ... }

const engines = computeEngineScores(result.breakdown);
console.log(engines.chatgpt);    // ChatGPT proxy score, 0–100
console.log(engines.perplexity); // Perplexity proxy score, 0–100

Contribute or fork: The specification is designed to be implemented by any SEO tool, analytics platform, or research team. If you build something with it, let us know — we'll list integrations here.

Section 9

GEO Standard Working Group

The GEO Standard is maintained by a working group of SEO practitioners, AI researchers, and content strategists. Working group members receive early access to methodology updates and are listed as contributors to the specification.

Traffi Research Team: Specification authors & maintainers
SEO Practitioner Seats: 3 open seats (apply below)
AI Researcher Seats: 2 open seats (apply below)
Tool Builder Seats: 2 open seats, for teams implementing the standard

Apply to join the working group: We're accepting applications from senior SEO practitioners, AI/NLP researchers, and teams building GEO tooling. Working group members get early access to v1.1 changes (planned for Q3 2026). Apply by emailing standard@traffi.app with subject line Working Group Application.

9.1 Governance

Version increments follow semantic versioning:

Patch (1.0.x): Clarifications, typo fixes, no methodology changes
Minor (1.x.0): New signals added, engine biases updated — backward compatible
Major (x.0.0): Breaking changes to the scoring formula — requires migration guide

All changes are ratified by the working group before release. Breaking changes require a 30-day comment period.

Section 10

Changelog

v1.0.0 — Initial Release

Published 2026-04-07

First public release of the GEO Score Standard. Covers 9 scoring signals, 5 AI engines (ChatGPT, Perplexity, Claude, Gemini, Grok), and 10 verticals. Establishes CC BY 4.0 licensing. Baseline correlation with sampled citations: r=0.73 on 1,000-domain validation set.

v1.1.0 — Planned

Target Q3 2026

Planned additions: content freshness decay signal (S10), Grok recency weight refinement, llms.txt presence signal, and answer-format match rate. Working group review in progress.

License

The GEO Score Standard is published under Creative Commons Attribution 4.0 International (CC BY 4.0).

You are free to: share (copy and redistribute in any medium or format), adapt (remix, transform, and build upon the material for any purpose, including commercially) — provided you give appropriate credit to Traffi (traffi.app/geo-standard), provide a link to the license, and indicate if changes were made.

https://creativecommons.org/licenses/by/4.0/