# GEO Score Specification v1.0.0

> **Status:** Active
> **Published:** 2026-04-07
> **License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
> **Canonical URL:** [traffi.app/geo-standard](https://traffi.app/geo-standard)

## Abstract

This document defines the **GEO Score Standard** — an open, versioned methodology for measuring the probability that a given web domain will be cited in responses from AI language model engines. The specification covers scoring signals, per-engine weighting, vertical classification, and calibration methodology. Anyone may implement this standard under the terms of CC BY 4.0 (free to use, must attribute Traffi).

---

## 1. Overview

### 1.1 What GEO Score Measures

The GEO (Generative Engine Optimization) Score quantifies a domain's **AI citation probability** — the likelihood that the domain's content will appear as a source, reference, or citation in a response generated by a major AI engine.

A high GEO Score does not guarantee citations. It reflects the degree to which a domain's content is structured, dense, and formatted in ways that AI engines demonstrably prefer when constructing factual responses.

GEO Score is distinct from traditional SEO metrics. A domain may rank #1 on Google while scoring poorly on GEO, and vice versa. The signals are complementary but different in nature: SEO rewards link authority; GEO rewards answer-ability.

### 1.2 Score Range & Calibration

| Score Range | Tier | Interpretation |
|-------------|------|----------------|
| 85–100 | Platinum | High citation probability — content is deeply structured, factual, and authority-dense |
| 75–84 | Gold | Strong AI visibility — most signals present; may lack depth in one or two dimensions |
| 60–74 | Silver | Moderate visibility — notable gaps in FAQ structure, factual density, or authority signals |
| 0–59 | Bronze / Unrated | Low citation probability — content insufficiently structured for AI engine preference |

Scores are computed from multi-page content crawls. Up to 6 pages per domain are fetched and scored independently; the final domain GEO Score is the weighted mean of per-page scores, with the homepage weighted 2x relative to inner pages.

**Calibration target:** A score of 75 should correspond to a domain that appears in roughly one-quarter of relevant AI engine responses, based on monthly sampling across a panel of 500+ seed queries.

### 1.3 Covered AI Engines

Version 1.0 covers the following AI engines:

| Engine | Vendor | Citation Profile |
|--------|--------|-----------------|
| ChatGPT | OpenAI (GPT-4 / GPT-4o) | FAQ-first — favors FAQ structure, definitional clarity, and authority cues |
| Perplexity | Perplexity AI | Citation-native — strongly rewards statistical density and factual specificity |
| Claude | Anthropic | Depth-first — prefers comprehensive, well-structured long-form content |
| Gemini | Google DeepMind | Schema-aware — strong preference for structured data (JSON-LD), entities, and schema markup |
| Grok | xAI | Recency-weighted — rewards timeliness signals, statistics, and trending-angle content |

---

## 2. Scoring Signals

The composite GEO Score is the sum of nine independent signal scores, each contributing a defined maximum number of points. The total maximum is **100 points**.

### 2.1 Signal Weights Summary

| # | Signal | Max Points | Weight |
|---|--------|-----------|--------|
| S1 | FAQ / Q&A Structure | 20 | 20% |
| S2 | Numbered Lists & Steps | 15 | 15% |
| S3 | Statistics & Numbers | 15 | 15% |
| S4 | Header Structure | 10 | 10% |
| S5 | Definition Language | 10 | 10% |
| S6 | Content Depth (Word Count) | 10 | 10% |
| S7 | Keyword Signal (neutral) | 10 | 10% |
| S8 | Authoritative Language | 10 | 10% |
| S9 | Factual Density | 10 | 10% |

### 2.2 S1 — FAQ / Q&A Structure (0–20)

AI engines are optimized to answer questions. Content that explicitly contains question-answer pairs is more likely to be retrieved and cited.

| Signal | Points |
|--------|--------|
| Each question sentence (contains `?`) | +2 per occurrence (max 20) |
| Explicit FAQ section heading (`faq`, `frequently asked`, `questions and answers`) | +5 per match |

**Formula:**

```
faq_score = min(20, question_count * 2 + faq_heading_count * 5)
```
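
As a non-normative sketch, the same counting can be done with simple regular expressions in TypeScript; the function name and patterns below are illustrative, and the reference implementation may differ:

```typescript
// Illustrative sketch of S1: counts question marks as a proxy for question
// sentences, plus FAQ-style headings, then caps at the 20-point signal maximum.
function faqScore(text: string): number {
  const questionCount = (text.match(/\?/g) ?? []).length;
  const faqHeadingCount =
    (text.match(/\b(faq|frequently asked|questions and answers)\b/gi) ?? []).length;
  return Math.min(20, questionCount * 2 + faqHeadingCount * 5);
}
```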

### 2.3 S2 — Numbered Lists & Steps (0–15)

Step-by-step instructional content is highly citeable — engines return it verbatim for "how to" queries.

| Signal | Points |
|--------|--------|
| HTML ordered list tag (`<ol>`, `<li>`) | +1 per occurrence |
| Markdown-style numbered list (`1.`, `2.`...) | +2 per list item |
| How-to phrasing (`how to`, `step 1`, `step-by-step`) | +3 per match |

**Formula:**

```
steps_score = min(15, ol_tags + md_numbered_items * 2 + howto_phrases * 3)
```

### 2.4 S3 — Statistics & Numbers (0–15)

Quantified claims are the primary signal for Perplexity citations and a strong secondary signal for all engines.

| Signal | Points |
|--------|--------|
| Percentage figures (`42%`, `0.5%`) | +3 per match |
| Year references (`2020`–`2029`, `199x`) | +1 per match |
| Large numbers (4+ digits: `1000`, `50000`) | +1 per match |

**Formula:**

```
stats_score = min(15, percentage_count * 3 + year_count + large_number_count)
```
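
A minimal sketch of how these counts might be extracted. The regexes are illustrative; note that year references also satisfy the 4+ digit pattern, so a real implementation would need to deduplicate those matches:

```typescript
// Illustrative sketch of S3: regex counts for percentages, year references,
// and 4+ digit figures, capped at the 15-point signal maximum.
function statsScore(text: string): number {
  const percentages = (text.match(/\b\d+(\.\d+)?%/g) ?? []).length;  // "42%", "0.5%"
  const years = (text.match(/\b(199\d|202\d)\b/g) ?? []).length;     // 199x, 2020-2029
  const largeNumbers = (text.match(/\b\d{4,}\b/g) ?? []).length;     // "1000", "50000" (also matches years)
  return Math.min(15, percentages * 3 + years + largeNumbers);
}
```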

### 2.5 S4 — Header Structure (0–10)

Well-organized content with clear H2/H3 hierarchy signals that the page is navigable and scannable — a proxy for structured knowledge.

| Signal | Points |
|--------|--------|
| H2 headings (`##` or `<h2>`) | +2 per heading |
| H3 headings (`###` or `<h3>`) | +1 per heading |

**Formula:**

```
headers_score = min(10, h2_count * 2 + h3_count)
```

### 2.6 S5 — Definition Language (0–10)

Explicit definitional constructs signal that the content is explaining concepts — a format AI engines actively retrieve for "what is" queries.

| Trigger Phrase | Points |
|---------------|--------|
| `is a` / `is an` | +2 per match |
| `refers to` | +2 per match |
| `defined as` | +2 per match |
| `means that` | +2 per match |
| `is defined` | +2 per match |
| `also known as` | +2 per match |

**Formula:**

```
definitions_score = min(10, definition_phrase_count * 2)
```

### 2.7 S6 — Content Depth / Word Count (0–10)

Shallow content rarely surfaces in AI responses. Depth is a necessary (not sufficient) condition for citation.

| Word Count | Score |
|-----------|-------|
| >= 2,000 words | 10 |
| 1,500–1,999 | 8 |
| 1,200–1,499 | 6 |
| 800–1,199 | 4 |
| 500–799 | 2 |
| < 500 | 0 |
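
The threshold table maps directly onto a step function; a sketch (function name illustrative):

```typescript
// Maps a raw word count onto the 0-10 content-depth score defined in the table above.
function wordCountScore(wordCount: number): number {
  if (wordCount >= 2000) return 10;
  if (wordCount >= 1500) return 8;
  if (wordCount >= 1200) return 6;
  if (wordCount >= 800) return 4;
  if (wordCount >= 500) return 2;
  return 0;
}
```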

### 2.8 S7 — Keyword Signal (0–10)

Neutral keyword presence score. This is a baseline signal indicating the content is topically relevant. It is intentionally not heavily weighted — AI engines use semantic understanding, not keyword density.

Score is assigned based on the number of natural keyword mentions in the content, normalized against content length. Exact implementation may vary across domain types.
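
Since the implementation is deliberately left open, the following is only one plausible interpretation: it assumes a caller-supplied keyword list and scores roughly one point per mention per 1,000 words, capped at the signal maximum.

```typescript
// Hypothetical S7 sketch, not normative: keyword mentions normalized by content length.
function keywordScore(text: string, keywords: string[]): number {
  const totalWords = text.trim().split(/\s+/).length;
  let mentions = 0;
  for (const keyword of keywords) {
    const escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    mentions += (text.match(new RegExp(`\\b${escaped}\\b`, "gi")) ?? []).length;
  }
  // Roughly 1 point per mention per 1,000 words, capped at 10.
  return Math.min(10, Math.round((mentions / Math.max(totalWords, 1)) * 1000));
}
```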

### 2.9 S8 — Authoritative Language (0–10)

References to research, expert consensus, and institutional sources increase citation probability by signaling verifiability.

| Signal Phrase | Points |
|--------------|--------|
| `research shows`, `studies show` | +2 each |
| `according to`, `experts say`, `data shows` | +2 each |
| `survey found`, `published in` | +2 each |
| Named institutions (`Harvard`, `Stanford`, `MIT`) | +2 each |
| `peer-reviewed` | +2 |

**Formula:**

```
authority_score = min(10, authority_phrase_count * 2)
```

### 2.10 S9 — Factual Density (0–10)

Factual density measures the proportion of sentences containing citable claims — percentages, monetary figures, comparative statements, or attributed findings.

**Citable sentence patterns:**
- Contains a percentage: "Adoption grew **43%** year over year"
- Contains a monetary figure: "Average deal size is **$12,400**"
- Contains a large number (4+ digits): "Over **5,000** customers"
- Comparative language: "**faster than** legacy approaches"
- Attributed finding: "**According to** Gartner..."

**Formula:**

```
factual_density_score = min(10, round(citable_sentences / 5 * 10))
```

A page with five or more citable sentences scores the maximum; a page with no citable sentences scores 0.
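
A sketch of citable-sentence detection, assuming a naive sentence split and the patterns listed above; all names and regexes are illustrative:

```typescript
// Illustrative sketch of S9: a sentence counts as citable if it matches any pattern.
function factualDensityScore(text: string): number {
  const sentences = text.split(/[.!?]+\s+/).filter(s => s.trim().length > 0); // naive split
  const citablePatterns = [
    /\b\d+(\.\d+)?%/,                                     // percentage
    /[$€£]\s?\d[\d,]*/,                                   // monetary figure
    /\b\d{4,}\b/,                                         // large number (4+ digits)
    /\b(faster|slower|higher|lower|more|less) than\b/i,   // comparative language
    /\baccording to\b/i,                                  // attributed finding
  ];
  const citable = sentences.filter(s => citablePatterns.some(p => p.test(s))).length;
  return Math.min(10, Math.round((citable / 5) * 10));
}
```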

---

## 3. Composite Score Calculation

The composite GEO Score is the straight sum of all nine signal scores:

```
GEO_Score = S1 + S2 + S3 + S4 + S5 + S6 + S7 + S8 + S9
          = min(100, sum_of_all_signals)
```

**Multi-page aggregation:**

When scoring a domain (not a single page), up to 6 pages are crawled:

```
domain_score = (homepage_score * 2 + page2_score + page3_score + ... + page6_score) / (1 + page_count)
```

The homepage receives 2x weight. The divisor equals the homepage weight (2) plus the number of inner pages, which is one more than the total number of pages crawled.
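
Expressed as code (function and parameter names are illustrative):

```typescript
// Weighted mean over per-page scores: the homepage counts twice.
function domainScore(homepageScore: number, innerPageScores: number[]): number {
  const weightedSum = homepageScore * 2 + innerPageScores.reduce((a, b) => a + b, 0);
  const divisor = 2 + innerPageScores.length; // homepage weight (2) + one per inner page
  return Math.round(weightedSum / divisor);
}

// Example: homepage 72 plus inner pages 68, 61, 70
// domainScore(72, [68, 61, 70]) === Math.round((144 + 199) / 5) === 69
```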

---

## 4. Per-Engine Proxy Scores

In addition to the composite GEO Score, per-engine proxy scores are computed from the same signal breakdown using engine-specific weightings. Each per-engine score is on the 0–100 scale.

### 4.1 ChatGPT Score

**Citation profile:** FAQ-first, definition-heavy, authority-conscious.

```
chatgpt = (S1_faq * 0.35) + (S5_definitions * 0.25) + (S8_authority * 0.25) + (S4_headers * 0.15)
```

Normalized to 0–100 by: `chatgpt_100 = chatgpt / max_possible * 100`

Where `max_possible = 20*0.35 + 10*0.25 + 10*0.25 + 10*0.15 = 13.5`, so `chatgpt_100 = chatgpt / 13.5 * 100`.
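
The same weight-then-normalize pattern applies to every engine. Below is a sketch using the ChatGPT weights above; the type and function names are illustrative, not part of the spec:

```typescript
// Generic per-engine proxy: weighted sum of signal values divided by the maximum
// attainable weighted sum, scaled to 0-100.
interface WeightedSignal { value: number; max: number; weight: number }

function engineScore(signals: WeightedSignal[]): number {
  const weighted = signals.reduce((sum, s) => sum + s.value * s.weight, 0);
  const maxPossible = signals.reduce((sum, s) => sum + s.max * s.weight, 0);
  return Math.round((weighted / maxPossible) * 100);
}

// ChatGPT proxy for a page with S1=14, S5=6, S8=8, S4=10:
const chatgptProxy = engineScore([
  { value: 14, max: 20, weight: 0.35 }, // S1 FAQ
  { value: 6,  max: 10, weight: 0.25 }, // S5 definitions
  { value: 8,  max: 10, weight: 0.25 }, // S8 authority
  { value: 10, max: 10, weight: 0.15 }, // S4 headers
]); // 9.9 / 13.5 * 100 ≈ 73
```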

**Known biases:** ChatGPT strongly favors domains that explicitly answer questions. A page without any FAQ structure or definition language will likely score below 30 on the ChatGPT proxy regardless of overall content quality.

### 4.2 Perplexity Score

**Citation profile:** Citation-native — requires hard numbers, statistics, and verifiable claims.

```
perplexity = (S3_statistics * 0.40) + (S9_factual_density_norm * 0.30) + (S2_steps * 0.15) + (S6_wordCount * 0.15)
```

Where `S9_factual_density_norm` is the factual density score rescaled to the same 0–15 range as S3. The maximum attainable weighted sum is `15*0.40 + 15*0.30 + 15*0.15 + 10*0.15 = 14.25`, so `perplexity_100 = perplexity / 14.25 * 100`.

**Known biases:** Perplexity disproportionately favors recent content. Content freshness decay is tracked but not yet incorporated in v1.0 (planned for v1.1).

### 4.3 Claude Score

**Citation profile:** Depth-first — prefers comprehensive, well-structured long-form content.

```
claude = (S6_wordCount * 0.30) + (readability_norm * 0.30) + (S4_headers * 0.20) + (S5_definitions * 0.20)
```

Readability is approximated using a Flesch Reading Ease analog: sentence length variance, average syllable count proxy (average word length), and paragraph density. The readability component is normalized to 0–30 before weighting. The maximum attainable weighted sum is `10*0.30 + 30*0.30 + 10*0.20 + 10*0.20 = 16`, so `claude_100 = claude / 16 * 100`.

**Known biases:** Claude's training emphasizes nuanced, cited reasoning. Pages that assert claims without explanation score poorly. Depth is the single strongest lever.

### 4.4 Gemini Score

**Citation profile:** Schema-aware — structured data (JSON-LD) provides a significant boost.

```
gemini = (jsonLdBonus * 0.30) + (S3_statistics * 0.25) + (S4_headers * 0.25) + (S8_authority * 0.20)
jsonLdBonus = 20 if page contains valid JSON-LD, else 0
```

Normalized to 0–100 by dividing by the maximum attainable weighted sum: `20*0.30 + 15*0.25 + 10*0.25 + 10*0.20 = 14.25`.

**Known biases:** Gemini's integration with Google Search means it heavily weights Knowledge Graph signals and structured entity data. Domains without any JSON-LD schema markup forfeit the entire `jsonLdBonus` term, a 6-point ceiling disadvantage on the weighted sum before normalization.

### 4.5 Grok Score

**Citation profile:** Recency-weighted, statistics-driven, trending-angle content preferred.

```
grok = (S3_statistics * 0.40) + (S8_authority * 0.30) + (S1_faq * 0.30)
```

Normalized to 0–100 by dividing by the maximum attainable weighted sum: `15*0.40 + 10*0.30 + 20*0.30 = 15`.

**Known biases:** Grok is trained with real-time X/Twitter data. Content that references trending discussions or contains explicit date signals scores significantly higher.

### 4.6 Accuracy & Confidence

Per-engine scores are proxy estimates derived from content analysis. They are not based on direct query sampling of each AI engine. Correlation against sampled ground-truth citations is **r = 0.73** (Pearson) based on a validation set of 1,000 domains scored in Q1 2026.

---

## 5. Vertical Classification

### 5.1 Vertical Taxonomy

Version 1.0 recognizes **10 verticals**:

Cybersecurity, AI Tools, SaaS, FinTech, DevTools, LegalTech, HealthTech, Ecommerce, MarTech, HRTech.

Domains that do not clearly match any vertical are assigned to **SaaS** as the default.

### 5.2 Classification Algorithm

Vertical classification uses a multi-signal scoring approach across six evidence layers:

| Evidence Layer | Points per Match | Notes |
|---------------|-----------------|-------|
| Known domain list | +30 (one-time) | Pre-seeded list of 20–30 known domains per vertical |
| Page title match (primary keyword) | +10 to +15 | Longer, more specific keywords score higher |
| Meta description match | +6 to +10 | Second-highest weight; confirms homepage intent |
| Meta keywords tag | +4 to +8 | Optional field; used when present |
| Body text (first 8,000 chars) | +2 to +5 | Baseline signal; easy to game on a single page, but hard to fake consistently at scale |
| TLD bonus | +5 to +25 | e.g. `.ai` -> +15 AI Tools, `.security` -> +25 Cybersecurity |

The vertical with the highest accumulated score wins. In the event of a tie, the TLD is used as a tiebreaker, followed by domain name substring matching.
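
A simplified, non-normative sketch of the accumulation logic. The keyword lists, point values, and the reduced set of verticals below are placeholders, not the maintained per-vertical data:

```typescript
// Illustrative only: accumulate evidence points per vertical; the highest total wins.
type Vertical = "Cybersecurity" | "AI Tools" | "SaaS" | "FinTech" | "DevTools";

const TITLE_KEYWORDS: Record<Vertical, string[]> = {
  Cybersecurity: ["security", "threat detection"],
  "AI Tools": ["ai assistant", "machine learning"],
  SaaS: ["platform", "workflow automation"],
  FinTech: ["payments", "banking"],
  DevTools: ["sdk", "developer platform"],
};

const TLD_BONUS: Record<string, { vertical: Vertical; points: number }> = {
  ".ai": { vertical: "AI Tools", points: 15 },
  ".security": { vertical: "Cybersecurity", points: 25 },
};

function classifyVertical(domain: string, title: string, metaDescription: string): Vertical {
  const scores = new Map<Vertical, number>();
  const bump = (v: Vertical, pts: number) => scores.set(v, (scores.get(v) ?? 0) + pts);

  for (const [vertical, keywords] of Object.entries(TITLE_KEYWORDS) as [Vertical, string[]][]) {
    for (const kw of keywords) {
      if (title.toLowerCase().includes(kw)) bump(vertical, 12);          // page title layer
      if (metaDescription.toLowerCase().includes(kw)) bump(vertical, 8); // meta description layer
    }
  }
  for (const [tld, bonus] of Object.entries(TLD_BONUS)) {
    if (domain.endsWith(tld)) bump(bonus.vertical, bonus.points);        // TLD layer
  }

  // Highest accumulated score wins; SaaS is the default when nothing matches.
  let winner: Vertical = "SaaS";
  let best = 0;
  for (const [vertical, score] of scores) {
    if (score > best) { winner = vertical; best = score; }
  }
  return winner;
}
```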

### 5.3 Vertical-Relative Scoring

Each domain receives a **vertical rank** and a **global rank**. Vertical rank reflects standing among peers in the same industry — a GEO Score of 65 may represent the top 10% within HRTech but only the 60th percentile within AI Tools.
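
Vertical rank can be expressed as a simple percentile among peer scores; a sketch (names illustrative):

```typescript
// Percentile standing of a domain's GEO Score within a peer group (higher is better).
function percentileRank(score: number, peerScores: number[]): number {
  if (peerScores.length === 0) return 0;
  const below = peerScores.filter(s => s < score).length;
  return Math.round((below / peerScores.length) * 100);
}

// The same score of 65 can land at very different percentiles depending on whether
// peerScores holds HRTech domains or AI Tools domains.
```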

---

## 6. End-to-End Scoring Example

Consider a fictional domain `acmecrm.com` with a homepage containing:

**Content characteristics:**
- 3 question sentences containing `?` (6 pts toward S1)
- 1 explicit FAQ heading (5 pts toward S1) -> **S1 = 11**
- 4 Markdown-numbered items (8 pts) + 1 "how to" phrase (3 pts) -> **S2 = 11**
- 2 percentages (6 pts) + 3 year references (3 pts) + 1 large number (1 pt) -> **S3 = 10**
- 3 H2 headings (6 pts) + 2 H3 headings (2 pts) -> **S4 = 8**
- 2 definition phrases (4 pts) -> **S5 = 4**
- 1,600 words -> **S6 = 8**
- Topical keyword presence -> **S7 = 6**
- 2 authority phrases (4 pts) -> **S8 = 4**
- 3 citable sentences -> `min(10, round(3/5 * 10))` = 6 -> **S9 = 6**

**Composite Score:**

```
GEO = 11 + 11 + 10 + 8 + 4 + 8 + 6 + 4 + 6 = 68 (Silver tier)
```

**Per-engine proxy scores:**

```
ChatGPT    = (11*0.35 + 4*0.25 + 4*0.25 + 8*0.15) / 13.5  * 100 = 52
Perplexity = (10*0.40 + 9*0.30 + 11*0.15 + 8*0.15) / 14.25 * 100 = 67
Claude     = (8*0.30 + readability*0.30 + 8*0.20 + 4*0.20) / 16 * 100 ≈ 58
Gemini     = (0*0.30 + 10*0.25 + 8*0.25 + 4*0.20) / 14.25 * 100 = 37   (no JSON-LD)
Grok       = (10*0.40 + 4*0.30 + 11*0.30) / 15 * 100 = 57
```

Each denominator is the engine's maximum attainable weighted sum from Section 4.

**Interpretation:** acmecrm.com has moderate AI visibility (Silver). Strongest on Perplexity due to statistical density; weakest on Gemini due to missing JSON-LD. Actionable improvements: add FAQ schema, include JSON-LD markup, strengthen definition language.

---

## 7. FAQ

### Why a structural proxy instead of live citations?

Live citation measurement requires querying every AI engine with thousands of prompts daily, observing which domains are cited, and maintaining ground-truth datasets. This costs approximately $5,000–15,000/month in API fees for a 10,000-domain index and introduces 24–48 hour latency.

The structural proxy approach analyzes content characteristics that *correlate* with citation probability (r = 0.73). It can score 10,000+ domains per day at a fraction of the cost, with results available in minutes. The tradeoff is reduced precision on individual domains — the proxy is better at ranking (ordering domains by citation likelihood) than at predicting exact citation rates.

We publish correlation data quarterly and track drift between proxy scores and sampled ground truth. If the correlation drops below 0.65, we will release a methodology update.

### Why not include backlink authority?

Traditional link-based authority (Domain Authority, Domain Rating) measures web graph centrality — a signal that Google's search algorithm rewards heavily. AI engines, however, do not follow the same logic. They prefer *content that answers the question well*, not necessarily content from highly-linked domains.

Our validation data shows that backlink authority adds only 0.03 to prediction accuracy (Pearson r) when included alongside the 9 content signals. The marginal improvement doesn't justify the dependency on third-party link databases.

### How often are scores updated?

Domains in the GEO Index are rescored on a rolling basis. Most domains are refreshed weekly. High-traffic domains in the leaderboard top 100 are refreshed daily.

### Can I compute my own GEO Score?

Yes. The [reference implementation](reference/scoring.js) in this repository contains the exact scoring functions. You can run them against any HTML or Markdown content to compute a GEO Score.

---

## 8. API Reference

### 8.1 Version Endpoint

Third-party implementations can check whether they are running the latest specification version:

```
GET https://traffi.app/api/geo-standard/version
```

**Response:**

```json
{
  "version": "1.0.0",
  "published": "2026-04-07",
  "status": "active",
  "changelog_url": "https://traffi.app/geo-standard#changelog",
  "spec_url": "https://traffi.app/geo-standard",
  "signals_count": 9,
  "engines_covered": ["ChatGPT", "Perplexity", "Claude", "Gemini", "Grok"],
  "verticals_count": 10,
  "license": "CC BY 4.0",
  "license_url": "https://creativecommons.org/licenses/by/4.0/",
  "attribution": "GEO Score methodology by Traffi (traffi.app/geo-standard), CC BY 4.0"
}
```
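
For example, an implementation might verify its pinned spec version at startup. The sketch below assumes only the endpoint and the `version`/`status` fields shown in the response above:

```typescript
// Sketch: warn when the local implementation targets an older spec version.
const PINNED_SPEC_VERSION = "1.0.0";

async function checkSpecVersion(): Promise<void> {
  const res = await fetch("https://traffi.app/api/geo-standard/version");
  if (!res.ok) throw new Error(`Version endpoint returned HTTP ${res.status}`);

  const body = (await res.json()) as { version: string; status: string };
  if (body.version !== PINNED_SPEC_VERSION) {
    console.warn(
      `GEO spec ${body.version} is published (status: ${body.status}); ` +
        `this implementation targets ${PINNED_SPEC_VERSION}.`
    );
  }
}
```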

### 8.2 Domain Score Endpoint

```
GET https://traffi.app/api/geo-index/score/:domain
```

Returns the current GEO Score for any domain in the public index.

### 8.3 Attribution

If you implement this standard in your own tool or product, you must include attribution per the CC BY 4.0 license:

```
GEO Score methodology by Traffi (traffi.app/geo-standard), CC BY 4.0
```

---

## 9. Governance

### 9.1 Working Group

The GEO Standard is maintained by a working group of SEO practitioners, AI researchers, and content strategists.

| Seat Type | Open Seats | Requirements |
|-----------|-----------|-------------|
| Traffi Research Team | — | Specification authors & maintainers |
| SEO Practitioner | 3 | 5+ years, AI search experience |
| AI Researcher | 2 | Published NLP/IR work |
| Tool Builder | 2 | Teams implementing the standard |

Apply: [standard@traffi.app](mailto:standard@traffi.app) — subject: "Working Group Application"

### 9.2 Versioning

| Change Type | Bump | Vote Required | Comment Period |
|------------|------|--------------|---------------|
| Clarification / typo | Patch (1.0.x) | None | None |
| New signal (additive) | Minor (1.x.0) | Simple majority | 14 days |
| Weight adjustment | Minor (1.x.0) | Simple majority | 14 days |
| Signal removal / formula rewrite | Major (x.0.0) | 2/3 majority | 30 days |

---

*GEO Score Standard v1.0.0 · Published by [Traffi](https://traffi.app) · [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)*
