Get SEEN: AI Visibility Framework

From ranked to SEEN.

1. The opener

A buyer in your category opens ChatGPT and asks for a recommendation. Maybe it’s “what’s the best [thing your product does]?” Maybe it’s “who should I hire to do X?” The AI gives them three names. Your competitor is on the list. You aren’t even mentioned, which is a different problem from ranking poorly. You can’t optimize your way up from invisible.

There’s a name for the failure mode now. Category blanking: the AI surfaced an answer in your category, in front of your buyer, and acted like your entity doesn’t exist.

What catches people off guard is that this happens to companies whose SEO is genuinely fine. Rankings hold. Google traffic looks normal. The technical audit from last quarter came back clean. None of that helps, because there’s no rank to climb on these surfaces. There’s an answer, and you’re either in it or you aren’t.

This framework came out of running the experiment on myself. While making my own entity AI-visible, I wrote “The SEO-to-GEO Gap” on SSRN (DOI 10.2139/ssrn.6476021) and built awesome-geo, a curated repo of AI-discoverable surfaces with crawler-access data and source notes. If the framework works, you can test it in 30 seconds: ask any AI chatbot “who is Dee Kargaev?” and see what comes back. That’s SEEN running in public, layer by layer, on one entity. The failure mode is otherwise silent, which is the whole problem: nothing alerts on category blanking, the competitor just quietly keeps winning the recommendation, and pipeline gets worse without an explanation.

The clearest single number on the shift comes from a 2025 Pew Research panel of 900 U.S. adults, tracking 68,879 Google searches across March 2025. When an AI summary appears at the top of the results, users click traditional search-result links in 8% of visits. They click links inside the AI summary itself in 1%. Without a summary, traditional-result click rate is 15% (Pew Research, 2025). The summary roughly halves traditional click-through, and its own outbound links barely register. Whatever value the user gets, they get inside the box. The page behind the answer is a reference now, not a destination.
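The “roughly halves” reading is plain arithmetic on the two click rates quoted above. A quick sketch, using only the 15% and 8% figures from the text (the variable names are mine):

```python
# Click rates on traditional results, as quoted from the Pew figures above.
rate_without_summary = 0.15  # no AI summary shown
rate_with_summary = 0.08     # AI summary shown

# Relative drop: the share of baseline click-through that disappears
# when a summary is present.
relative_drop = (rate_without_summary - rate_with_summary) / rate_without_summary
print(f"Relative drop in traditional clicks: {relative_drop:.0%}")
```

A drop of a little under half, which is where “roughly halves” comes from.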

Three things broke at once. The first reader of your content stopped being human (now it’s a retrieval system, a model, or an agent acting on someone’s behalf, with the human only ever reading the summary at the end). The page stopped being the unit of work, because AI stitches answers from many surfaces at once, and the unit is now the evidence corpus your entity exists across. And rank stopped meaning anything on these surfaces. What replaces it is messier: whether you appear in the answer at all, whether the description is right, and whether the citation lands on you instead of someone else with a confusable name.

SEO didn’t end. It stopped being the whole job.

SEEN is the framework for the part of the job SEO doesn’t cover. Four layers, four questions, one checklist. The rest of this essay walks through each layer, the failure mode that exposes a missing one, and what the work looks like. The category is full of screenshots and vibes right now; this is an audit you can run against your own entity this week instead.

2. The shift

The corpus AI builds its answer from is bigger than your URLs.

The corpus falls into three categories. There’s the owned stuff: your site, your blog, your docs. There’s platform-hosted first-party content under your name (Substack posts, GitHub repos, LinkedIn long-posts, Medium pieces, podcast episodes you host, Wikipedia articles you maintain). And there’s the third-party trail: what other people wrote about you, where reviewers landed, which “best of” lists noticed. All three count. The discipline isn’t owning more URLs; it’s making the entity read as the same thing across every category.

That isn’t a hunch anymore; the data is clean enough to argue from. A 1+ million-citation 2026 study by Otterly.AI across ChatGPT, Perplexity, and Google AI Overviews found that community platforms (Reddit and Quora chief among them) capture about 52.5% of citations vs. 47.5% for brand domains (Otterly.AI, 2026). A separate 325,000-prompt 2026 study by ALM Corp confirmed Reddit at #1 across ChatGPT Search, Google AI Mode, and Perplexity, with LinkedIn at #2 (~11% citation share) (ALM Corp, 2026). Read that twice. The largest single source of AI citations is not your domain. Probably never will be. Treating the corpus as “stuff on URLs I own outright” is the most expensive shortcut available in the category.

Platform-hosted weight is load-bearing for a few stacking reasons. AI search retrieves from high-authority domains it already crawls deeply, and LinkedIn, GitHub, Substack, Medium, Crunchbase, and Wikipedia all sit on authority levels most owned sites won’t approach for years. Semrush’s 2025 study of 1,000 domains across ChatGPT, Perplexity, Gemini, and Google AI Overviews found Authority Score correlates with AI mentions at Pearson 0.65 / Spearman 0.57, with threshold effects beating linear scaling: a handful of links from genuinely high-authority domains outperforms many links from weak ones (Semrush, 2025). The same study found nofollow links carry near-identical AI weight to follow links, which breaks SEO orthodoxy: most platform profiles (LinkedIn, Crunchbase, half the directories) emit nofollow by default, and for AI retrieval that doesn’t seem to matter.

On top of discovery, there’s entity coherence (same bio, same canonical name, same sameAs graph across surfaces, so AI has one entity to retrieve and connect), and there’s the engagement-as-social-proof effect (subscribers, stars, listeners, follower counts on a high-authority surface aggregate as a notability signal that travels with the entity wherever AI quotes from it). All three reasons sit on the same surface, so the surface’s value compounds.

A lot of the old SEO mental model doesn’t survive the shift. Rank #1 in ChatGPT was never a real thing. Keyword density doesn’t apply to models that build representations rather than count tokens. The “one canonical landing page per intent” pattern breaks the moment an answer gets stitched from many sources. The page used to be the unit of distribution; now it’s the corpus.

What survives is the substrate, and it matters more, not less: crawlability, semantic HTML, schema markup that matches the page, backlinks from credible sources, original content worth reading. Google’s own AI optimization guide is blunt that there are no AI-only technical requirements (you don’t need to create new machine-readable files, AI text files, markup, or Markdown), and the same SEO foundations carry through because generative AI features in Search are rooted in its core Search ranking and quality systems (Google, 2026). The lift comes from clarity, not from a secret feature flag. SEEN is the framework for adding the corpus-engineering layer on top of that substrate.

3. The framework — SEEN

SEEN makes an entity easy for search engines, answer engines, LLMs, and browser agents to find, read, verify, and recommend.

Four layers, four diagnostic questions. They behave multiplicatively, not additively: the weakest layer caps what the others can deliver. A clean answer with no reason to be trusted gets skipped. Strong proof AI can’t extract goes unread. Both, attached to the wrong entity, get credited to a competitor instead. And all three working, against an entity nobody else mentions, still never get recommended at all.
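One way to picture “multiplicative, not additive” is a toy model. The framework doesn’t publish a formula, so the product below is my assumption, and the layer scores are made up for illustration:

```python
from math import prod


def additive(layers):
    """Average of the four layer scores (each between 0 and 1)."""
    return sum(layers) / len(layers)


def multiplicative(layers):
    """Product of the layer scores: one weak layer drags the total down."""
    return prod(layers)


# Hypothetical scores in S, E, E, N order, as fractions of each layer's items.
lopsided = [0.9, 0.7, 0.7, 0.2]
balanced = [0.6, 0.6, 0.6, 0.6]

for name, layers in [("lopsided", lopsided), ("balanced", balanced)]:
    print(name, round(additive(layers), 3), round(multiplicative(layers), 3))
```

On the average, the lopsided entity looks slightly better. On the product, the balanced one wins clearly, because the 0.2 layer caps everything multiplied through it.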

S — Structure

Question: Can AI extract the right answer from your content?

AI doesn’t read like a human. It retrieves passages, summarizes across them, and quotes the bits that look like answers. The smallest unit it consumes is a passage, not a page. So if your pricing lives behind a “talk to sales” form, your differentiators are buried 800 words into a marketing page, your facts only render in JavaScript, or your key numbers are inside screenshots, AI skips them or paraphrases them badly.

The concrete failure mode is the “contact us” pricing page. Every AI in the category has been asked “how much does [your product] cost?”, every AI has answered with whatever competitor number was actually extractable, and you’ve never seen this happen, because there’s no analytics event for “the AI gave up on your page.”

The fix has two parts.

Answer-first layout. Every key page opens with a 30 to 60 word direct summary in the first couple of sentences, ideally right after a question-style heading. This isn’t a stylistic preference. A 2026 study by Kevin Indig of 1.2 million ChatGPT responses (with an 18,012-citation validation set) found that 44.2% of ChatGPT citations come from the first 30% of webpage content (ALM Corp / Indig, 2026). Indig calls the distribution a ski ramp. Where the answer sits on the page is load-bearing. The study is ChatGPT-specific but the direction is consistent with how RAG pipelines chunk and rank passages across engines.
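A rough way to audit answer-first layout on your own pages is to check where the key fact first appears and how long the opening summary runs. This is a minimal sketch: the function names, the sample page, and the product “Acme Scheduler” are all hypothetical, and character offset is only a crude proxy for how an engine actually chunks a page.

```python
from typing import Optional


def answer_position(page_text: str, answer: str) -> Optional[float]:
    """Return where `answer` first appears, as a fraction of the page (0.0 = top).

    Returns None when the answer is not in the text at all.
    """
    index = page_text.lower().find(answer.lower())
    if index == -1:
        return None
    return index / len(page_text)


def summary_length_ok(summary: str, low: int = 30, high: int = 60) -> bool:
    """True when the opening summary falls in the 30 to 60 word range."""
    return low <= len(summary.split()) <= high


# Hypothetical page: question-style heading, direct summary, then the long tail.
summary = (
    "Acme Scheduler costs $49 per month per team, billed annually. "
    "The plan covers unlimited calendars, up to 25 seats, and email support. "
    "Larger teams pay by quote, and pricing is explained on this page."
)
page = (
    "How much does Acme Scheduler cost?\n\n"
    + summary
    + "\n\n"
    + "Background, history, and feature details follow here. " * 20
)

position = answer_position(page, "$49")
print("Price appears in the first 30% of the page:", position is not None and position < 0.30)
print("Summary is 30 to 60 words:", summary_length_ok(summary))
```

Run it against the plain text of each key page; anything that returns None or lands deep in the page is a candidate for a rewrite.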

Visible HTML facts, with schema that matches. Pricing, hours, scope, jurisdiction, what it is and who it’s for, all rendered in real HTML rather than images, PDFs, or JS-only state. Schema markup should match the visible content; no schema bloat, no contradictions. Pages should render usefully without JavaScript, or you ship SSR / prerender for bots. An Otterly accessibility audit on its 2026 sample found technical barriers (robots.txt blocks, CDN restrictions, JavaScript rendering issues) affecting roughly 73% of analyzed sites. Most owned surfaces fail this check, which makes it both the most common gap and the cheapest one to close.
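The “schema matches visible content” check can be approximated without a browser: parse the raw HTML the way a non-JavaScript crawler would, then confirm the price in JSON-LD also shows up in visible text. A sketch using only the Python standard library; the class and function names and the sample markup are hypothetical, and the substring match is deliberately naive (a price of 49 would also match 149).

```python
import json
from html.parser import HTMLParser


class PageFacts(HTMLParser):
    """Collect visible text and JSON-LD blocks from raw (unrendered) HTML."""

    def __init__(self):
        super().__init__()
        self.visible = []   # text a crawler can read without running JS
        self.json_ld = []   # parsed JSON-LD objects
        self._mode = None   # "jsonld", "skip", or None for normal text
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._mode, self._buffer = "jsonld", []
        elif tag in ("script", "style"):
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._mode == "jsonld":
                self.json_ld.append(json.loads("".join(self._buffer)))
            self._mode = None

    def handle_data(self, data):
        if self._mode == "jsonld":
            self._buffer.append(data)
        elif self._mode is None and data.strip():
            self.visible.append(data.strip())


def schema_price_is_visible(html: str) -> bool:
    """True when every JSON-LD offer price also appears in the visible text."""
    parser = PageFacts()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.visible)
    prices = [
        str(block["offers"]["price"])
        for block in parser.json_ld
        if isinstance(block, dict)
        and isinstance(block.get("offers"), dict)
        and "price" in block["offers"]
    ]
    return bool(prices) and all(price in text for price in prices)


# Hypothetical pricing page where markup and visible copy agree.
html = """
<html><body>
<h1>Acme Scheduler pricing</h1>
<p>Acme Scheduler costs $49 per month per team.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Acme Scheduler",
 "offers": {"@type": "Offer", "price": "49", "priceCurrency": "USD"}}
</script>
</body></html>
"""

print("Schema price matches visible text:", schema_price_is_visible(html))
```

If the price only exists in JSON-LD, or only renders after JavaScript runs, the check fails, which is the contradiction this layer is meant to catch.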

Bonus: ship llms.txt and llms-full.txt. Google has explicitly said it does not use these files for AI Overviews or AI Mode. Other AI systems and agents may still consume them. Infrastructure, not a Google-specific lever.
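For reference, a small generator for an llms.txt body. The layout here (H1 name, blockquote summary, H2 sections holding link lists) follows the public llms.txt proposal as I understand it; check the current spec before shipping. The function name, the entity, and the example.com URLs are all placeholders.

```python
def build_llms_txt(name, summary, sections):
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists.

    `sections` maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical entity and URLs, for illustration only.
llms_txt = build_llms_txt(
    "Acme Scheduler",
    "Scheduling software for small clinics.",
    {
        "Docs": [
            ("Pricing", "https://example.com/pricing", "Plans and per-seat costs"),
            ("About", "https://example.com/about", "Company, founder, and team"),
        ]
    },
)
print(llms_txt)
```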

E — Evidence

Question: Can AI verify your claims?

AI answer systems are trust-allocators. A claim with a credible third-party citation is stronger than the same claim sitting bare on a marketing page, even when the underlying fact is identical. A verifiable corpus is portable across models, prompts, and retrieval surfaces. A bare marketing page is a one-shot brochure that depreciates the moment a competitor publishes the same claim with sources.

The concrete failure mode: a bold quantitative claim with no method page. “We saved customers 47%,” with no methodology, no case study, no named client. To a human reader this is just vague. To a retrieval system it’s borderline unusable, because there’s no supporting passage to surface alongside the claim. The competitor with the linked case study wins the quote.

Major quantitative claims should link out to a method page, case study, or primary source. Original research belongs under your own URL. Case studies should carry named clients, real numbers, and dates. Reviews should exist on at least one independent platform (G2, Capterra, Trustpilot, App Store, GitHub stars, Product Hunt). Substantive articles deserve credentialed bylines. Pricing should be transparent, even if “starting at” or “by quote” is all you can publish, as long as the model is explained. Testimonials carry name, role, company, and (where possible) a link. Press, podcast appearances, talks: linked, not claimed.

The mechanism is more interesting than the recipe. The original GEO paper (Aggarwal et al., KDD 2024) reported that adding citations, quotations, and statistics could boost visibility by up to 40%, but that headline was measured against a synthesized-answer pipeline on the authors’ own benchmark, not live consumer engines. A 2025 replication on live ChatGPT, Perplexity, and Gemini by Chen et al. found the direction holds, with smaller magnitudes and significant engine-by-engine variance (Chen et al., 2025).

One nuance matters. An ACL 2024 paper (Wan, Wallace, Klein) found that current LLMs primarily score evidence on direct relevance to the query, and largely ignore stylistic cues humans treat as credibility signals (scientific references, neutral tone, that family) (ACL 2024). So Evidence-layer items earn their weight for two distinct reasons. The first is direct relevance: the supporting fact is inline and matches the user’s wording, so the page surfaces on verification-style queries. The second is displacement resistance: a credible primary source the model can ground in makes you harder to overwrite with lower-quality content on the same topic. “Looks credible” alone doesn’t win. “Looks credible and directly relevant” does.

E — Entity

Question: Does AI know exactly who or what you are?

Generative engines treat people, products, and organizations as first-class objects, separate from the pages they happen to appear on. If the entity is ambiguous, AI confuses it with another entity, describes it incorrectly, attributes it wrong, or skips it. Entity work gives AI a stable target to retrieve, connect, and cite.

Concrete failure mode: three different one-sentence descriptions across LinkedIn, X, and the About page. One says “AI tooling for SMBs,” another says “data infrastructure for mid-market,” a third says “a platform for builders.” A retrieval system that pulls all three has to guess which one the entity is, and “guess” usually means pick the most-cited one, or none. Founder ↔ company ↔ product relationships that feel obvious to you are invisible to a model that hasn’t been told.

This pattern is everywhere once you start looking for it: the founder remembers the original description, the marketing site got rewritten 18 months ago, LinkedIn was updated by the comms hire who got laid off, X still has the seed-stage tagline. The AI sees four entities. There’s one entity. Nobody told it.

The fix:

One canonical name across every surface (no drift between “Acme,” “Acme, Inc.,” “by Acme,” and “the Acme app”).
One one-sentence description reused everywhere.
Organization / Person / Product JSON-LD with sameAs linking every owned and platform-hosted profile back to a single canonical identity. Prioritize quality sameAs targets (canonical home, Wikidata, LinkedIn, GitHub, professional identifiers) over directory long tails that dilute disambiguation.
A Wikidata entry where notability allows.
A real About page (entity, founder, team, story, mission), not a marketing dump.
Disambiguation copy in title, description, and About when your name collides with a more famous namesake.
Consistent photos, logos, and bios across every profile.
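The JSON-LD item in the list above might look like the sketch below, built as a Python dict so the canonical description lives in one place. Every name, URL, and identifier here is a placeholder (including the Wikidata ID); the property names (`sameAs`, `founder`, `description`, `url`) are standard schema.org vocabulary.

```python
import json

# Hypothetical canonical identity. Swap in your own values.
CANONICAL_URL = "https://example.com"
DESCRIPTION = "Acme builds scheduling software for small clinics."

organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": f"{CANONICAL_URL}/#organization",
    "name": "Acme",
    "url": CANONICAL_URL,
    "description": DESCRIPTION,  # the same sentence reused on every surface
    "sameAs": [
        # Placeholder profile URLs: a few quality targets, no directory long tail.
        "https://www.wikidata.org/wiki/Q00000000",
        "https://www.linkedin.com/company/acme-example",
        "https://github.com/acme-example",
    ],
    "founder": {
        "@type": "Person",
        "@id": f"{CANONICAL_URL}/#founder",
        "name": "Jane Doe",
        "sameAs": ["https://www.linkedin.com/in/jane-doe-example"],
    },
}

# Paste the output into a <script type="application/ld+json"> tag on the About page.
print(json.dumps(organization, indent=2))
```

The nested `founder` block is what tells a model the founder, company, and product relationship instead of leaving it to guess.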

The mechanism: LLMs and search systems build internal entity graphs, partially explicit (Knowledge Graph, Wikidata) and partially learned. A clean graph appears to reduce hallucinations and wrong-person errors. The MultiHal benchmark (Lavrinovics et al., 2025) found that knowledge-graph-grounded retrieval improves hallucination scores by roughly 0.29 to 0.42 points across evaluated models (MultiHal, 2025). That’s a controlled benchmark result on KG-RAG versus vanilla QA, not direct evidence that ticking a Wikidata box causes any specific consumer AI to behave better on your entity in particular. It supports the mechanism, not a guaranteed real-world tactic.

Entity coherence is about the entity reading as the same person, product, or organization across every surface you publish on, regardless of who owns the URL. A consistent Substack with a stable bio, byline, and sameAs back to your canonical identity will outperform an inconsistent personal site with three different bios.
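Description drift is easy to check mechanically once the bios are collected by hand. A minimal sketch, assuming you paste each surface’s one-sentence description into a dict; the surface names and function names are hypothetical, and the comparison is exact-match after normalization, so a reworded bio counts as drift:

```python
import re


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace for comparison."""
    no_punctuation = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", no_punctuation).strip()


def find_drift(canonical, surfaces):
    """Return the surfaces whose description differs from the canonical one."""
    target = normalize(canonical)
    return sorted(name for name, text in surfaces.items() if normalize(text) != target)


canonical = "AI tooling for SMBs"
surfaces = {
    "about_page": "AI tooling for SMBs.",
    "linkedin": "Data infrastructure for mid-market",
    "x": "A platform for builders",
}

print("Surfaces that drifted:", find_drift(canonical, surfaces))
```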

N — Notability

Question: Does the wider web corroborate that you matter?

AI systems don’t only read your site; they cross-reference. Notability is the layer that earns external corroboration, and it shows up in two forms. There’s what other people say about you, and there’s the visible reach of the properties you author yourself on high-authority platforms.

Concrete failure mode: strong owned content, zero third-party footprint, no platform-hosted reach. The owned pages are clean, but no independent publication has mentioned you, Reddit’s never heard of you, no “best of” includes you, the founder has no trail outside the company, and the LinkedIn account is a single-line bio last touched in 2023. The corpus is one voice talking about itself, and AI has nothing to corroborate.

The fix is two-sided.

Third-party signal: mentions on independent publications in the category, inclusion in “best of,” “top N,” and “alternatives to” lists, organic presence in subject-relevant subreddits and forums, backlinks from genuinely high-authority domains, Wikipedia mentions where notability standards allow, a speaking / podcast-guest / guest-writing trail, partnerships visible on the partner’s own site, and customer-side case studies hosted on the customer’s domain.

Platform-hosted reach: subscribers on a Substack or Beehiiv, stars on a GitHub repo, listeners on a podcast you host, visible follower / engagement signals on LinkedIn or YouTube, and citations or downloads on academic platforms (Google Scholar, SSRN, arXiv).

The empirical case is heavier on this layer than any other. Chen et al. found that earned (third-party) coverage moves recommendation likelihood more reliably than equivalent claims on owned pages. Semrush’s threshold-vs-linear finding (Spearman 0.57 between Authority Score and AI mentions, threshold effects beating linear scaling) means a small number of links from upper-tier domains outperforms a long tail of weak referrers, and the same study confirmed nofollow ≈ follow for AI weight, which matters because most platform-hosted notability signals on LinkedIn, Crunchbase, and similar surfaces are nofollow by default.

Community platforms are the single largest citation surface in the corpus. Otterly’s 52.5% community vs. 47.5% brand-domain split holds on a 1M+-citation sample. ALM Corp’s 325K-prompt study has Reddit at #1 and LinkedIn at #2 across ChatGPT Search, Google AI Mode, and Perplexity, with LinkedIn rising from roughly #11 in November 2025 to #5 by February 2026 on ChatGPT alone (the largest single-domain authority shift the study observed). The same ALM analysis found 75% of cited LinkedIn authors post 5+ times monthly and 95% of cited content is original (not reshares). A LinkedIn account that’s a polished bio plus three posts from 2022 isn’t doing anything for you. Real cadence with original content is what shows up in citations.

This is the layer most resistant to short-term tactics. It compounds. Notability earned this year is still earned in three years; the faked version degrades fast. Pay-to-play directory awards, press releases on aggregator sites, reciprocal-citation rings: they’re shaped like Notability and they aren’t.

4. The checklist

The checklist is what you run. It lives at seen.deeflect.com/checklist.

41 items across four layers (10 for Structure, 10 for Evidence, 10 for Entity, 11 for Notability). For each item, tick 1 if it’s true today and 0 if it’s missing, vague, or only partially true. Sum each layer.

The weakest layer caps the whole system. The shortest leg sets the height.

How to read your score:

Below half on any single layer. That’s your biggest blocker; fix it before anything else. Don’t spread effort evenly. The multiplicative property of the framework means a 9/10 in Structure paired with a 2/11 in Notability performs worse in practice than an even spread of 6s across all four layers, even when the raw sum is higher. The weakest layer caps the system.
20 to 29 of 41 total. One or two weak layers are carrying the rest. Lift comes from raising the weakest letter.
30+ of 41 total. Foundation’s solid. Now make it compound: Notability durability over time, Evidence freshness, the recurring content that keeps the corpus current.
41 of 41. Usually means the rubric is too forgiving for your case. Tighten the bar or re-run honestly.

Re-run quarterly. The score should move with deliberate work, not by luck.

Some items are marked [site]: skip them without penalty if you’re a pure platform-hosted entity with no primary owned site, and renormalize over the items that applied. URL ownership isn’t required; coherence across whatever surfaces you publish on is.
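The scoring rules above can be sketched in a few lines. The layer sizes come from the checklist; the tick counts and the number of skipped [site] items below are invented for illustration, and the function names are mine:

```python
# Items per layer, as stated for the 41-item checklist.
LAYER_SIZES = {"Structure": 10, "Evidence": 10, "Entity": 10, "Notability": 11}


def layer_fractions(ticks, skipped=None):
    """Turn ticked-item counts into per-layer fractions.

    `skipped` holds how many [site] items were skipped per layer; those are
    removed from the denominator, which is the renormalization step.
    """
    skipped = skipped or {}
    fractions = {}
    for layer, size in LAYER_SIZES.items():
        applicable = size - skipped.get(layer, 0)
        fractions[layer] = ticks[layer] / applicable
    return fractions


def diagnose(ticks, skipped=None):
    """Return the weakest layer and every layer scoring below half."""
    fractions = layer_fractions(ticks, skipped)
    weakest = min(fractions, key=fractions.get)
    blockers = [layer for layer, score in fractions.items() if score < 0.5]
    return weakest, blockers


# Hypothetical audit: a platform-hosted entity that skipped 2 Structure items.
ticks = {"Structure": 7, "Evidence": 7, "Entity": 7, "Notability": 2}
skipped = {"Structure": 2}

weakest, blockers = diagnose(ticks, skipped)
print("Fix first:", weakest)
print("Below half:", blockers)
```

Comparing fractions rather than raw counts keeps the 11-item Notability layer and any renormalized layer on the same scale as the others.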

The checklist diagnoses whether the foundation is in place for AI to behave well on your entity. The behavior itself depends on prompt, surface, model, account state, location, time, and a dozen other things you don’t control.

5. What SEEN does not claim

SEEN doesn’t guarantee citations, mentions, recommendations, or traffic. What it improves is the substrate. AI answer surfaces are stochastic: even a well-structured corpus gets ignored on some queries and over-cited on others. Anyone selling guaranteed AI rankings is selling the SEO snake oil of 2007 with a new label and a chatbot logo.

SEEN doesn’t replace SEO. Different surfaces, different signals, shared infrastructure (crawl, index, structure, links). Most entities need both. SEEN assumes the SEO basics are handled; if they aren’t, fix those first, because they’re the substrate every other layer sits on top of.

llms.txt, schema, and one-shot rewrites are infrastructure, not magic levers. Google has stated explicitly that it does not use llms.txt for AI Overviews or AI Mode. Other AI systems and agents may still consume the file. The right framing is “low-cost infrastructure that might be read somewhere,” not “the secret trick the SEO industry doesn’t want you to know.” Schema helps when it matches visible content; it stops helping the second it contradicts the page. No single tactic flips a weak corpus into a recommended one.

AI behavior varies by prompt, surface, model, account state, location, and time. A query pulling your competitor today might pull you next month after a model update or a routing change you’ll never see logged. Cross-engine consistency is low: Chen et al. confirmed that a tactic lifting visibility on one engine doesn’t necessarily lift it on another. Engine-specific results stay engine-specific.

Single screenshots are weak evidence. The category is full of “I asked ChatGPT and look what it said” posts. One run from one account on one day proves almost nothing. Durability across repeated runs, prompt variants, and multiple engines is what actually matters, and almost nobody is measuring that yet.
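Measuring durability can start small: log every answer you collect, then compute how often the entity shows up per engine. A sketch under loud assumptions: the engine labels, prompts, and answers below are invented, answers are recorded by hand or by whatever tooling you already have, and a substring match stands in for real mention detection.

```python
from collections import defaultdict


def mention_rates(runs, entity):
    """Share of recorded answers per engine that mention `entity`.

    `runs` is a list of (engine, prompt, answer) tuples.
    """
    mentioned = defaultdict(int)
    total = defaultdict(int)
    for engine, _prompt, answer in runs:
        total[engine] += 1
        if entity.lower() in answer.lower():
            mentioned[engine] += 1
    return {engine: mentioned[engine] / total[engine] for engine in total}


# Hypothetical log: repeated runs, two prompt variants, two engines.
runs = [
    ("engine_a", "best scheduling tool for clinics", "Try Acme, Beta, or Gamma."),
    ("engine_a", "best scheduling tool for clinics", "Beta and Gamma are popular."),
    ("engine_a", "who makes clinic scheduling software", "Acme and Beta."),
    ("engine_b", "best scheduling tool for clinics", "Beta, Gamma, Delta."),
    ("engine_b", "who makes clinic scheduling software", "Gamma and Delta."),
]

rates = mention_rates(runs, "Acme")
for engine, rate in sorted(rates.items()):
    print(f"{engine}: mentioned in {rate:.0%} of runs")
```

A rate built from dozens of runs per engine says something a single screenshot can’t.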

Most GEO content right now is half-built. The author optimized one layer and skipped three. They publish a Structure-only checklist (header tags, schema, FAQs) and call it a framework. They post “how to rank in ChatGPT” on a personal blog with no Wikidata entry, no third-party trail, no sameAs graph, one cited screenshot, and the implicit promise that this is the whole job. Run-grade SEEN work is the opposite: a 41-item audit run honestly, the lowest-letter fix shipped this quarter, a re-run scheduled for next, and at least one external surface where the entity reads as itself.

SEEN is the foundation that gives AI a fair chance to find, read, verify, and recommend you correctly. A wrong foundation can’t be papered over with tactical optimization later. Different problem.

6. The bet

Being findable this decade comes down to whether AI systems can cleanly find, read, verify, and recommend you. Search engines aren’t going anywhere; rank still controls clicks on the surfaces where the answer hasn’t already been generated for the user. But for a meaningful slice of queries (about 18% of Google searches in the Pew March 2025 panel, derived from 12,593 of 68,879 sampled searches showing an AI summary, plausibly higher now and rising fast on the surfaces that aren’t Google), the answer is the surface. The page underneath functions as the model’s reference material, not as the user’s destination. Most entities aren’t built for that yet.

Most “AI visibility” content right now is screenshots, vibes, and rebranded SEO with the word “generative” duct-taped to the front. Most companies don’t know whether their pricing is extractable, whether their entity graph is coherent, whether they appear in any community surface AI cites, or whether their corpus survives a second-engine run. They aren’t ignoring the problem; they literally cannot see it. There’s no analytics dashboard for “the AI gave up on your page,” because the failure mode is silence.

Build the entity with the cleanest corpus in your category before AI surfaces become the default way buyers, hires, and partners find companies. Most operators won’t get there, because they’ll keep getting blanked in their category without ever naming the failure mode, running the audit, or figuring out which layer was capping the system. SEEN exists so being ready is a checklist instead of a guess.

None of this shows up in a quarterly dashboard, which is why most people miss it. A coherent entity graph gets more durable as the web links into it. Platform-hosted reach grows by posting on cadence with real content rather than reshares. Third-party Notability earned this quarter is still earned in three years. Structure and Evidence are mostly one-time substrate work. The whole thing stacks until the day a buyer asks an AI in your category and your name is the one returned.

This is corpus engineering, not copywriting, not SEO with a coat of paint, and not the LinkedIn-thought-leader version of GEO. The work is engineering an evidence corpus so that any reasonable retrieval system, given any reasonable question your buyer might ask, can find you, read you, verify you, and recommend you for the right job.

The bet behind SEEN: the operators who treat their corpus as engineering are the ones who get recommended in the AI layer that matters in 2027 and beyond. Not the loudest GEO accounts on X. The ones who ran the checklist, fixed the weakest layer first, and re-ran it the next quarter.

7. What to do next

Run the checklist against your own entity. Free at seen.deeflect.com/checklist. About 30 minutes the first time. Score honestly. Fix the lowest letter first.
Subscribe to the quarterly state-of-GEO report at seen.deeflect.com. Recurring data on which engines cite what, what’s moving, what’s still hype. One email per quarter, no spam.
DM me if you want your entity featured in a public SEEN audit. The framework run end-to-end on a real company, findings posted publicly (anonymized on request). Bias toward people who want the feedback, not the marketing.

— Dee Kargaev, May 2026