How to Check If ChatGPT Has Indexed Your Website (2026)
To check whether ChatGPT can use your website, stop asking whether you're "indexed" — ask whether ChatGPT will surface and cite your URL when someone asks a relevant question. ChatGPT does not build a public, queryable index of the web the way Google does. It retrieves a small set of sources at answer time through its search layer, reads them, and cites a few. So the practical test is not "is my page in a database" — it's "when I ask a question this page should answer, does ChatGPT find it, read it, and link to it." You can spot-check that manually in a few minutes, and the rest of this post shows you how, plus the technical reasons a live page can be invisible to ChatGPT even while it ranks on Google.
This is the ChatGPT-specific companion to the broader guide on AI-engine retrievability and to the Perplexity guide, both listed under Related reading. ChatGPT is the engine where this check is most reliable today, because its citations are visible inline and its crawler behavior is documented by OpenAI, so it's the right place to start.
How ChatGPT actually pulls sources
When ChatGPT answers a question that needs current or factual information, it issues a search, retrieves a handful of candidate pages, and synthesizes an answer that cites a few of them inline. You can see the citations in the response — small linked references next to the sentences they support. That is the entire visible surface: a few cited URLs per answer, not a ranked page of ten blue links.
Two things follow from that design. First, the competition per answer is brutal — being the 8th-best source for a query usually means being cited zero times, not "page two." Second, ChatGPT's selection skews toward consensus and encyclopedic sources: pages that corroborate what other reputable pages say, that are easy to parse, and that carry recognizable entity signals. A page that is technically live but says something only your site says, in a format that's hard to extract, is a weak retrieval candidate even when it loads fine in a browser.
OpenAI's own documentation is the anchor here. The company publishes the user agents it operates and how they respect robots.txt — which is what makes ChatGPT checkable in a way some other engines aren't.
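Because the agent names are published, a robots.txt file can be tested against them offline. The sketch below uses Python's standard `urllib.robotparser`; the sample robots.txt, the `example.com` URL, and the function name `openai_agent_access` are illustrative assumptions, not anything taken from OpenAI's documentation. Python's parser is only an approximation of how a given crawler reads the file, so treat a result as a hint rather than proof.

```python
from urllib.robotparser import RobotFileParser

# The three OpenAI agent tokens named in this post.
OPENAI_AGENTS = ("OAI-SearchBot", "ChatGPT-User", "GPTBot")


def openai_agent_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Return {agent: allowed?} for `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in OPENAI_AGENTS}


# Hypothetical robots.txt: Googlebot is welcome, OAI-SearchBot is shut out,
# and every other agent falls through to the wildcard group.
SAMPLE_ROBOTS = """\
User-agent: Googlebot
Allow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

if __name__ == "__main__":
    access = openai_agent_access(SAMPLE_ROBOTS, "https://example.com/blog/post")
    for agent, allowed in access.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

To run it on a real site, paste the contents of your own robots.txt in place of `SAMPLE_ROBOTS` and pass the URL of the page you care about.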
The manual check: does ChatGPT surface your URL?
You don't need a tool to do a first-pass check on a single page. The general technique is to make ChatGPT try to retrieve a page you control and watch whether it surfaces and cites your URL.
There are two reliable variants:
- Unique-phrase check. Copy a distinctive sentence from the page you care about — ideally a phrase that appears nowhere else on the web — and ask ChatGPT (with search enabled) a question whose natural answer would pull that page up. If the page is retrievable, ChatGPT should surface it and cite your URL. If a phrase that exists only on your live page never produces your page, that's a signal something is blocking retrieval.
- URL/site-indicator check. Point ChatGPT at the page or domain directly and ask it to read and summarize what's there. If ChatGPT can fetch and accurately describe the page's actual content, the page is reachable by its live-fetch bot. If it guesses, hedges, or describes only generic boilerplate, the page may be blocked or render-broken for the bot.
A few discipline notes. Run each check more than once — ChatGPT's answers are non-deterministic, and a single miss isn't proof of a problem. And ignore the 2024-era "operator tricks": the naive site:-style and copy-paste prompt hacks that circulated two years ago are unreliable in 2026 because ChatGPT's search behavior has changed and the results vary run to run. Treat the unique-phrase and URL checks as your primitives, repeat them, and read the pattern rather than any single answer.
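One way to read the pattern rather than any single answer is to log the cited URLs from each run by hand and compute a hit rate. This is a minimal sketch under assumptions: the `RUNS` data is invented, `citation_rate` and `normalize` are hypothetical helper names, and query strings are dropped because cited links may carry tracking parameters.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path so tracking parameters and trailing slashes don't matter."""
    parts = urlsplit(url.strip())
    return parts.netloc.lower() + parts.path.rstrip("/")


def citation_rate(runs: list[list[str]], target_url: str) -> float:
    """Fraction of runs whose cited URLs include target_url."""
    if not runs:
        raise ValueError("record at least one run")
    target = normalize(target_url)
    hits = sum(1 for cited in runs if target in {normalize(u) for u in cited})
    return hits / len(runs)


# Hypothetical log: one inner list per repeated run of the same question,
# holding the URLs that answer cited (copied by hand from the response).
RUNS = [
    ["https://example.com/guide?utm_source=chatgpt.com", "https://en.wikipedia.org/wiki/Example"],
    ["https://en.wikipedia.org/wiki/Example"],
    ["https://example.com/guide/"],
    [],
]

if __name__ == "__main__":
    rate = citation_rate(RUNS, "https://example.com/guide")
    print(f"cited in {rate:.0%} of runs")
```

A page cited in some runs but not others is retrievable and is losing the per-answer competition some of the time. A page cited in none of them is the case worth diagnosing.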
What this manual check is good for: confirming or ruling out a problem on one important page. What it is not good for: telling you which of dozens of pages on a site are retrievable, or catching the day a robots.txt change quietly locks ChatGPT out of your blog.
What blocks ChatGPT from retrieving a page
When a live page won't surface in ChatGPT, the cause is almost always one of four things. Run this as a checklist on any page that should be appearing and isn't.
| Factor | What to check | Why it matters |
|---|---|---|
| Bot access (robots.txt) | Is OAI-SearchBot allowed? GPTBot? ChatGPT-User? | These are three separate OpenAI agents. OAI-SearchBot powers ChatGPT search indexing; ChatGPT-User does live, user-triggered fetches; GPTBot collects training data. Block the wrong one and you're invisible to that path while still ranking on Google. |
| Rendering | Does the page's content exist in the server-rendered HTML, or only after client-side JavaScript runs? | Crawlers that don't execute JavaScript see an empty shell. Content that exists only after client-side rendering can be unreadable to a fetch bot. Server-side rendering or static HTML is the safe default. |
| Discoverability | Is the page in your XML sitemap? Is it internally linked from pages that are themselves crawlable? | Orphaned pages with no internal links and no sitemap entry are hard to discover. Inclusion in a sitemap plus real internal links is the baseline. |
| Content quality | Is the page thin, duplicated, or boilerplate that restates other sources without adding anything? | Even a perfectly accessible page loses the per-answer competition if it's thin or duplicative. ChatGPT favors sources that add corroborated, extractable substance. |
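The rendering row can be approximated offline: save the raw HTML your server returns (for example with "view source", not the inspector's live DOM) and check whether a distinctive phrase from the page appears in its text outside of script tags. The sketch below assumes a fetcher that does not execute JavaScript; the two HTML samples and the name `phrase_in_raw_html` are made up for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside script, style, or template tags."""

    SKIP = {"script", "style", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def phrase_in_raw_html(html: str, phrase: str) -> bool:
    """True if `phrase` appears in the HTML's text without running any JavaScript."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse whitespace so line breaks and inline tags don't cause false misses.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    needle = " ".join(phrase.split()).lower()
    return needle in text


# Hypothetical pages: one server-rendered, one that only renders client-side.
SSR_HTML = "<html><body><h1>Guide</h1><p>Our 2026 audit found <em>purple</em> widgets.</p></body></html>"
CSR_HTML = '<html><body><div id="root"></div><script>render("Our 2026 audit found purple widgets.")</script></body></html>'

if __name__ == "__main__":
    phrase = "Our 2026 audit found purple widgets."
    print("server-rendered:", phrase_in_raw_html(SSR_HTML, phrase))
    print("client-rendered:", phrase_in_raw_html(CSR_HTML, phrase))
```

If the phrase is missing from the raw HTML but visible in a browser, the content most likely depends on client-side JavaScript.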
The first factor is the one agencies miss most often. A firewall or CDN rule, or a robots.txt that was written for Googlebot and never updated for AI crawlers, can allow Google through while quietly blocking OAI-SearchBot. The page ranks on Google, the client assumes everything is fine, and ChatGPT has never been able to read it. Because OpenAI publishes the exact user-agent strings, this is verifiable — but only if someone actually checks the bot-level access rather than eyeballing the page in a browser.
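A rough way to check bot-level access yourself is to request the same URL with different User-Agent headers and compare status codes. The sketch below is an assumption-laden illustration: the User-Agent values are placeholders (copy the exact strings from OpenAI's documentation before relying on it), `blocked_agents` and `fetch_status` are hypothetical names, and a spoofed header sent from your own machine will not reveal rules keyed on IP address rather than User-Agent.

```python
import urllib.error
import urllib.request
from typing import Callable

# Placeholder User-Agent values, NOT the real strings. Replace each one with
# the full string published in OpenAI's bot documentation.
AGENTS = {
    "browser": "Mozilla/5.0 (placeholder browser user agent)",
    "OAI-SearchBot": "OAI-SearchBot (placeholder)",
    "ChatGPT-User": "ChatGPT-User (placeholder)",
    "GPTBot": "GPTBot (placeholder)",
}


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """HTTP status code for a GET request sent with the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 403, 429, 503 and similar arrive as exceptions


def blocked_agents(
    url: str,
    fetch: Callable[[str, str], int] = fetch_status,
    agents: dict[str, str] = AGENTS,
    baseline: str = "browser",
) -> list[str]:
    """Agents that receive an error status while the baseline User-Agent gets through."""
    statuses = {name: fetch(url, ua) for name, ua in agents.items()}
    if statuses[baseline] >= 400:
        return []  # the page fails for everyone, so this is not a bot-specific block
    return [name for name, code in statuses.items() if name != baseline and code >= 400]


# Usage (performs real network requests):
#   print(blocked_agents("https://example.com/blog/post"))
```

The `fetch` parameter is injectable so the comparison logic can be exercised without touching the network, which also makes it easy to swap in a different HTTP client.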
Where OpenLens fits
The manual check works for one page at a time. The problem is that real sites have hundreds of pages, robots rules change, and ChatGPT's behavior moves — so a one-time spot-check goes stale.
OpenLens automates the retrievability question across every page rather than one at a time, and maintains the method as ChatGPT changes so the check doesn't quietly break. Its Site & Agent Readiness audit produces a 0-100 score that flags two of the failure modes in the checklist above: bot-access blocks and rendering problems. It also spoofs the relevant crawlers (OAI-SearchBot, GPTBot, ChatGPT-User, and others) and reports whether each one is actually being blocked at your server or CDN, not just whether a human browser can load the page. That last point matters: bot-level blocks are invisible to anyone testing in a normal browser.
ChatGPT is the engine OpenLens checks most reliably today — visible citations and documented crawlers make it the cleanest signal. OpenLens tracks visibility across 7 AI platforms, and the free tier needs no credit card, so you can run the readiness audit on a client site before deciding whether systematic tracking is worth it. If you are comparing options, our rundown of the best free AI visibility tools for marketing agencies and the detailed OpenLens versus Profound comparison lay out where each tool fits.
This won't replace judgment — a low readiness score tells you where to look, not what to write — but it turns "I think we might be blocked in ChatGPT" into a number you can hand a client and a list of pages to fix.
The short version
ChatGPT doesn't index your site the way Google does — it retrieves and cites a few sources per answer. So the question to answer is whether ChatGPT will surface your URL, not whether you're "in the index." Spot-check a single important page with a unique-phrase or URL check, repeated a few times. If a live page won't surface, walk the four-factor checklist: bot access, rendering, discoverability, content quality. And because robots rules and ChatGPT's behavior both drift, treat any check — manual or automated — as a snapshot, not a settled answer.
Last updated June 18, 2026.
Sources: OpenAI bot documentation, GPTBot / OAI-SearchBot / ChatGPT-User user agents and robots.txt behavior (platform.openai.com/docs/bots); Google Search Central documentation on crawling, indexing, and rendering (developers.google.com/search); Semrush, AI Overviews and the future of search study (2024-2026); BrightLocal, Local AI Search Report 2026; Aggarwal et al., GEO: Generative Engine Optimization (Princeton/Georgia Tech/Allen Institute, 2024) on how generative engines select and cite source content.
Frequently Asked Questions
- Does ChatGPT crawl my website?
- Not in the way Google does. OpenAI runs three separate bots — GPTBot (training data), OAI-SearchBot (search indexing), and ChatGPT-User (live fetches triggered by a user's question). The one that matters for being cited in answers is OAI-SearchBot. If your robots.txt blocks it, your pages can't be surfaced in ChatGPT search, even if your site ranks well on Google.
- Does ChatGPT have an index like Google?
- Not a public one you can query. ChatGPT doesn't maintain a browsable index of your site the way Google Search Console reports indexed pages. It retrieves a handful of sources at answer time through its search layer. So the real question isn't 'am I indexed' — it's 'will ChatGPT surface and cite my URL when someone asks a relevant question.'
- How do I check if a specific page is retrievable in ChatGPT?
- Spot-check it. Take a unique phrase from the page — a sentence that appears nowhere else — and ask ChatGPT (with search on) a question that should pull it up, or reference the URL directly, and see whether ChatGPT surfaces and cites that page. If it consistently can't find a page that's live and unblocked, you likely have a retrieval problem worth diagnosing.
- Do the old ChatGPT 'site:' operator tricks still work?
- Not reliably. The naive operator and prompt tricks that circulated in 2024 produce inconsistent results in 2026 because ChatGPT's search behavior changed and answers are non-deterministic. A single run tells you little. You need a unique-phrase or URL-reference check, repeated, rather than a one-off operator query.
- Why isn't my page showing up in ChatGPT even though it ranks on Google?
- The most common cause is bot access: your robots.txt or firewall allows Googlebot but blocks OAI-SearchBot or ChatGPT-User, the two agents that feed ChatGPT's answers (GPTBot only collects training data). Other causes: the page renders only via client-side JavaScript, it isn't in your sitemap or internally linked, or the content is thin or duplicated. Google and ChatGPT use different crawlers and different selection logic, so ranking on one doesn't guarantee the other.
- How often should I re-check ChatGPT retrievability?
- Quarterly for a stable site, and again 4-6 weeks after any structural change — a robots.txt edit, a redesign, a migration to a JavaScript framework, or a new firewall rule. ChatGPT's search layer and OpenAI's crawler behavior shift over time, so a check that passed six months ago is not evidence the page is retrievable today.
Related reading
- How to Check If AI Engines Have Indexed Your Website (2026)
- How to Check If Perplexity Has Indexed Your Site (2026)
- How to Check If Your Business Appears in ChatGPT, Google AI Overviews, Perplexity, and DeepSeek — A Free 5-Minute Method
- What Gets Quoted vs Excerpted in ChatGPT: 6 Sentence Patterns That Win Citations