What is technical SEO for AI search?

Technical SEO for AI search is the set of crawl, render, and structure rules that decide whether ChatGPT, Perplexity, Gemini, and Google AI Overviews can cite your pages. It covers 22 items across three tiers: crawlability and rendering (robots.txt, sitemap, JS rendering, llms.txt, canonicals), structure and extractability (heading hierarchy, semantic HTML, schema markup, anchor links), and performance signals (LCP under 2.5s, INP under 200ms, CLS under 0.1). It is not a longer version of a 2019 audit. AI engines extract scoped answers from structured HTML, so the rules changed.

How do I check if AI crawlers can access my site?

Open /robots.txt and look for blocks on GPTBot, PerplexityBot, ClaudeBot, and Google-Extended. Many sites accidentally block these via outdated WordPress plugins or boilerplate Disallow rules. Unless you have a specific reason to block them, allow all four. Then run a live test in Google Search Console's URL Inspection tool to confirm the rendered HTML matches the source HTML. If critical content only appears after JavaScript executes, AI crawlers that do not render JS will miss it. Move above-the-fold content to server-side rendering using Next.js SSR, ISR, or static generation.

What is llms.txt and do I need it?

llms.txt is a root-level file at /llms.txt, formatted per the llmstxt.org spec proposed in late 2024 and now showing up in production crawlers. Include a one-sentence site definition, links to your top 10 to 20 most important pages, and a brief description for each. It is low-cost to implement and explicitly signals which pages you want AI engines to ingest. Add it alongside your sitemap, not in place of it. For most B2B and ecommerce sites, this is a one-hour engineering task that takes effect within the next crawl cycle.

Which schema markup matters most for AI citations?

FAQPage schema is the highest-leverage single addition for AI citation pickup. Beyond that, implement Organization, WebSite, and BreadcrumbList sitewide as the baseline. Add page-type schema where relevant: Article for posts, Product and SoftwareApplication for SaaS pricing, HowTo for instructions, Service for service pages, LocalBusiness for location pages. Validate everything with Google's Rich Results Test. Google restricted FAQ rich results in classic SERPs to a narrow set of government and health sites in 2023, but the schema can still act as a citation signal for Perplexity, Microsoft Copilot (formerly Bing Chat), and AI Overviews.

What Core Web Vitals thresholds should I hit in 2026?

Three numbers at the 75th percentile of real users: LCP under 2.5 seconds, INP under 200 milliseconds, CLS under 0.1. INP replaced First Input Delay as a Core Web Vital in March 2024, so older audits measuring FID are out of date. Also keep TTFB under 600ms, since the rest of the performance stack cannot compensate for a slow server. Fix order: optimize the LCP hero image, remove loading=lazy from the LCP element, defer non-critical JS for INP, reserve image and ad dimensions for CLS, enable a CDN for TTFB.

Technical SEO Checklist for Brands That Don't Rank in AI Search 2026

Most "technical SEO" checklists still optimize for blue links. The 22 items below decide whether ChatGPT, Perplexity, and AI Overviews can cite you in 2026. They are ranked by impact, grouped into three tiers, and followed by a 30-day fix order you can hand to an engineer.

A technical SEO checklist for AI search is not a longer version of the 2019 audit. AI engines crawl differently, render differently, and cite differently than Googlebot did when ten blue links were the whole result set. AI Overviews rolled out broadly in May 2024[1]. INP replaced FID as a Core Web Vital in March 2024[2]. The llms.txt proposal landed in late 2024 and is now showing up in production crawlers[3]. If your checklist predates these shifts, it is auditing the wrong site.

This is the 22-point list. Each item names the tool to use and the threshold that counts as "passing."

Why the Technical SEO Checklist Changed

Three shifts force the rewrite:

AI engines crawl differently. Perplexity, ChatGPT, and Gemini bots respect different rules than Googlebot. Some honor robots.txt strictly. Some do not render JavaScript. Some only fetch the first 50KB of HTML.
AI engines render differently. AI Overviews extract specific facts from specific HTML elements. Answers buried below client-side-rendered components frequently get missed.
AI engines cite differently. Citations go to pages with extractable, scoped, structured answers. Generic 2,000-word blog posts with the answer in paragraph 14 lose to a 400-word FAQ page with schema.

If your site ranks in Google's classic results but never gets cited in AI Overviews or ChatGPT, the gap is almost always in this checklist. For the parallel discussion of what content earns citations, our guide on how to get cited in AI Overviews covers the content side.

Tier 1: Crawlability and Rendering (8 Items)

If AI bots cannot reach or read the page, nothing else matters.

1. robots.txt Allows AI Crawlers

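Check /robots.txt for blocks on GPTBot, PerplexityBot, ClaudeBot, and Google-Extended. Note that Google-Extended is a robots.txt control token rather than a separate crawler: it governs whether content Google has crawled can be used for Gemini, and it does not affect inclusion in Google Search.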
Decision: allow them unless you have a specific reason to block.
Many sites accidentally block these via outdated WordPress plugins or boilerplate Disallow rules. Fix immediately.
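
The check above can be scripted. The sketch below uses Python's standard-library robots.txt parser against a robots.txt body you have already downloaded. The function name, the sample file, and the example.com URL are illustrative assumptions, not part of any tool named in this checklist.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in item 1.
AI_AGENTS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def blocked_ai_agents(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI user-agent tokens that robots_txt disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that blocks one AI crawler and allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_ai_agents(SAMPLE))
```

Run it against the homepage and one deep URL per template, since Disallow rules are often path-specific.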

2. Sitemap Is Current and Submitted

Sitemap exists at /sitemap.xml and is referenced in robots.txt.
Submitted in Google Search Console and Bing Webmaster Tools.
Only includes canonical, indexable URLs. No noindex, no redirects, no 404s.
Updated within 7 days of a new page publishing.
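
One way to enforce the "canonical, indexable URLs only" rule is to cross-check the sitemap against crawl output. This is a minimal sketch: it assumes you have already exported crawl results into a dict of URL to (status code, indexable flag), which is a hypothetical shape, not a Screaming Frog export format.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Extract every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


def bad_entries(xml_text: str, crawl: dict[str, tuple[int, bool]]) -> dict[str, str]:
    """Map each sitemap URL that should not be listed to the reason why."""
    bad = {}
    for url in sitemap_urls(xml_text):
        if url not in crawl:
            bad[url] = "not in crawl data"
            continue
        status, indexable = crawl[url]
        if status != 200:
            bad[url] = f"status {status}"
        elif not indexable:
            bad[url] = "not indexable"
    return bad


# Hypothetical two-URL sitemap.
SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old</loc></url>
</urlset>
"""
```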

3. JavaScript Rendering Depth

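Test with the URL Inspection tool's "Live Test" in Google Search Console and compare the rendered HTML against the raw page source. (Google retired the standalone Mobile-Friendly Test in December 2023.)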
If the rendered HTML differs significantly from the source HTML, AI crawlers that do not render JS will miss content.
Decision: render critical content server-side, for example with Next.js SSR, ISR, or static generation. Avoid client-side-only rendering for above-the-fold content.
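
To put a number on "differs significantly," you can compare the visible words in the raw source against the rendered HTML. The sketch below assumes you have saved both versions as strings (for example, view-source output and the rendered HTML copied from a live test). The word-level comparison is a rough heuristic chosen for illustration, and no pass/fail cutoff is implied.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible words, skipping <script>, <style>, and <noscript>."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self.words: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def visible_words(html: str) -> list[str]:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.words


def render_gap(source_html: str, rendered_html: str) -> float:
    """Share of rendered words that never appear in the raw source (0.0 to 1.0)."""
    rendered = visible_words(rendered_html)
    if not rendered:
        return 0.0
    source = set(visible_words(source_html))
    missing = [word for word in rendered if word not in source]
    return len(missing) / len(rendered)
```

A gap near 1.0 means almost everything a user sees is injected by JavaScript, which is the case item 3 tells you to fix first.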

4. llms.txt Implementation

Add /llms.txt at the root. Format per the llmstxt.org spec[3].
Include a one-sentence site definition, links to top 10 to 20 most important pages, brief descriptions for each.
This is low-cost and explicitly signals which pages you want AI engines to ingest.
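
A small generator keeps /llms.txt in sync with your priority-page list. This sketch follows the general shape described at llmstxt.org (an H1 title, a blockquote summary, then a section of Markdown links with short notes). The "Key pages" section name, the function name, and the sample data are assumptions for illustration.

```python
def build_llms_txt(site_name: str, summary: str, pages: list[tuple[str, str, str]]) -> str:
    """Render llms.txt content: H1 title, blockquote summary, linked page list.

    pages is a list of (title, url, description) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, url, description in pages:
        lines.append(f"- [{title}]({url}): {description}")
    return "\n".join(lines) + "\n"


# Hypothetical site data.
content = build_llms_txt(
    "Example Co",
    "Example Co sells invoicing software for freelancers.",
    [("Pricing", "https://example.com/pricing", "Plans and prices")],
)
print(content)
```

Write the result to the web root at deploy time so the file never drifts from the pages you actually want ingested.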

5. Canonical Hygiene

Every page has a self-referencing <link rel="canonical"> or a canonical pointing to the preferred URL.
No conflicting signals (canonical + noindex + sitemap inclusion all pointing different ways).
Use Screaming Frog to audit. Sort by canonical URL and flag mismatches.
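
If you want to spot-check a page without a full crawl, the conflicting-signal cases can be detected from the HTML itself. This is a sketch under stated assumptions: it only reads the canonical link and the robots meta tag, it ignores HTTP headers such as X-Robots-Tag, and the issue labels are made up for this example.

```python
from html.parser import HTMLParser


class _HeadSignals(HTMLParser):
    """Records the canonical href and whether a robots meta tag says noindex."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.noindex = "noindex" in (attr.get("content") or "").lower()


def canonical_conflicts(page_url: str, html: str, in_sitemap: bool) -> list[str]:
    """List conflicting canonical, noindex, and sitemap signals for one page."""
    signals = _HeadSignals()
    signals.feed(html)
    issues = []
    if signals.canonical is None:
        issues.append("missing canonical")
    elif signals.canonical != page_url and in_sitemap:
        issues.append("in sitemap but canonical points elsewhere")
    if signals.noindex and in_sitemap:
        issues.append("noindex page listed in sitemap")
    if signals.noindex and signals.canonical not in (None, page_url):
        issues.append("noindex combined with canonical to another URL")
    return issues
```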

6. Redirect Chain Audit

Zero redirect chains of length 3 or more.
Zero redirect loops.
301 for permanent moves, 302 only for temporary.
Each hop costs crawl budget. AI bots typically give up after 2 hops.
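
Chains and loops are easy to find once you have a redirect map. The sketch below assumes a dict of source URL to redirect target, which you would build from a crawl export. The function name and the sample paths are hypothetical.

```python
def trace_redirects(
    start: str, redirects: dict[str, str], max_hops: int = 10
) -> tuple[list[str], bool]:
    """Follow start through the redirect map.

    Returns (chain of URLs visited, True if a loop was detected).
    The number of hops is len(chain) - 1.
    """
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            return chain, True
        seen.add(current)
    return chain, False


# Hypothetical redirect map: one 3-hop chain and one loop.
REDIRECTS = {"/a": "/b", "/b": "/c", "/c": "/d", "/x": "/y", "/y": "/x"}

for url in ("/a", "/x"):
    chain, looped = trace_redirects(url, REDIRECTS)
    print(url, "hops:", len(chain) - 1, "loop:", looped)
```

Any chain with 3 or more hops fails item 6. The fix is to point the first URL directly at the final destination.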

7. Status Code Review

200 for all canonical pages.
404 for genuinely missing pages, not soft 404s.
No 500s in the last 30 days of Search Console crawl stats.
Audit with Screaming Frog. Anything over 0.5% non-200 on canonical URLs is a yellow flag.

8. Mobile Rendering Parity

Mobile-first indexing has been the default for years. AI engines also prioritize the mobile rendering.
The mobile HTML must contain the same content as desktop. No hidden or stripped sections.
Test with the URL Inspection tool. If desktop renders 2,000 words and mobile renders 800, fix immediately.

Tier 2: Structure and Extractability (7 Items)

AI engines extract scoped answers. Structure is what makes them extractable.

9. Heading Hierarchy

One <h1> per page, matching the page topic.
Logical <h2> and <h3> nesting, no skipped levels.
Headings phrased as scannable answers, not marketing taglines. "What is technical SEO?" beats "The Technical Edge."
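
The first two rules are mechanical and can be linted. A minimal sketch using only the standard library follows. It checks heading order as it appears in the HTML and says nothing about whether the wording is scannable, which still needs a human read.

```python
import re
from html.parser import HTMLParser


class _Headings(HTMLParser):
    """Records heading levels (1 to 6) in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels: list[int] = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))


def heading_issues(html: str) -> list[str]:
    """Flag a missing or duplicate <h1> and any skipped heading level."""
    parser = _Headings()
    parser.feed(html)
    issues = []
    h1_count = parser.levels.count(1)
    if h1_count != 1:
        issues.append(f"expected one <h1>, found {h1_count}")
    for prev, cur in zip(parser.levels, parser.levels[1:]):
        if cur > prev + 1:
            issues.append(f"skipped level: <h{prev}> followed by <h{cur}>")
    return issues
```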

10. Semantic HTML

Use <article>, <section>, <nav>, <aside>, <main>, <header>, <footer>.
Avoid wrapping everything in <div>s with class names. AI crawlers and assistive tech both rely on semantic tags.

11. Schema Markup

Implement at minimum: Organization, WebSite, BreadcrumbList.
Add page-type schema where relevant: Article, Product, FAQPage, HowTo, Service, LocalBusiness.
Validate with the Schema Markup Validator (validator.schema.org) and Google's Rich Results Test.
FAQ schema is the highest-leverage single addition for AI citation pickup.
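
FAQPage markup is simple enough to generate from the question-and-answer pairs already on the page. The sketch below builds the JSON-LD using the schema.org FAQPage, Question, and Answer types. The function name and the sample pair are illustrative assumptions.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


# Hypothetical pair taken from on-page copy.
markup = faq_jsonld([("What is llms.txt?", "A root-level file at /llms.txt.")])
print(markup)
```

Embed the output in a script tag with type application/ld+json, and keep the text identical to the visible Q&A on the page.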

For a deeper structural example, see our FAQ SEO strategy guide.

12. Table of Contents With Anchor Links

Long-form pages (1,500+ words) need a TOC.
Each section heading needs an id attribute matching the TOC anchor.
This lets AI engines link to specific section answers, not just the page URL.
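
Matching ids and TOC anchors are easiest to guarantee when both come from the same function. This is a sketch of one possible slug scheme (lowercase, hyphen-separated, numeric suffix for repeats). Your CMS may already have its own scheme, in which case use that instead.

```python
import re


def slugify(heading: str) -> str:
    """Lowercase the heading and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", heading.lower())
    return slug.strip("-")


def build_toc(headings: list[str]) -> list[tuple[str, str]]:
    """Return (heading text, anchor id) pairs, de-duplicating repeated slugs."""
    toc = []
    used: dict[str, int] = {}
    for text in headings:
        base = slugify(text)
        count = used.get(base, 0)
        used[base] = count + 1
        anchor = base if count == 0 else f"{base}-{count + 1}"
        toc.append((text, anchor))
    return toc
```

Use the anchor both as the heading's id attribute and as the TOC link target (prefixed with #).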

13. Descriptive Anchor Text

Internal links use 3 to 6 word descriptive anchors.
No "click here", "learn more", "this post".
Anchor text trains AI engines on what each linked page covers.
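
These rules are simple to lint across a crawl export of internal links. A minimal sketch follows. The banned list contains only the three phrases named above, and the return messages are made up for this example.

```python
# The three generic anchors called out in item 13.
GENERIC_ANCHORS = {"click here", "learn more", "this post"}


def anchor_issue(text: str):
    """Return a short problem description, or None if the anchor passes."""
    normalized = " ".join(text.lower().split())
    if normalized in GENERIC_ANCHORS:
        return "generic anchor"
    word_count = len(normalized.split())
    if not 3 <= word_count <= 6:
        return f"{word_count} words, target is 3 to 6"
    return None
```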

14. Internal Link Graph Density

Every important page should be reachable in 3 clicks from the homepage.
Pillar pages should have 10+ internal links pointing to them from related content.
Use Screaming Frog's "internal link count" column to find orphans and under-linked priorities.
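
Click depth is a breadth-first search over the internal link graph. The sketch below assumes you have the graph as a dict of page to outgoing internal links, for example built from a crawl export. The function names and the sample graph are hypothetical.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str = "/") -> dict[str, int]:
    """Breadth-first search from home; returns the minimum clicks to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit_link_graph(
    links: dict[str, list[str]], home: str = "/", max_depth: int = 3
) -> tuple[list[str], list[str]]:
    """Return (orphans unreachable from home, pages deeper than max_depth)."""
    depths = click_depths(links, home)
    all_pages = set(links) | {target for targets in links.values() for target in targets}
    orphans = sorted(all_pages - set(depths))
    too_deep = sorted(page for page, depth in depths.items() if depth > max_depth)
    return orphans, too_deep


# Hypothetical site graph: /d sits 4 clicks deep and /orphan has no inbound links.
LINKS = {"/": ["/a"], "/a": ["/b"], "/b": ["/c"], "/c": ["/d"], "/orphan": ["/"]}
print(audit_link_graph(LINKS))
```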

15. Citation-Worthy Data Structures

Lists, tables, definition pairs, and short paragraph answers get extracted more often than prose.
For each priority page, identify the 3 facts most worth citing and present them in a list or table near the top.
AI Overviews pull from bullet lists and tables disproportionately compared to running prose.

Tier 3: Performance and Signals (7 Items)

Slow pages get partially rendered or skipped entirely by AI crawlers with tight time budgets.

16. LCP Under 2.5 Seconds

Largest Contentful Paint, measured at the 75th percentile of real users[2].
Check in PageSpeed Insights and Search Console's Core Web Vitals report.
Fix order: optimize hero image, preload critical resources, reduce render-blocking JS.

17. INP Under 200 Milliseconds

Interaction to Next Paint replaced First Input Delay in March 2024[2].
Measure with Chrome's Performance panel and the web-vitals JS library.
Fix order: break up long tasks, defer non-critical JS, audit third-party tags.

18. CLS Under 0.1

Cumulative Layout Shift, also at the 75th percentile.
Reserve dimensions for images, ads, and dynamic content.
Use width and height attributes on every image.
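
Items 16 to 18 all use the same pass rule: take the 75th percentile of real-user samples and compare it to the threshold. The sketch below shows that arithmetic with the nearest-rank percentile method, which is one of several valid definitions and is an assumption here. The comparison is inclusive because Google's published "good" bands include the threshold value itself.

```python
import math

# "Good" thresholds from items 16 to 18: LCP in seconds, INP in ms, CLS unitless.
THRESHOLDS = {"LCP": 2.5, "INP": 200, "CLS": 0.1}


def p75(samples: list[float]) -> float:
    """75th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def passes(metric: str, samples: list[float]) -> bool:
    """True if the 75th-percentile value is within the metric's threshold."""
    return p75(samples) <= THRESHOLDS[metric]
```

With four LCP samples of 1.0, 2.0, 2.4, and 6.0 seconds, the 75th percentile is 2.4, so the page passes even though one visit was very slow.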

19. Image Optimization

Modern formats (WebP or AVIF) for all hero and content images.
Responsive images with srcset and sizes.
Compress to 80% quality. Target under 200KB for hero, under 100KB for inline.

20. Lazy-Loading Rules

loading="lazy" on below-the-fold images.
Do not lazy-load the LCP image. This is the most common mistake.
Test the LCP element in PageSpeed Insights to confirm it loads eagerly.

21. HTTPS and HTTP/2 or HTTP/3

TLS 1.2 or higher.
HTTP/2 minimum for request multiplexing. HTTP/3 preferred, since QUIC avoids transport-level head-of-line blocking and speeds up connection setup.
Verify in browser dev tools under the Network tab.

22. Server Response Time (TTFB)

Time to First Byte under 600ms at the 75th percentile.
If TTFB is high, the rest of the performance stack cannot compensate.
Fix order: enable a CDN, cache HTML at the edge, audit slow database queries.

For the dedicated Core Web Vitals fix sequence on a Next.js or WordPress stack, see our SEO audit service.

The 30-Day Fix Order

Do not work the list top-to-bottom. Work it by impact and effort.

Week 1: Diagnostic and Quick Wins

Run a full Screaming Frog crawl.
Run PageSpeed Insights on the top 10 pages.
Pull Core Web Vitals report from Search Console.
Fix: robots.txt blocks on AI crawlers (item 1), missing sitemap entries (item 2), canonical conflicts (item 5).
These three alone often unblock 20 to 40% of indexation issues.

Week 2: Schema and Structure

Implement Organization, WebSite, and BreadcrumbList schema sitewide (item 11).
Add FAQPage schema to your top 10 pages with Q&A content.
Audit heading hierarchy on top 20 pages (item 9).
Add id anchors and TOC to long-form pages (item 12).

Week 3: Performance

Optimize the LCP image on the top 20 pages (item 16, item 19).
Audit lazy-loading and remove loading="lazy" from any LCP element (item 20).
Audit INP on the top 20 pages, defer non-critical JS (item 17).
Enable HTTP/2 or HTTP/3 if not already on (item 21).

Week 4: Rendering and Crawlability

Audit JavaScript rendering on top 20 pages, move critical content server-side where it is client-side (item 3).
Add /llms.txt (item 4).
Fix redirect chains (item 6).
Run a re-crawl and compare deltas.

Re-measure at day 60 and day 90. The pages that received Tier 1 and Tier 2 fixes should show citation pickup in AI Overviews within 60 days if the content itself is citable.

How to Sequence Larger Programs

If this list reads like a quarter of work, that is because it is. The default failure pattern is to fix items 9, 11, and 16 on the homepage and call it done. The actual lift comes from applying the full 22-item list across your top 50 pages, then your top 200, then sitewide.

For brands without a technical team that can execute this in 30 days, we run the audit and the fixes under one engagement. Our managed growth service covers technical SEO execution as part of the recurring scope, not as a one-off project handed to your engineers. For deeper context on how AI search differs from classic SEO, see our AEO vs SEO vs GEO breakdown.

A technical SEO checklist for AI search in 2026 is not optional infrastructure. It is the floor that decides whether your content has a chance at citation. Build the floor, then earn the citations.