I Scored 16,000 Businesses for AI Readiness

We built a crawler that scans business websites for 13 AI discoverability signals and scored 16,000 businesses across Western North Carolina. The results were worse than we expected.

It started with a simple question: if someone asks ChatGPT for a plumber in Asheville, how does ChatGPT decide which plumber to recommend?

The answer turned out to be complicated. AI systems look for structured data, machine-readable signals, trust indicators -- dozens of things that most business owners have never heard of. And there was no way for a business owner to check how they scored. No audit tool. No "AI readiness" equivalent of Google's PageSpeed Insights.

So we built one.

What BluePages Is

BluePages is a free tool that scores businesses on 13 specific signals that affect how AI systems find and recommend them. We crawled 16,000 businesses across 150 cities in Western North Carolina, scored each one, and put the results on a public site where anyone can look up their business.

4,504 of those businesses have full scores based on live website data. The rest -- around 11,500 -- don't have websites at all. They exist as Google Business listings and nothing more.

That ratio alone tells a story. Roughly 70% of the businesses in Western NC don't have a website for AI to read.

How the Crawler Works

The scoring engine is a Python crawler backed by a SQLite database. For every business, we check 13 signals across four categories: presence, security, marketing, and technical infrastructure.

Some signals are binary -- you have SSL or you don't. Some are weighted -- having a website is worth more points than having analytics installed, because without a website nothing else matters. The 13 signals roll up into a score from 0 to 100.

Here's how they break down: presence (website, Google Business Profile, social media), security (SSL, email authentication), marketing (analytics, mobile responsiveness, contact info, content freshness), and technical infrastructure (structured data, page speed, CRM indicators, technology stack).

Each signal has a weight. A website is the foundation — without it, nothing else can score. Structured data is heavily weighted because it's the single biggest factor in whether AI can understand what your business does. We break down all 13 signals and what to do about each one in Your Business Is Invisible to AI — Here's the Fix.
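For anyone who wants to see the roll-up concretely, here is a minimal sketch of a weighted score. The weight values are made up for illustration (we are not publishing the real ones here), and the `SITE_INDEPENDENT` set is an assumption about which signals can register without a website.

```python
# Illustrative weights only: these are NOT the real BluePages weights.
# They sum to 100 so the total reads directly as a 0-100 score.
WEIGHTS = {
    "website": 20,
    "google_business_profile": 8,
    "social_media": 5,
    "ssl": 8,
    "email_authentication": 8,
    "analytics": 4,
    "mobile_responsive": 7,
    "contact_info": 6,
    "content_freshness": 7,
    "structured_data": 15,
    "page_speed": 6,
    "crm_indicators": 3,
    "technology_stack": 3,
}

# Assumption: these two signals can exist without a website.
SITE_INDEPENDENT = {"google_business_profile", "social_media"}


def score(signals: dict) -> int:
    """Sum the weights of the signals that are present.

    Every signal that depends on a website counts as missing
    when the business has no website.
    """
    has_site = bool(signals.get("website", False))
    total = 0
    for name, weight in WEIGHTS.items():
        present = bool(signals.get(name, False))
        if name not in SITE_INDEPENDENT and not has_site:
            present = False
        if present:
            total += weight
    return total


if __name__ == "__main__":
    typical = {
        "website": True,
        "ssl": True,
        "contact_info": True,
        "google_business_profile": True,
    }
    print(score(typical))
```

A listing-only business can still pick up the Google Business Profile points in this sketch, but an SSL flag with no website behind it counts for nothing.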

What the Data Actually Shows

The average score across all 4,504 scored businesses is around 40 out of 100.

That's not great. It means the typical business in Western NC is earning fewer than half the points available across the signals AI systems use to decide who to recommend. But averages hide the real story. Here's what we found when we dug into the data.

Most businesses cluster between 25 and 50. They have a website. It loads. It's got SSL because their hosting provider turned it on automatically. Maybe there's a phone number on the contact page. That's about it. No structured data, no email authentication, no sign that anyone has touched the site in the last year.

A small group scores above 70. These tend to be businesses with someone technical managing their web presence -- or businesses that hired a competent web developer and gave them the budget to do things right. They've got JSON-LD schema. Their email infrastructure is configured. Their site loads fast on mobile. They didn't do these things for AI. They did them because they're good web practices. But now those practices carry a second benefit.

The roughly 11,500 with no website sit at the very bottom. They might have a Google Business listing with some reviews and hours, but that's it. With no site to crawl, most of the 13 signals can't register at all, and an AI system trying to learn about these businesses has almost nothing to work with. No structured data. No content to read. No signals to evaluate.

The gap between the top and bottom is enormous. A business scoring 80 is sending clear, consistent, machine-readable signals across every category. A business scoring 15 is essentially invisible. And most businesses are closer to 15 than to 80.

The Signals Nobody Has

Some signals are almost universally missing.

Structured data (JSON-LD). This is a block of code -- invisible to visitors -- that tells AI systems exactly what your business is, where it is, what services you offer, and how to contact you. Without it, AI has to scrape your website and guess. With it, you're handing the AI an organized file. Almost nobody has it. The businesses that do score dramatically higher.
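To make that concrete, here is a small sketch that builds a JSON-LD block using the schema.org `LocalBusiness` and `PostalAddress` types. The helper name `local_business_jsonld` and the business details are placeholders, not a real listing, and a real site would likely include more properties (hours, services, geo coordinates).

```python
import json


def local_business_jsonld(name, telephone, url, street, city, region, postal_code):
    """Return a <script> tag holding a schema.org LocalBusiness description."""
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "telephone": telephone,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Placeholder business details for illustration only.
    print(
        local_business_jsonld(
            name="Example Plumbing Co.",
            telephone="+1-828-555-0100",
            url="https://www.example.com",
            street="123 Example St",
            city="Asheville",
            region="NC",
            postal_code="28801",
        )
    )
```

The output goes in the page's HTML, usually in the head. Visitors never see it, but anything parsing the page gets the business name, phone, and address without guessing.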

Email authentication. SPF, DKIM, and DMARC records. Three DNS entries that take 30 minutes to set up and cost nothing. They prove your business's email domain is legitimate. They also keep your emails out of spam folders. Very few small businesses have all three configured.
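Here is a rough sketch of how a check for all three could look. It is not our crawler's code. To keep it dependency-free, the DNS lookup itself is left out: the function takes TXT records that have already been fetched, keyed by DNS name. The DKIM selector is also an assumption, since selectors vary by email provider and can't be discovered from the domain alone.

```python
def email_auth_status(domain, txt_records, dkim_selector="default"):
    """Report which of SPF, DKIM, and DMARC appear in pre-fetched TXT records.

    txt_records maps a DNS name to the list of TXT strings found there.
    dkim_selector is a guess: real selectors depend on the mail provider.
    """

    def starts_with(name, prefix):
        return any(
            record.strip().lower().startswith(prefix)
            for record in txt_records.get(name, [])
        )

    def has_dkim_key(name):
        # The v= tag is optional in DKIM key records, so also accept
        # any record that carries a p= (public key) tag.
        for record in txt_records.get(name, []):
            tags = [tag.strip().lower() for tag in record.split(";")]
            if any(t.startswith("v=dkim1") or t.startswith("p=") for t in tags):
                return True
        return False

    return {
        "spf": starts_with(domain, "v=spf1"),
        "dkim": has_dkim_key(dkim_selector + "._domainkey." + domain),
        "dmarc": starts_with("_dmarc." + domain, "v=dmarc1"),
    }


if __name__ == "__main__":
    # Made-up records for a placeholder domain.
    records = {
        "example.com": ["v=spf1 include:_spf.example.net ~all"],
        "_dmarc.example.com": ["v=DMARC1; p=none"],
    }
    print(email_auth_status("example.com", records))
```

In this example the domain has SPF and DMARC but no DKIM key at the guessed selector, which is a common partial setup.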

Content freshness. A lot of sites haven't been updated in years. The copyright says 2023. The blog's last post is from 2021. The "latest news" section is three items from two years ago. To an AI system, this looks like the business might be closed.
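One cheap way to estimate freshness is to read the copyright year out of the page. The sketch below does only that. It is a simplified stand-in, not the full freshness check, and the one-year threshold is an assumption chosen for the example.

```python
import re
from datetime import date
from typing import Optional

# Matches "(c) symbol", "&copy;", or the word "copyright", followed shortly
# by a year and optionally a year range such as 2019-2023.
COPYRIGHT_YEAR = re.compile(
    r"(?:\u00a9|&copy;|copyright)\D{0,20}?((?:19|20)\d{2})"
    r"(?:\s*[-\u2013]\s*((?:19|20)\d{2}))?",
    re.IGNORECASE,
)


def latest_copyright_year(html: str) -> Optional[int]:
    """Return the most recent copyright year found in the page, if any."""
    years = []
    for match in COPYRIGHT_YEAR.finditer(html):
        years.extend(int(y) for y in match.groups() if y)
    return max(years) if years else None


def looks_stale(html: str, today: Optional[date] = None, max_age_years: int = 1) -> Optional[bool]:
    """True if the newest copyright year is more than max_age_years old.

    Returns None when no copyright year is found, since absence proves nothing.
    """
    today = today or date.today()
    year = latest_copyright_year(html)
    if year is None:
        return None
    return today.year - year > max_age_years


if __name__ == "__main__":
    footer = "<footer>&copy; 2019-2023 Example Co. All rights reserved.</footer>"
    print(latest_copyright_year(footer), looks_stale(footer))
```

A copyright year is a weak signal on its own, because some sites update it automatically and others never show one. Blog post dates or sitemap timestamps would be stronger evidence.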

These three signals -- structured data, email authentication, and fresh content -- represent the biggest opportunity. They're free or close to free. They take a few hours to set up. And they move the needle more than almost anything else.

The Tech Behind It

For the technically curious: the stack is Python for the crawler, SQLite for the database, Next.js for the frontend, Vercel for hosting, and a Lightsail API server for the scoring engine.

The crawler runs in batches. We started with 17 cities and 1,166 scored businesses. Then we expanded -- batch by batch -- until we hit 150 cities and 4,504 scored businesses across all of Western North Carolina. The full database has over 16,000 entries including the unscored businesses that are Google-only listings.

We built a scheduled recrawl system so scores stay current. Businesses get rescanned weekly. If someone improves their site, their score updates automatically.
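A weekly recrawl boils down to one query: find every business with a website whose last crawl is more than seven days old. Here is a sketch of that idea against SQLite. The table and column names (`businesses`, `website`, `last_crawled`) are hypothetical, not our actual schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema for illustration; timestamps are stored as
# ISO 8601 UTC strings so they sort correctly as text.
SCHEMA = """
CREATE TABLE businesses (
    id INTEGER PRIMARY KEY,
    website TEXT,
    last_crawled TEXT
)
"""


def due_for_recrawl(conn, now, max_age=timedelta(days=7)):
    """Return ids of businesses with a website that need a fresh crawl.

    Never-crawled rows (NULL last_crawled) come first, then the oldest crawls.
    """
    cutoff = (now - max_age).isoformat()
    rows = conn.execute(
        """
        SELECT id FROM businesses
        WHERE website IS NOT NULL
          AND (last_crawled IS NULL OR last_crawled < ?)
        ORDER BY last_crawled
        """,
        (cutoff,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    now = datetime.now(timezone.utc)
    conn.executemany(
        "INSERT INTO businesses VALUES (?, ?, ?)",
        [
            (1, "https://a.example", (now - timedelta(days=10)).isoformat()),
            (2, "https://b.example", (now - timedelta(days=2)).isoformat()),
            (3, "https://c.example", None),
            (4, None, None),  # listing only, nothing to crawl
        ],
    )
    print(due_for_recrawl(conn, now))
```

Listing-only businesses are skipped because there is no site to rescan. If one of them launches a website later, it would have to be picked up some other way.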

The frontend shows every business's score broken down by signal. You can see exactly which signals you're hitting and which you're missing. City pages show aggregate data -- how your city compares to others. There's no paywall, no "enter your email to see your score" gate. The data is just there.

What We Got Wrong (So Far)

Honest assessment: the scoring model is version one. We know it has gaps.

Signal weighting is subjective. We weighted the signals based on our understanding of what AI systems prioritize. But different AI systems -- ChatGPT, Claude, Gemini, Perplexity -- probably weight signals differently. We don't have ground truth data on exactly how much JSON-LD matters versus email authentication versus content freshness. Our weights are informed guesses.

Some signals are proxies. Detecting a CRM from page source code is imprecise. A business might use HubSpot but load it asynchronously in a way our crawler doesn't see. We might miss analytics that's server-side only. The technical signals have false negatives.
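To show where those false negatives come from, here is a simplified sketch of source-based detection. The signature list is a small illustrative sample, not our real one. It matches script URLs in the raw HTML, so anything injected later by JavaScript never shows up.

```python
import re

# Illustrative signatures only; a real list would be much longer.
SIGNATURES = {
    "hubspot": re.compile(r"js\.hs-scripts\.com|js\.hsforms\.net", re.IGNORECASE),
    "google_analytics": re.compile(
        r"googletagmanager\.com/gtag/js|google-analytics\.com/analytics\.js",
        re.IGNORECASE,
    ),
}


def detect_tools(html: str) -> set:
    """Return the names of tools whose script URLs appear in the raw HTML."""
    return {name for name, pattern in SIGNATURES.items() if pattern.search(html)}


if __name__ == "__main__":
    # HubSpot embedded directly in the page source: detected.
    direct = '<script src="//js.hs-scripts.com/123456.js"></script>'
    print(detect_tools(direct))

    # Only a tag manager container in the source. If HubSpot is loaded
    # by that container at runtime, a source scan cannot see it.
    via_tag_manager = (
        '<script src="https://www.googletagmanager.com/gtm.js?id=GTM-XXXX"></script>'
    )
    print(detect_tools(via_tag_manager))
```

The second page could be running both a CRM and analytics, yet the scan returns nothing. Catching those cases would mean rendering the page in a headless browser, which is slower and more expensive per site.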

WNC only. We haven't expanded beyond Western North Carolina. The data is hyper-local. That's useful if you're a business in Asheville or Hendersonville or Brevard. It's not useful if you're in Charlotte or Atlanta or anywhere else. Expanding is on the roadmap, but we wanted to go deep in one region rather than shallow across the country.

Scoring doesn't equal recommendation. A high BluePages score doesn't guarantee ChatGPT will recommend your business. It means your business is sending the right signals. But AI recommendations depend on dozens of other factors -- reviews, citations, the specific question someone asks, the model's training data. BluePages measures readiness, not outcomes.

We'll keep improving the model, starting with the weights.

What Comes Next

We're improving the model. Version two will add signal-specific weights based on actual AI recommendation testing — asking AI systems about real businesses and correlating the recommendations with signal data to see what actually moves the needle. Right now our weights are informed guesses. We want ground truth.
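The correlation step could be as simple as the sketch below: for each business, record whether a signal is present (1 or 0) and whether an AI system recommended it (1 or 0), then compute a Pearson correlation. This is one possible approach, not a description of a finished pipeline, and the numbers in the example are placeholders, not results.

```python
from math import sqrt
from typing import Optional, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> Optional[float]:
    """Pearson correlation of two equal-length sequences.

    Returns None when either sequence has no variation, because the
    correlation is undefined in that case.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values for eight hypothetical businesses, NOT real data.
    has_structured_data = [1, 0, 0, 1, 0, 1, 0, 0]
    was_recommended = [1, 0, 0, 1, 0, 0, 0, 1]
    print(pearson(has_structured_data, was_recommended))
```

Run per signal, a comparison like this would give a rough ranking of which signals track with recommendations. Correlation alone won't prove a signal causes a recommendation, so it would only be a starting point for setting weights.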

We're also expanding the crawler. 150 cities in Western NC is a start. The architecture supports any region — we just need to point it somewhere new.

The average score of 40 tells us something important: the bar is on the floor. For the full breakdown of what each signal means and how to fix yours, read Your Business Is Invisible to AI — Here's the Fix.

Look up your business on BluePages — it's free. If the score is lower than you'd like, we can help.
