I Scanned 350 B2B Companies This Month. Here's the Pipeline.

This month I ran 350 new AI visibility scans for the May 2026 benchmark.
→ 4 AI platforms per company
→ 4 dimensions scored per company
→ ~$30 in API spend, end-to-end
→ ~4 hours of compute, mostly idle
The pipeline runs as one Python orchestrator on a laptop. No infrastructure. No platform fees. No SaaS subscription replacing what an API call already does.
This is the build log.
The Problem
If you want a benchmark, you have two options:
→ Pay a tool. Most AI visibility platforms charge per-prompt or per-tracked-brand, which scales linearly. 350 companies × 20 prompts × 4 platforms is north of $1,000/month on most pricing pages.
→ Build the pipeline. APIs are cheap. The expensive parts of these tools are dashboards, not data.
I built the pipeline. The dashboard layer is Citation Scope.
The market context: organisations with automated competitive intelligence report 85–95% reduction in manual research time. Sales teams currently spend 8–12 hours per rep per month on competitor research. That is the cost ceiling. Anything above zero hours is automation overhead worth removing.
The Architecture
INPUT (CSV)
│
▼
ai_presence_scanner.py
│
├──► OpenAI (gpt-5)
├──► Gemini (2.5 pro)
├──► Brave Search API
└──► Tavily Search API
│
▼
SCORE (4 dimensions × 4 platforms)
│
▼
Supabase (Postgres)
│
▼
score.gtmsignalstudio.com (Next.js, public)
Six moving parts. Each is replaceable.
Step 1 — The Input Layer
A CSV with company_name, domain, category_keyword. That is the entire input contract.
For the benchmark, the CSV comes from public sources: Companies House filings, LinkedIn search, sector lists. For client work, the CSV comes from a target list.
Keeping the input as a flat CSV means the same scanner works for a benchmark of 500 companies and a single ad-hoc lookup. No schema migration to add a row.
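To make that contract concrete, here is a minimal loader sketch. The file name, validation, and example row are illustrative, not lifted from the actual scanner.

```python
import csv
from pathlib import Path

REQUIRED_COLUMNS = {"company_name", "domain", "category_keyword"}

def load_companies(path: Path) -> list[dict]:
    """Read the flat CSV input contract and fail loudly if a column is missing."""
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"Input CSV is missing columns: {missing}")
        return list(reader)

# Hypothetical file name; each row comes back as a dict like
# {"company_name": "Acme Ltd", "domain": "acme.co.uk", "category_keyword": "expense management"}
companies = load_companies(Path("benchmark_input.csv"))
```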
Step 2 — The Scanner
ai_presence_scanner.py is the core. For each company:
→ Generate a buyer-intent prompt set from the category keyword
→ Send each prompt to four APIs in parallel
→ Parse the response for brand mentions, citations, entity description
→ Score the four dimensions
→ Persist the raw response (audit trail) and the score (analytics)
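A minimal sketch of that loop, assuming placeholder per-platform callers in place of the real SDK and HTTP calls; parsing, scoring, and persistence are elided, so only the fan-out structure reflects the steps above.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder callers: in the real scanner each wraps an SDK or HTTP call.
def query_openai(prompt: str) -> str: return f"[openai] {prompt}"
def query_gemini(prompt: str) -> str: return f"[gemini] {prompt}"
def query_brave(prompt: str) -> str:  return f"[brave] {prompt}"
def query_tavily(prompt: str) -> str: return f"[tavily] {prompt}"

PLATFORMS = {"openai": query_openai, "gemini": query_gemini,
             "brave": query_brave, "tavily": query_tavily}

def scan_company(company: dict, prompts: list[str]) -> dict:
    """Send every prompt to all four platforms in parallel; keep raw responses."""
    raw = {name: [] for name in PLATFORMS}
    with ThreadPoolExecutor(max_workers=len(PLATFORMS)) as pool:
        for prompt in prompts:
            futures = {name: pool.submit(fn, prompt) for name, fn in PLATFORMS.items()}
            for name, fut in futures.items():
                raw[name].append(fut.result())  # raw response -> audit trail
    # Parsing (brand mentions, citations, entity description) and the
    # four-dimension scoring would run over `raw` here before persisting.
    return raw
```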
Multi-API matters. Only 6% of brands in our April benchmark appeared on all four platforms. A single-platform scan produces unreliable signal — you might miss the platform where your buyer actually is.
Cost breakdown for one company across four platforms with ~20 prompts each:
| Platform | Cost per scan |
|---|---|
| OpenAI (gpt-5-mini) | ~$0.04 |
| Gemini (2.5 pro) | ~$0.03 |
| Brave Search | ~$0.01 |
| Tavily | ~$0.01 |
| Total per company | ~$0.09 |
500 companies × $0.09 = $45 for a full benchmark edition. In practice closer to $30 because of caching and prompt deduplication.
Step 3 — The Resume Layer
Most batch scans fail at scale. Rate limits. Network blips. One platform returning a 500. The naive approach is to crash and lose the run.
run_monthly.py solves this by making every scan idempotent.
→ Each company is a row in a queue table
→ Status: pending → in_flight → done or error
→ A failed scan stays in error with the exception
→ Restarting the orchestrator picks up only pending rows
→ Re-running specific platforms is a one-line filter
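Here is a sketch of that resume loop. It uses sqlite as a stand-in for the Postgres queue table, and the column names are illustrative; the point is the status transitions, not the schema.

```python
import sqlite3

# sqlite stands in for the Postgres queue table; the resume logic is identical.
db = sqlite3.connect("scan_queue.db")
db.execute("""CREATE TABLE IF NOT EXISTS scan_queue (
    company_name TEXT PRIMARY KEY,
    status       TEXT NOT NULL DEFAULT 'pending',  -- pending | in_flight | done | error
    last_error   TEXT
)""")

def run_queue(scan_fn) -> None:
    """Process pending rows only, so a restart resumes where the last run stopped."""
    while True:
        row = db.execute(
            "SELECT company_name FROM scan_queue WHERE status = 'pending' LIMIT 1"
        ).fetchone()
        if row is None:
            break
        name = row[0]
        db.execute("UPDATE scan_queue SET status = 'in_flight' WHERE company_name = ?", (name,))
        db.commit()
        try:
            scan_fn(name)
            db.execute("UPDATE scan_queue SET status = 'done' WHERE company_name = ?", (name,))
        except Exception as exc:
            # The failed row keeps its exception; re-running touches only pending rows.
            db.execute(
                "UPDATE scan_queue SET status = 'error', last_error = ? WHERE company_name = ?",
                (str(exc), name),
            )
        db.commit()
```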
This is unglamorous and worth more than any other part of the system. A 500-company scan is a 4-hour wall-clock job. Without resume, every transient error costs you the whole run.
Step 4 — Storage and Surfacing
Raw responses go to Supabase Postgres. One table per platform, one table for composite scores. Public benchmark data is exposed read-only to the score.gtmsignalstudio.com frontend, which is a Next.js app on Vercel.
The split between scanner and frontend is deliberate. The scanner can change weekly — new platforms, new prompts, new scoring weights — without breaking the public surface.
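A minimal persistence sketch using the supabase-py client is below. The table names, column names, and environment variable names are assumptions; the real schema has one raw-response table per platform plus a composite score table, as described above.

```python
import os
from supabase import create_client

# Assumed environment variable names; any Supabase project URL and service key work.
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_KEY"])

def persist_scan(company: dict, platform: str, raw_response: str, score: dict) -> None:
    """Write the raw response (audit trail) and the score row (analytics)."""
    supabase.table(f"raw_{platform}").insert({
        "company_name": company["company_name"],
        "domain": company["domain"],
        "response": raw_response,
    }).execute()
    supabase.table("composite_scores").insert({
        "company_name": company["company_name"],
        "platform": platform,
        **score,  # one key per scored dimension
    }).execute()
```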
What the Pipeline Does Not Do
→ It does not lift data from gated tools (Otterly, Profound, Brandlight). The scanner uses public APIs and parses public responses. Output is reproducible by anyone with the same keys.
→ It does not replace human review for client deliverables. It generates the score; the brief still gets human eyes before sending.
→ It does not do "AI visibility optimisation". That is a separate problem requiring content, entity, and PR work. The pipeline measures. It does not fix.
What I Would Build Next
Three things on the backlog:
→ Trend tracking. Re-scan the same company monthly, expose the delta. Currently the dataset supports this; the surface does not.
→ Cohort comparisons. Compare a target list against an industry baseline. Useful for sales conversations.
→ Citation breadth deep-dive. When a brand is cited, log every source URL. Build a backlink graph for AI specifically.
The General Lesson
If a tool's pricing scales linearly with rows scanned, and the underlying API does not, you can usually rebuild it for the cost of the API plus a weekend.
The May benchmark drops the week of 4 May. 500 companies. Same pipeline.