Build Note · AI Visibility & RAG

Building DigitalScore: from AI visibility data to a live RAG demo

The commercial blueprint: Off-the-shelf semantic RAG apps often fail on questions that need precise answers from structured enterprise data. This demo showcases a custom routed hybrid QA engine designed to track AEO/GEO (answer engine optimisation and generative engine optimisation) visibility. An upfront intent classifier sends structured-data questions to validated evidence tables instead of fuzzy vector search. The result is a more deterministic, evidence-backed reporting layer for B2B SaaS and enterprise brands, one that reduces hallucination risk and makes answers traceable to source data.

I recently built and deployed a live AI visibility intelligence demo for the CRM software market. The goal was to move beyond a static SEO audit or slide deck and build a workflow that could collect, structure, analyse and query visibility data across both Google and AI answer environments.

Live demo: https://demo.digitalscore.co.uk

Main site: https://digitalscore.co.uk

What I built

DigitalScore analyses how brands appear across search, AI answers and third-party influence sources. For this first demo, I focused on the CRM software market.

The workflow brings together Google search results, LLM citation data, third-party comparison sites, review and directory sources, brand mentions, crawled page content, sentiment evidence and URL-level overlap data.

The output is not just a dashboard or chatbot. It is an evidence-backed workflow designed to answer commercial questions such as:

Which URLs appear in both Google results and LLM citation data?
Which third-party domains matter most for AI visibility?
What does the data say about a specific brand like HubSpot?
Which CRM brands show mixed or negative sentiment?
Which sources should be prioritised for visibility, reputation or outreach work?

Why I built it

Search behaviour is changing. Brands are no longer only competing for rankings on their own website. They are being described, compared, cited and summarised across AI answers, third-party review sites, comparison pages, communities, publisher lists and crawled content.

That means visibility work needs to look beyond traditional SEO data. The question is no longer only: “Where do we rank?” It is also: “Where are we cited?”, “Who influences the answer?”, “Which third-party sources shape perception?” and “Where are we absent, misrepresented or weak compared to competitors?”

The data workflow

The project started with market data collection and analysis. I collected and structured data across SERPs, LLM citations, third-party sources, brand mentions, crawled content, sentiment evidence and URL overlap.

I then created Python workflows to clean, combine and analyse the data into usable evidence tables. These included SERP and LLM URL overlap, AI citation summaries, brand sentiment summaries, source priority reports, brand gap analysis, recommendation outputs, crawled page evidence and QA-ready evidence files.
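
As an illustration of the SERP and LLM URL overlap step, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the function names normalise_url and url_overlap are hypothetical, the normalisation rules (drop scheme, "www.", query string and trailing slash) are an assumption about what counts as "the same URL", and the example URLs are placeholders.

```python
from urllib.parse import urlsplit


def normalise_url(url: str) -> str:
    """Reduce an absolute URL to host + path so trivially different forms compare equal.

    Assumption: scheme, "www.", query string, fragment and trailing slash
    are ignored when deciding whether two URLs are the same page.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return f"{host}{path}"


def url_overlap(serp_urls, llm_urls):
    """Return the sorted normalised URLs present in both collections."""
    serp = {normalise_url(u) for u in serp_urls}
    llm = {normalise_url(u) for u in llm_urls}
    return sorted(serp & llm)


if __name__ == "__main__":
    # Placeholder URLs, not real project data.
    serp = ["https://www.example.com/crm/", "https://example.org/a"]
    llm = ["http://example.com/crm?utm=1", "https://example.net/b"]
    print(url_overlap(serp, llm))
```

Normalising before intersecting matters because the same page is often recorded with different schemes, tracking parameters or trailing slashes in SERP and citation exports. Without that step the overlap would be understated.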

The first output was a client-facing AI visibility deck showing where CRM brands appear, where they disappear, and which third-party sources influence visibility across search and AI.

The RAG and routed QA layer

After building the analysis deck, I extended the project into a live queryable demo. The demo uses a routed QA approach. Instead of sending every question through generic retrieval, structured questions are routed to the correct evidence table first.

SERP/LLM overlap questions use the overlap file.
Brand questions use brand sentiment and mention evidence.
Mixed or negative sentiment questions use sentiment evidence.
Source priority questions use the source priority report.
Broader questions can fall back to the RAG layer.

This matters because not every business question should be answered by semantic search alone. Some questions need the exact table, exact filter and exact evidence source.
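
The routing idea above can be sketched in a few lines of Python. This is a simplified stand-in, not the demo's actual classifier: it uses plain keyword rules, and the route names, keyword lists, brand list and evidence file names are all hypothetical.

```python
# Hypothetical route names mapped to hypothetical evidence file names.
EVIDENCE_FILES = {
    "overlap": "serp_llm_overlap.csv",
    "sentiment": "brand_sentiment.csv",
    "source_priority": "source_priority.csv",
    "brand": "brand_evidence.csv",
}

# Keyword rules are checked in order; the first matching route wins.
ROUTES = [
    ("overlap", ("overlap", "appear in both")),
    ("sentiment", ("sentiment", "negative", "mixed")),
    ("source_priority", ("priorit", "outreach", "third-party")),
]

# Illustrative brand list; a real one would come from the data.
KNOWN_BRANDS = ("hubspot",)


def classify(question: str) -> str:
    """Pick a route for a question, falling back to 'rag' when no rule matches."""
    q = question.lower()
    for route, keywords in ROUTES:
        if any(keyword in q for keyword in keywords):
            return route
    if any(brand in q for brand in KNOWN_BRANDS):
        return "brand"
    return "rag"


def resolve(question: str):
    """Return (route, evidence file). The file is None for the RAG fallback."""
    route = classify(question)
    return route, EVIDENCE_FILES.get(route)


if __name__ == "__main__":
    for q in (
        "Which URLs appear in both SERP and LLM citation data?",
        "What does the data say about HubSpot?",
        "Which CRM brands have mixed or negative sentiment?",
        "How is search behaviour changing?",
    ):
        print(q, "->", resolve(q))
```

Checking the structured routes before the brand rule is a deliberate choice in this sketch: a question such as "Which CRM brands have mixed or negative sentiment?" should hit the sentiment table even if it also names a brand. Only questions that match no rule reach the retrieval layer.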

Deployment stack

I deployed the demo publicly on my own infrastructure using:

Python
Streamlit
Chroma
GitHub
DigitalOcean VPS
Nginx
Cloudflare DNS
systemd
SSL
CSV-based evidence tables
Routed QA logic
Live QA tracking

Technical problems solved

The deployment surfaced real production issues, not just local development problems. These included:

GitHub private repository authentication and token access.
Separating code deployment from data deployment.
Missing data files on the VPS causing empty answers.
Missing Chroma vector database on the live server.
The live service initially running the wrong app file.
Local versus live data-source mismatches.
Route-level QA problems where the app needed to use the exact validated evidence table.
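
Several of the issues above (missing data files, a missing vector database) fail silently as empty answers. One way to guard against that is a fail-fast check at startup, sketched below. The function names and the paths in REQUIRED_PATHS are hypothetical, not the project's real layout.

```python
from pathlib import Path

# Hypothetical paths, relative to the app's base directory.
REQUIRED_PATHS = [
    "data/serp_llm_overlap.csv",
    "data/brand_sentiment.csv",
    "chroma_db",  # assumed directory holding the persisted vector store
]


def missing_paths(base_dir, required):
    """Return the required relative paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [rel for rel in required if not (base / rel).exists()]


def require_data(base_dir, required=REQUIRED_PATHS):
    """Raise at startup instead of serving empty answers later."""
    missing = missing_paths(base_dir, required)
    if missing:
        raise FileNotFoundError(
            "Missing evidence files or directories: " + ", ".join(missing)
        )


if __name__ == "__main__":
    # Report what is missing relative to the current directory.
    print(missing_paths(".", REQUIRED_PATHS))
```

Calling something like require_data before the app starts serving means a bad deployment shows up as a clear error in the service logs, not as a question that quietly returns nothing.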

Live QA results

The live demo now successfully answers routed questions such as:

Which URLs appear in both SERP and LLM citation data?
What does the data say about HubSpot?
Which CRM brands have mixed or negative sentiment?

The answers return relevant data and supporting evidence tables, making the demo useful for market analysis, AI visibility audits and client-facing discovery.

What I would improve next

This is still a prototype, not a finished product. Next improvements include:

Better UI formatting.
Cleaner evidence summaries.
Deduplication of repeated rows.
More automated QA tests.
Clearer source labels.
More robust route validation.
Category expansion beyond CRM.
A client-facing reporting layer.
More automation across data refresh, crawl updates and recommendations.
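
As a small illustration of the deduplication item above, here is one possible approach in plain Python. It assumes evidence rows are dictionaries (for example, as read by csv.DictReader) and that a repeated row is one sharing the same values in a chosen set of key fields. The function name, field names and sample rows are hypothetical.

```python
def dedupe_rows(rows, key_fields):
    """Keep the first row for each distinct combination of key fields.

    Values are compared case-insensitively with surrounding whitespace
    removed; a missing field is treated as an empty string.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(str(row.get(field, "")).strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


if __name__ == "__main__":
    # Placeholder rows, not real project data.
    rows = [
        {"brand": "ExampleCRM", "url": "https://example.com/a"},
        {"brand": "examplecrm ", "url": "https://example.com/a"},
        {"brand": "ExampleCRM", "url": "https://example.com/b"},
    ]
    print(dedupe_rows(rows, ["brand", "url"]))
```

Keeping the first occurrence preserves the original row order, which helps when the evidence table is already sorted by priority or rank.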

Why this matters

This project demonstrates the full workflow, from data collection to live deployment. It combines SEO, AI visibility, data processing, market intelligence, RAG-style querying and infrastructure deployment.

More importantly, it shows how visibility analysis needs to evolve. Brands need to understand not only where they rank, but where they are cited, compared, excluded, summarised and influenced across AI and third-party sources.

Explore the live demo: https://demo.digitalscore.co.uk

Contact: rossouw.emile@digitalscore.co.uk