Building Intelligent Knowledge Services with Strapi and Multi-Agent AI

A Technical Deep Dive into Transforming Scientific Content into an Accessible AI-Powered Platform

When Leiden University and the Dutch provincial government of Zuid-Holland approached us (Smartshore & Ability) with the challenge of making scientific research on noise pollution accessible to citizens, policymakers and professionals, we knew traditional CMS solutions wouldn't cut it. The Sound Inspiration Guide (https://geluidshinderaanpak.nl/) contained valuable, scientifically validated solutions for urban noise pollution, but was hindered by accessibility barriers, technical jargon, and a static HTML structure.

This is the story of how we used Strapi as the backbone of a multi-agent AI system that transforms inaccessible scientific knowledge into an intelligent, personalised knowledge service. More importantly, it's a blueprint for how headless CMS architecture enables AI applications that would be impossible with traditional, monolithic systems.

The Problem: When Knowledge Becomes a Liability

The original guide faced seven critical barriers:

Content locked behind authentication;
No semantic HTML structure or navigation;
Scientific jargon incomprehensible to practitioners;
Zero WCAG compliance;
No search functionality;
Static content that required a developer for every update;
No context understanding or personalisation.

For a government organisation, these barriers weren't just inconvenient; they translated into measurable costs: high helpdesk pressure, wasted research investments, and missed societal impact. The question became: how do we transform this into a living, intelligent service?

Why Headless CMS Is Non-Negotiable for AI

Here's the critical insight we learned: AI applications require structured, machine-readable content with rich metadata. Traditional CMS platforms like WordPress are designed for direct website rendering; they mix content with presentation, output HTML, and lack semantic structure.

This is where Strapi's API-first architecture becomes essential:

1. RESTful API Out of the Box

Strapi automatically exposes your content via a RESTful API. No custom development, no HTML parsing nightmares. Our AI pipeline can pull structured JSON directly from Strapi with a simple GET request:

// Clean, structured data ready for AI processing
const response = await fetch('https://cms.example.com/api/articles?populate=*');
const articles = await response.json();
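
For more than a handful of articles, a single request is rarely enough. The sketch below shows one way to page through the collection using Strapi's documented `pagination[...]` and `filters[...]` query parameters. It assumes the `{ data, meta }` response shape of Strapi v4 and later; the base URL, the `articles` collection name, and the `category` value are placeholders rather than details of our production setup.

```javascript
// Hypothetical base URL; replace with your own Strapi instance.
const STRAPI_URL = 'https://cms.example.com';

// Build a paginated, optionally filtered URL for the Strapi REST API.
function buildArticlesUrl({ page = 1, pageSize = 25, category } = {}) {
  const params = new URLSearchParams();
  params.set('populate', '*');
  params.set('pagination[page]', String(page));
  params.set('pagination[pageSize]', String(pageSize));
  if (category) {
    // Assumes a `category` enumeration field exists on the content type.
    params.set('filters[category][$eq]', category);
  }
  return `${STRAPI_URL}/api/articles?${params.toString()}`;
}

// Fetch every page of articles. `fetchImpl` is injectable so the
// function can be exercised without a network connection.
async function fetchAllArticles(fetchImpl = fetch) {
  const articles = [];
  let page = 1;
  let pageCount = 1;
  do {
    const response = await fetchImpl(buildArticlesUrl({ page }));
    if (!response.ok) {
      throw new Error(`Strapi request failed with status ${response.status}`);
    }
    const body = await response.json();
    articles.push(...body.data);
    // Assumes the v4+ response shape: { data, meta: { pagination } }.
    pageCount = body.meta?.pagination?.pageCount ?? 1;
    page += 1;
  } while (page <= pageCount);
  return articles;
}
```

Injecting `fetchImpl` keeps the paging logic testable offline; in production the default global `fetch` is used.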

2. Rich Metadata Structure

We modelled content in Strapi with fields specifically designed for AI consumption:

Title and content (obvious);
Author and publication date (for credibility signals);
Category and tags (for semantic classification);
Target audience markers (citizen, policymaker, architect, scientist);
Source citations and references.

This metadata isn't just a nice-to-have. It's critical for our multi-agent system to make intelligent routing decisions.
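
To make that concrete, here is a minimal sketch of how article metadata could be flattened and then used for a simple routing decision. The audience labels come from the list above; the shape of the `article` object (for example `author.name` and `tags[].name`) and both function names are assumptions for illustration, not the exact structure Strapi returns in every version.

```javascript
// The four audience labels named in the article.
const AUDIENCES = ['citizen', 'policymaker', 'architect', 'scientist'];

// Flatten a (hypothetical) Strapi article into the metadata stored with each chunk.
function toChunkMetadata(article) {
  return {
    title: article.title,
    author: article.author?.name ?? 'unknown',
    publishedAt: article.publishedAt,
    category: article.category,
    tags: (article.tags ?? []).map((tag) => tag.name),
    // Drop any audience label that is not one of the four known values.
    targetAudience: (article.targetAudience ?? []).filter((a) => AUDIENCES.includes(a)),
  };
}

// Prefer chunks written for the detected audience; fall back to all
// chunks when nothing is tagged for that audience.
function routeByAudience(chunks, audience) {
  const matching = chunks.filter((chunk) =>
    chunk.metadata.targetAudience.includes(audience)
  );
  return matching.length > 0 ? matching : chunks;
}
```

The fallback matters: an untagged topic should still produce an answer rather than an empty result.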

3. Content Versioning and Audit Trail

Government applications demand complete traceability. Strapi's built-in versioning lets us track every content change, which is essential for compliance and debugging AI responses.

The Architecture: From Strapi to Intelligent Answers

Our solution uses Strapi as the single source of truth in a five-layer architecture:

Layer 1: Content Management (Strapi)

Non-technical scientists and policymakers manage content in Strapi's intuitive editor. No developer involvement is needed for content updates, which is critical for keeping scientific knowledge current.

Layer 2: Automated Data Pipeline

This is where Strapi's API shines. We built a fully automated pipeline:

// Simplified pipeline flow
async function syncContentToAI() {
  // 1. Fetch from the Strapi API
  const articles = await fetchFromStrapi();

  // 2. Transform into an AI-ready format
  const chunks = articles.flatMap(article =>
    chunkArticle(article, {
      maxChunkSize: 500,
      preserveMetadata: true
    })
  );

  // 3. Generate embeddings
  const embeddings = await generateEmbeddings(chunks);

  // 4. Store in the vector database
  await pinecone.upsert(embeddings);
}
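
The final step hides some detail. Vector databases such as Pinecone store records of the form `{ id, values, metadata }`, and large syncs are usually sent in batches. The sketch below pairs chunks with their vectors and upserts them batch by batch. The `index` argument is assumed to be any object exposing an `upsert(records)` method; the exact call shape of the Pinecone client depends on the SDK version, and the batch size of 100 is an illustrative default.

```javascript
// Pair each chunk with its embedding vector in { id, values, metadata } form.
function toVectorRecords(chunks, vectors) {
  if (chunks.length !== vectors.length) {
    throw new Error('Each chunk needs exactly one embedding vector');
  }
  return chunks.map((chunk, i) => ({
    id: chunk.id,
    values: vectors[i],
    // Keep the chunk text with the metadata so answers can quote it.
    metadata: { ...chunk.metadata, text: chunk.text },
  }));
}

// Split an array into consecutive batches of at most `batchSize` items.
function toBatches(items, batchSize = 100) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Upsert records sequentially, one batch at a time.
// `index` is an assumption: any object with an async upsert(records) method.
async function upsertInBatches(index, records, batchSize = 100) {
  for (const batch of toBatches(records, batchSize)) {
    await index.upsert(batch);
  }
}
```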

The chunking strategy deserves special attention. Large documents dilute retrieval relevance and can overwhelm a model's context. Our solution: intelligent chunking that splits articles into semantic units (paragraphs, sections) while preserving metadata. Each chunk retains its source URL, author, and category, which are essential for citations and credibility.

Why this matters: when a user asks about noise insulation, the AI retrieves only the 3-5 most relevant chunks instead of processing entire documents. This improves accuracy, reduces costs, and enables precise citations.
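
A minimal sketch of what a `chunkArticle` function along these lines might look like. It assumes the article body is plain text with blank lines between paragraphs and measures `maxChunkSize` in characters; the pipeline above does not state the unit, so treat both as assumptions. The `article.id` and `article.url` fields are likewise illustrative.

```javascript
// Split an article into paragraph-aligned chunks of at most `maxChunkSize`
// characters, copying source metadata onto every chunk.
function chunkArticle(article, { maxChunkSize = 500, preserveMetadata = true } = {}) {
  const paragraphs = article.content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter(Boolean);

  const texts = [];
  let current = '';
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length <= maxChunkSize || !current) {
      // Either the paragraph fits, or it starts a new chunk. A single
      // paragraph longer than maxChunkSize is kept whole rather than cut.
      current = candidate;
    } else {
      texts.push(current);
      current = paragraph;
    }
  }
  if (current) texts.push(current);

  return texts.map((text, index) => ({
    id: `${article.id}-${index}`,
    text,
    metadata: preserveMetadata
      ? { sourceUrl: article.url, author: article.author, category: article.category }
      : {},
  }));
}
```

Splitting on paragraph boundaries rather than at a fixed character offset keeps each chunk readable on its own, which helps when a chunk is quoted as a citation.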

Layer 3: Vector Database (Pinecone)

Chunks are transformed into vector embeddings and stored in Pinecone for semantic search. Unlike keyword search, semantic search understands intent:

Question: "How can I prevent my new housing development from being too noisy?"
Traditional search: Looks for exact matches of "housing development" and "noisy".
Semantic search: Understands the intent and retrieves content about sound insulation, urban design, and green buffers, even if these exact terms aren't in the question.
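
Under the hood, "understanding intent" comes down to comparing vectors. The sketch below ranks stored vectors by cosine similarity to a query vector, which is one common similarity metric for semantic search. The three-dimensional vectors in the usage note are toy values for illustration; real embeddings have hundreds or thousands of dimensions, and a vector database performs this ranking at scale.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k records whose vectors are most similar to the query.
function topK(queryVector, records, k = 3) {
  return records
    .map((record) => ({
      id: record.id,
      score: cosineSimilarity(queryVector, record.values),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

For example, with a query vector of `[1, 0.8, 0]`, a record at `[0.9, 0.7, 0.1]` ranks above one at `[0, 0.1, 1]` because it points in nearly the same direction, regardless of which words either text contains.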

Layer 4: Multi-Agent AI (Flowise)

Here's where it gets interesting. Instead of a monolithic AI, we built a multi-agent system where five specialised agents collaborate:

Agent 1: Scope Decision - Determines if the question falls within the guide's domain. Reduces the risk of hallucination by declining out-of-scope questions.

Agent 2: Out-of-Scope Handler - Provides helpful redirects when questions fall outside the scope, maintaining positive UX.

Agent 3: Vector Query - Performs a semantic search on Pinecone to retrieve relevant content chunks.

Agent 4: Audience Analysis - Analyses language patterns to identify user type (citizen, policymaker, architect, scientist) without requiring login or explicit profiles.

Agent 5: Answer Formulation - Synthesises inputs into personalised answers that:

Match the audience's expertise level;
Include source citations linking back to Strapi content;
Explicitly state audience assumptions ("This answer is aimed at a policymaker...").
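
In Flowise this flow is configured visually, but the control flow between the five agents can be sketched in plain JavaScript. Everything below is illustrative: the `agents` object and its method names are hypothetical stand-ins for the five agents, and running the vector query and audience analysis in parallel is an assumption, not a description of the production flow.

```javascript
// Orchestrate the five agents for a single question.
// `agents` is a hypothetical object with one async function per agent.
async function answerQuestion(question, agents) {
  // Agent 1: decide whether the question is in scope.
  const inScope = await agents.scopeDecision(question);
  if (!inScope) {
    // Agent 2: helpful redirect instead of a guessed answer.
    const redirect = await agents.outOfScopeHandler(question);
    return { inScope: false, answer: redirect, sources: [] };
  }

  // Agents 3 and 4 do not depend on each other, so this sketch runs them together.
  const [chunks, audience] = await Promise.all([
    agents.vectorQuery(question),
    agents.audienceAnalysis(question),
  ]);

  // Agent 5: combine retrieved content and audience into the final answer.
  const answer = await agents.answerFormulation({ question, chunks, audience });
  return {
    inScope: true,
    audience,
    answer,
    sources: chunks.map((chunk) => chunk.metadata.sourceUrl),
  };
}
```

Because each agent is just a function with a narrow contract, each can be stubbed and tested on its own, which is much of the maintainability argument for the multi-agent design.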

Layer 5: WCAG-Compliant Frontend

The final layer is a fully accessible frontend that meets WCAG 2.1 AA standards, ensuring the service is usable by everyone, including people with disabilities.
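
One of the AA criteria that is easy to verify in code is colour contrast: WCAG 2.1 requires a contrast ratio of at least 4.5:1 for normal-size text. The sketch below implements the relative luminance and contrast ratio formulas from the WCAG 2.1 definitions. It covers only this one criterion, accepts only six-digit hex colours, and is not a substitute for a full accessibility audit.

```javascript
// Convert one 8-bit sRGB channel to its linear value (WCAG 2.1 definition).
function linearize(channel) {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a colour given as "#rrggbb".
function relativeLuminance(hex) {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio between two colours: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(hexA, hexB) {
  const lumA = relativeLuminance(hexA);
  const lumB = relativeLuminance(hexB);
  const lighter = Math.max(lumA, lumB);
  const darker = Math.min(lumA, lumB);
  return (lighter + 0.05) / (darker + 0.05);
}

// AA threshold for normal-size text.
function meetsAANormalText(foreground, background) {
  return contrastRatio(foreground, background) >= 4.5;
}
```

A check like this can run in a build step over the design tokens, so a colour change that breaks AA contrast is caught before release.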

The Strapi Content Model: Optimising for AI

Our Strapi content structure for AI processing:

// Simplified Strapi content type
{
  "article": {
    "title": "String",
    "content": "RichText",
    "author": "Relation",
    "publishedAt": "DateTime",
    "category": "Enumeration",
    "tags": "Relation (many)",
    "targetAudience": "Enumeration (multiple)",
    "scientificReferences": "Component (repeatable)",
    "metaDescription": "Text",
    "seoKeywords": "Text"
  }
}

Key Decisions:

Rich text for content (preserves semantic structure);
Enumerated categories (prevents inconsistency);
Many-to-many tag relations (flexible classification);
Target audience markers (enables personalisation);
Separate scientific references (for citation integrity).
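
Enumerations only help if they are enforced before content reaches the pipeline. As a minimal sketch, a pre-sync check along these lines could reject incomplete articles. The audience values come from the content model above; the category values are hypothetical placeholders, since the real enumeration is not listed here, and `validateArticle` is an illustrative name.

```javascript
const ALLOWED_AUDIENCES = ['citizen', 'policymaker', 'architect', 'scientist'];
// Hypothetical category values; replace with the enumeration defined in Strapi.
const ALLOWED_CATEGORIES = ['insulation', 'urban-design', 'green-buffers'];

// Return a list of problems that would make an article unsuitable for
// AI processing. An empty list means the article passes.
function validateArticle(article) {
  const errors = [];
  for (const field of ['title', 'content']) {
    if (typeof article[field] !== 'string' || article[field].trim() === '') {
      errors.push(`${field} is required`);
    }
  }
  if (!ALLOWED_CATEGORIES.includes(article.category)) {
    errors.push(`unknown category "${article.category}"`);
  }
  const audiences = Array.isArray(article.targetAudience) ? article.targetAudience : [];
  if (audiences.length === 0) {
    errors.push('targetAudience needs at least one value');
  }
  for (const audience of audiences) {
    if (!ALLOWED_AUDIENCES.includes(audience)) {
      errors.push(`unknown audience "${audience}"`);
    }
  }
  return errors;
}
```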

Results: From Science to Impact

The results validate our architecture:

Operational Efficiency:

Content updates propagate to AI within minutes (previously: weeks of dev time)
40-60% reduction in helpdesk routine questions
Zero synchronisation issues (single source of truth)
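
One plausible way to get propagation within minutes is to trigger the sync from Strapi webhooks rather than a nightly job. The sketch below maps a webhook payload to a sync action. The event names (`entry.publish`, `entry.update`, `entry.unpublish`, `entry.delete`) and the `event`, `model` and `entry` payload fields are Strapi's standard webhook format; the `planSyncAction` function and its action labels are assumptions for illustration.

```javascript
// Decide what the AI pipeline should do for a given Strapi webhook payload.
function planSyncAction(payload, contentType = 'article') {
  // Ignore events for other content types or payloads without an entry.
  if (payload.model !== contentType || !payload.entry) {
    return { action: 'ignore' };
  }
  const articleId = payload.entry.id;
  switch (payload.event) {
    case 'entry.publish':
      return { action: 'upsert', articleId };
    case 'entry.update':
      // Only re-index edits to entries that are already published.
      return payload.entry.publishedAt
        ? { action: 'upsert', articleId }
        : { action: 'ignore' };
    case 'entry.unpublish':
    case 'entry.delete':
      // Remove the article's chunks so the AI stops citing it.
      return { action: 'delete', articleId };
    default:
      return { action: 'ignore' };
  }
}
```

Handling unpublish and delete events is as important as handling updates: otherwise the AI keeps citing content that editors have withdrawn.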

User Experience:

100% WCAG 2.1 AA compliance
Automatic language adaptation based on the inferred audience type
Complete transparency with source citations
Real-time content currency

Strategic Value:

Scientific knowledge that was inaccessible for years is now actionable
Scalable blueprint for other knowledge domains (air quality, energy, healthcare)
Reusable infrastructure dramatically reduces time-to-market

Key Takeaways for Strapi Developers

If you're building AI-powered knowledge services, here's what we learned:

1. Headless CMS Isn't Optional. It's Essential!

AI applications require API-first access to structured content. Strapi's architecture makes this trivial.

2. Design Content Models for Machines AND Humans

Think beyond website rendering. Structure your content types with rich metadata that enables intelligent AI decisions.

3. Embrace the Pipeline Mindset

Your CMS is the start of a data pipeline, not the end. Design with automated sync in mind.

4. Single Source of Truth Scales

One centralised content repository (Strapi) feeding multiple frontends (web, AI, mobile) prevents the synchronisation nightmares that plague traditional architectures.

5. Multi-Agent > Monolithic

Specialised agents with clear responsibilities produce more reliable, maintainable systems than monolithic AI solutions.

What's Next?

This architecture isn't sound-specific. The same blueprint applies to any domain where complex knowledge must be accessible: policy documentation, medical guidelines, technical specifications and educational content.

If you're sitting on valuable content that's trapped in inaccessible formats, consider this: Strapi's headless architecture doesn't just enable websites. It enables intelligent, personalised knowledge services that create real societal impact.

The future of content isn't just about publishing. It's about making knowledge truly accessible through AI. And that future is built on API-first, headless CMS platforms like Strapi.
