April 8, 2026

JINAI — Production Multi-Agent AI Command Center


Architected from zero as the operational nervous system of a €10M+ B2B flooring enterprise. Production-grade. On-premise first. GDPR-native. 16/16 CI gates passing.

25h manual entry saved per week
95% faster order processing
26% → 6% hallucination rate
420ms P90 latency
92% intent accuracy
Zero PII cloud egress

The Problem

A 20-person commercial team managed two national premium flooring brands (ARKITEK, SKUBA), 500+ B2B distributors, and €10M+ annual revenue across three disconnected systems: a legacy on-premise ERP, a cloud CRM, and two WooCommerce storefronts. Every order required triple manual re-entry. Price changes propagated via 5-person phone chains. Technical queries bounced through 3-day email threads. No automation. No single source of truth.

The Solution

JINAI is a production-grade multi-agent AI platform that became the single operational interface for the entire business — orchestrating inventory, orders, pricing, technical support, and marketing through a unified natural-language interface.

StockAgent: Real-time PHC stock + size/weight + accessories
CommercialAgent: Pipeline, seller performance, forecasting
EncomendasAgent: PDF auto-processing, dry-run, validation
ClientesAgent: Client lookup, credit, order history
PrecosAgent: Real-time discount parsing + margin quoting
SuporteAgent: Sub-3s RAG from technical manual library
EmailAgent: IMAP triage, filtering, LLM summarization
SyncAgent: Real-time CDC: ERP → CRM (52 fields)
MetaAgent: Meta Pages, Ads, Campaigns insights

MCP Integration Layer

All external system access is mediated through a Model Context Protocol (MCP) gateway — a standardised, tool-callable interface that decouples AI agents from API schemas, auth flows, and data formats. Adding a new integration requires exposing a new MCP tool, not rewriting agent behaviour.
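
The decoupling described above can be illustrated with a minimal sketch. It does not use the MCP SDK or reflect JINAI's actual gateway; it is a plain-Python registry that only shows the shape of the idea. The names `ToolGateway`, `erp.stock_lookup`, and the in-memory stock data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToolGateway:
    """Hypothetical stand-in for an MCP-style gateway: agents call tools by name."""

    _tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)

    def tool(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator that exposes a function as a named tool."""
        def register(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = fn
            return fn
        return register

    def call(self, name: str, **arguments: Any) -> Any:
        """Invoke a tool by name; agents never see the underlying API."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**arguments)

    def list_tools(self) -> List[str]:
        return sorted(self._tools)


gateway = ToolGateway()


@gateway.tool("erp.stock_lookup")
def stock_lookup(sku: str) -> dict:
    # Illustrative in-memory data; a real tool would query the ERP here.
    fake_stock = {"ARK-001": 120}
    return {"sku": sku, "units": fake_stock.get(sku, 0)}
```

Under this assumption, an agent only ever issues `gateway.call("erp.stock_lookup", sku="ARK-001")`; swapping the ERP backend changes the tool body, not the agent.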

PHC ERP
Zoho Bigin CRM
WooCommerce
Meta Business API
Google Ads (ready)
Qdrant RAG
Email IMAP
WhatsApp (ready)
Instagram (ready)

GDPR & EU AI Act Compliance

A deterministic sensitivity classifier intercepts every query before any LLM call. PII, financial data, and internal margins are hard-routed to Ollama (Llama 3.1) on-premise — zero cloud egress. Public queries cascade through a multi-provider AI Gateway with circuit breakers, automatic failover, and cost tracking per query. Dual audit trails: structured logs + human-readable knowledge vault. Article 12 compliant.
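
As a rough sketch of the routing rule described above, the snippet below classifies a query with fixed regular expressions and only lets non-sensitive queries reach cloud providers. The patterns, the `Route` enum, and the `answer` function are illustrative assumptions, not the production classifier; circuit breakers and cost tracking are omitted, and the final fallback to the local model is also an assumption.

```python
import re
from enum import Enum
from typing import Callable, Sequence


class Route(Enum):
    ON_PREMISE = "on_premise"
    CLOUD = "cloud"


# Illustrative patterns only; a real deployment would need a vetted rule set.
SENSITIVE_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),          # email address
    re.compile(r"\b\d{9}\b"),                          # 9-digit identifier
    re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),   # IBAN-like string
    re.compile(r"\b(margin|credit limit|invoice|discount)\b", re.IGNORECASE),
]


def classify(query: str) -> Route:
    """Deterministic: the same query always yields the same route."""
    if any(p.search(query) for p in SENSITIVE_PATTERNS):
        return Route.ON_PREMISE
    return Route.CLOUD


def answer(
    query: str,
    local_model: Callable[[str], str],
    cloud_providers: Sequence[Callable[[str], str]],
) -> str:
    """Route before any model call; sensitive queries never reach the cloud."""
    if classify(query) is Route.ON_PREMISE:
        return local_model(query)
    for provider in cloud_providers:
        try:
            return provider(query)
        except Exception:
            continue  # simple failover to the next provider
    # Assumption: if every cloud provider fails, fall back to the local model.
    return local_model(query)
```

Because classification happens before provider selection, a provider outage can never cause a sensitive query to spill over into the cloud cascade.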

Hallucination as an Engineering Problem

Four compounding layers reduced hallucination from 26% to 6%:

  1. Corrective RAG (CRAG) — retrieves past errors before generation; semantic reranking by cosine similarity
  2. Episodic Memory — user preferences and corrections persist across sessions
  3. Bidirectional Guardrails — input sanitization (OWASP LLM01) + output numeric validation against PHC raw data
  4. Nightly RAGAS Evaluation — faithfulness, relevancy, precision metrics against golden dataset; 12 red-team tests on every CI push
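
The output side of layer 3 can be sketched as follows, assuming the guardrail works by checking every number in a generated answer against values read from the ERP. The function names, the regular expression, and the tolerance are hypothetical; the source does not specify how the real validation is implemented.

```python
import re
from typing import List

# Matches standalone numbers such as "120" or "34,50"; digits embedded in
# identifiers like "ARK-001" are skipped by the lookbehind and lookahead.
NUMBER = re.compile(r"(?<![\w.,-])\d+(?:[.,]\d+)?(?!\w)")


def extract_numbers(text: str) -> List[float]:
    """Pull numeric literals out of free text, accepting a decimal comma."""
    return [float(m.replace(",", ".")) for m in NUMBER.findall(text)]


def unsupported_numbers(
    answer: str, source_values: List[float], tol: float = 0.01
) -> List[float]:
    """Return numbers in the answer that match no source value within tol."""
    return [
        n
        for n in extract_numbers(answer)
        if not any(abs(n - v) <= tol for v in source_values)
    ]
```

An answer would be blocked or regenerated whenever `unsupported_numbers` returns a non-empty list, for example when the model states 150 units while the raw data says 120.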

Architecture Enforced by Code

Five architectural fitness functions run as hard CI gates on every commit, within the 16-gate pipeline:

  • No agent may import a database driver directly — AST-validated
  • No agent may couple to more than three others — import graph analysis
  • @traced decorator coverage >50% on sensitive modules
  • All client-facing agents must reference PII guardrails
  • All ADR decisions must have implementation evidence

16/16 passing. Every deploy. Every time.
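
The first gate (no direct database driver imports) could be checked with Python's standard `ast` module, roughly as below. This is a sketch under assumptions: the `FORBIDDEN_DRIVERS` set and the function name are illustrative, and the real gate may cover more cases.

```python
import ast
from typing import Set

# Assumed list of driver packages; the real gate's list is not given in the source.
FORBIDDEN_DRIVERS = {"psycopg2", "pyodbc", "pymssql", "sqlite3"}


def forbidden_imports(source: str) -> Set[str]:
    """Return the forbidden top-level packages imported by a module's source."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in FORBIDDEN_DRIVERS:
                found.add(root)
    return found
```

A CI step could run this over every agent module and fail the build when the returned set is non-empty. Walking the AST rather than grepping avoids false positives from comments and strings.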

Business Impact

Orders: 15–25 min → <1 min per order. PDF upload → OCR → price validation → dry-run preview → PHC creation, with a human in the loop at every critical step.
Stock: 5-person daily phone chain replaced by a predictive depletion dashboard: 6-month moving average with seasonality detection, tracked in MLflow.
Support: 3-day email chains → sub-3-second RAG answers from indexed product manuals, each cited with page references.
Claims: Telegram bot receives damage photos → Gemini 1.5 Flash vision analysis → auto-creates CRM deal → human-in-the-loop approval. 3-day chain eliminated.
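
The stock forecast rests on a 6-month moving average. A minimal sketch of that calculation follows, assuming monthly unit sales as input and ignoring the seasonality detection and MLflow tracking mentioned above; the function names are hypothetical.

```python
from statistics import fmean
from typing import List


def moving_average(monthly_sales: List[float], window: int = 6) -> float:
    """Mean of the most recent `window` months of unit sales."""
    if len(monthly_sales) < window:
        raise ValueError(f"need at least {window} months of history")
    return fmean(monthly_sales[-window:])


def months_until_depletion(
    stock: float, monthly_sales: List[float], window: int = 6
) -> float:
    """Estimate how many months current stock lasts at the recent sales rate."""
    average = moving_average(monthly_sales, window)
    if average <= 0:
        return float("inf")  # no recent sales, so no projected depletion
    return stock / average
```

With sales of 100, 120, 80, 100, 110, and 90 units over the last six months, the average is 100 units per month, so 250 units in stock would last about 2.5 months.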

Production SLOs

Metric              Target     Actual    Status
P90 Latency         <3000ms    ~420ms    Met
TTFT                <200ms     ~108ms    Met
MTTR                <5min      <1min     Met
Intent Accuracy     >90%       92%       Met
Hallucination Rate  <10%       ~6%       Met
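
For reference, a P90 latency figure like the one above can be computed from raw samples with the nearest-rank method. This is only a sketch: how the production system actually aggregates latency is not stated, and `percentile` and `meets_slo` are hypothetical helpers.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_slo(latencies_ms: Sequence[float], target_ms: float, pct: int = 90) -> bool:
    """True when the chosen percentile is strictly below the target."""
    return percentile(latencies_ms, pct) < target_ms
```

A check such as `meets_slo(latencies_ms, 3000)` could then back the P90 row of the table.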

Tech Stack

Backend: Python 3.12 · FastAPI · LangGraph · LangChain · Pydantic v2 · Asyncio
Data & AI: Redis · Qdrant (on-premise) · Ollama (Llama 3.1) · Gemini 1.5 Flash · DeepSeek · MLflow · RAGAS
Observability: OpenTelemetry · Jaeger · Langfuse · Structlog
Frontend: Next.js 14 · TypeScript · TailwindCSS · React Streaming
Infrastructure: Docker · Kubernetes · Terraform (AWS EKS) · GitHub Actions · MCP
Compliance: GDPR Article 12 · EU AI Act · OWASP LLM Top 10