From RAG to Reality: Designing an Operational Ontology for AI Systems

Most AI systems today stop at retrieval.
They search documents, fetch chunks, prompt an LLM, and return text. This works — until it doesn’t.
As soon as you need trust, traceability, decisions, or multi-step reasoning, the cracks appear:
- hallucinations without accountability
- answers without provenance
- dashboards that can’t act
- agents that can’t be governed
At Polyvia, we realized the problem isn’t the model. It’s the missing semantic layer between data and action.
This post introduces the architecture we’re building: an operational ontology designed to power RAG, agents, and real decisions.
Why RAG alone is not enough
Retrieval-Augmented Generation (RAG) improves accuracy by grounding LLMs in documents. But classical RAG systems are still text-centric:
- chunks are retrieved
- prompts are assembled
- answers are generated
- context is discarded
What’s missing:
- no persistent notion of facts
- no explicit uncertainty
- no memory of why an answer was given
- no way to act on results safely
In short: RAG answers questions, but it doesn’t model reality.
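To make the contrast concrete, here is roughly what that loop looks like in code. This is a minimal sketch, not any particular framework's API; the retrieval and generation functions are hypothetical stand-ins injected by the caller.

```python
from typing import Callable, Sequence

# A minimal sketch of the classical RAG loop. The embedding search and
# the LLM call are injected, since they vary by stack; both function
# arguments are hypothetical stand-ins.
def classical_rag(
    query: str,
    retrieve: Callable[[str], Sequence[str]],  # embed the query, search a vector store
    generate: Callable[[str], str],            # call the LLM
) -> str:
    chunks = retrieve(query)                                  # chunks are retrieved
    prompt = "\n\n".join(chunks) + f"\n\nQuestion: {query}"   # prompts are assembled
    return generate(prompt)  # an answer goes out; context and provenance are discarded
```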
The core idea: an operational ontology
An ontology is often misunderstood as an academic artifact or a static knowledge graph.
We use the term differently.
An operational ontology is a contract between data, meaning, and action.
It defines:
- What exists (objects)
- How things relate (relationships)
- What is believed to be true (claims)
- Why we believe it (evidence)
- What can be done about it (actions)
This turns an AI system from a Q&A engine into a decision-capable platform.
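As a first sketch of that contract, here are the first two layers as plain Python types. Names and fields are illustrative rather than our production schema; the remaining layers get their own sketches in the sections below.

```python
from dataclasses import dataclass

# Two of the five layers as plain types. Field names are illustrative.
@dataclass
class OntologyObject:          # what exists
    id: str
    kind: str                  # "Document", "Entity", "Claim", ...

@dataclass
class Relationship:            # how things relate
    source_id: str
    target_id: str
    kind: str                  # "mentions", "supports", "part_of", ...
```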
The Polyvia Ontology v1 (high-level)
1. Objects, not tables
We model real-world nouns as first-class objects:
- Document – the canonical source (PDF, webpage, email)
- DocumentPart – pages, sections, tables, figures
- Entity – companies, people, regulations, systems
- Claim – atomic, checkable statements
- Evidence – precise support for claims
- Insight – synthesized conclusions
- WorkflowRun – traceability for AI and pipelines
Crucially:
One object is backed by many data sources — never 1:1 with storage.
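Here is one possible shape for the first two objects, again as an illustrative sketch rather than the production schema. Note that a Document carries a list of source URIs: one object, many backing data sources.

```python
from dataclasses import dataclass, field

# Illustrative shapes for Document and DocumentPart.
@dataclass
class Document:
    id: str
    title: str
    source_uris: list[str] = field(default_factory=list)  # never 1:1 with storage

@dataclass
class DocumentPart:
    id: str
    document_id: str
    kind: str      # "page" | "section" | "table" | "figure"
    locator: str   # e.g. "page=12" or "table=3"
```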
2. Claims as the unit of truth
Instead of treating answers as text, we promote claims to first-class citizens.
A claim is:
- atomic (“Company X revenue was €12M in 2023”)
- time-bound
- confidence-scored
- explicitly supported by evidence
- traceable to humans or agents
Claims can be:
- proposed
- verified
- disputed
- retracted
This gives the system memory, uncertainty, and accountability.
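Put together, a claim record might look like this. The field names are illustrative, not our production schema, but each one maps directly to a property in the lists above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class ClaimStatus(Enum):
    PROPOSED = "proposed"
    VERIFIED = "verified"
    DISPUTED = "disputed"
    RETRACTED = "retracted"

# An illustrative claim record matching the properties above.
@dataclass
class Claim:
    id: str
    statement: str            # atomic: "Company X revenue was €12M in 2023"
    valid_from: date          # time-bound
    valid_to: date | None
    confidence: float         # confidence-scored, 0.0 to 1.0
    evidence_ids: list[str]   # explicitly supported by evidence
    proposed_by: str          # human or agent identifier, so it stays traceable
    status: ClaimStatus = ClaimStatus.PROPOSED
```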
3. Evidence beats chunks
Chunks are implementation details. Evidence is a semantic pointer.
Evidence always knows:
- where it comes from (page, table cell, figure)
- how strong it is
- what it supports or contradicts
LLMs never cite raw text — they cite claims backed by evidence.
This single shift dramatically reduces hallucinations: a statement that cannot be tied to evidence never becomes a claim, and so never reaches an answer.
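An evidence record can be small. Here is an illustrative shape, following the properties listed above:

```python
from dataclasses import dataclass

# An illustrative evidence record: a semantic pointer, not a text blob.
@dataclass
class Evidence:
    id: str
    part_id: str    # which DocumentPart it comes from
    locator: str    # precise position: page, table cell, figure
    strength: float # how strong the support is, 0.0 to 1.0
    claim_id: str   # which claim it bears on
    stance: str     # "supports" | "contradicts"
```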
4. State and time are non-optional
Real systems change.
Documents move from ingested → parsed → indexed. Claims evolve from proposed → verified → disputed.
State transitions are events, not overwrites.
This enables:
- audit trails
- review queues
- rollback
- temporal reasoning (“what did we believe last month?”)
Without state, you don’t have governance — just outputs.
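A minimal event-log sketch shows why events beat overwrites: replaying the log answers the temporal question directly. The helper names here are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Transitions are appended as events, never overwritten, so history stays queryable.
@dataclass(frozen=True)
class StateEvent:
    object_id: str
    from_state: str   # e.g. "proposed"
    to_state: str     # e.g. "verified"
    actor: str        # human or agent
    at: datetime

EVENT_LOG: list[StateEvent] = []

def transition(object_id: str, from_state: str, to_state: str, actor: str) -> None:
    EVENT_LOG.append(StateEvent(object_id, from_state, to_state, actor,
                                datetime.now(timezone.utc)))

def state_as_of(object_id: str, when: datetime, initial: str = "proposed") -> str:
    """Replay the log to recover an object's state at any past moment."""
    state = initial
    for e in EVENT_LOG:
        if e.object_id == object_id and e.at <= when:
            state = e.to_state
    return state
```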
5. Actions turn data into software
The biggest leap beyond traditional knowledge graphs is actions.
Objects expose actions such as:
- VerifyClaim
- DisputeClaim
- ReindexDocument
- ResolveEntity
- PublishInsight
Actions:
- have preconditions
- check permissions
- mutate state
- emit audit events
- can be triggered by humans or agents
This is where AI systems become operational, not just analytical.
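Here is what one such action could look like, reusing the Claim and event-log sketches from above. The permission string and error type are illustrative, but the shape is the point: preconditions, then permissions, then a state mutation recorded as an audit event.

```python
class ActionError(Exception):
    pass

def verify_claim(claim: Claim, actor: str, permissions: set[str]) -> None:
    # Precondition: only proposed or disputed claims can be verified.
    if claim.status not in (ClaimStatus.PROPOSED, ClaimStatus.DISPUTED):
        raise ActionError(f"cannot verify a {claim.status.value} claim")
    if not claim.evidence_ids:
        raise ActionError("cannot verify a claim with no evidence")
    # Permission check: works the same for humans and agents.
    if "claims:verify" not in permissions:
        raise ActionError(f"{actor} lacks claims:verify")
    # State mutation, recorded as an event rather than a silent overwrite.
    transition(claim.id, claim.status.value, ClaimStatus.VERIFIED.value, actor)
    claim.status = ClaimStatus.VERIFIED
```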
Mapping the ontology to RAG
The ontology doesn’t replace RAG — it structures it.
Classical RAG
Query → retrieve chunks → prompt → answer
Ontology-driven RAG
Query
→ retrieve candidate objects
→ select evidence
→ propose or reuse claims
→ synthesize answer from claims
→ update ontology state
Retrieval still uses embeddings. But reasoning happens at the claim layer, not the text layer.
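Side by side with the classical sketch earlier, the ontology-driven loop might look like this, with each stage injected. The stage signatures are illustrative; what matters is that the answer is synthesized from claims and the ontology is updated before anything is returned.

```python
from typing import Callable, Sequence

# An illustrative ontology-driven RAG loop, reusing the Evidence and Claim
# sketches from earlier.
def ontology_rag(
    query: str,
    retrieve_objects: Callable[[str], Sequence[str]],       # embeddings still do retrieval
    select_evidence: Callable[[Sequence[str]], list["Evidence"]],
    propose_or_reuse: Callable[[list["Evidence"]], list["Claim"]],
    synthesize: Callable[[list["Claim"]], str],
    record_run: Callable[[str, list["Claim"]], None],       # WorkflowRun bookkeeping
) -> str:
    object_ids = retrieve_objects(query)
    evidence = select_evidence(object_ids)
    claims = propose_or_reuse(evidence)
    answer = synthesize(claims)    # reasoning happens at the claim layer
    record_run(query, claims)      # update ontology state
    return answer
```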
Agents in this architecture
Agents are no longer free-floating prompt loops.
Each agent has a role, permissions, and allowed outputs:
- IngestAgent – creates documents and parts
- ExtractAgent – proposes claims with evidence
- ResolveAgent – disambiguates and links entities
- VerifyAgent – strengthens or disputes claims
- AnswerAgent – synthesizes insights
- GovernanceAgent – enforces policy and redaction
Agents are constrained by the ontology:
If there’s no evidence, they must say “unknown”.
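Enforced in code rather than in a prompt, that constraint can be a few lines, reusing the Claim sketch from earlier:

```python
from typing import Callable

# One way the "no evidence -> unknown" rule could be enforced outside the model.
def constrained_answer(
    claims: list[Claim],
    synthesize: Callable[[list[Claim]], str],
) -> str:
    supported = [c for c in claims if c.evidence_ids]
    if not supported:
        return "unknown"   # the ontology, not the LLM, makes this call
    return synthesize(supported)
```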
Why this matters for enterprise AI
This architecture enables things that pure RAG cannot:
- explainable answers
- auditability for regulators
- safe autonomous agents
- long-term memory
- decision tracking
- human-in-the-loop workflows
It’s how you go from:
“The model says…” to “The system believes this, based on these sources, with this confidence, and here’s what we can do next.”
Final thought
LLMs are powerful, but they are not systems.
Systems require:
- structure
- memory
- rules
- state
- accountability
An operational ontology is how we give AI those properties.
At Polyvia, this is the foundation we’re building on — not because it’s fancy, but because it’s the minimum structure required for AI to interact safely with reality.