Knowledge graphs · Ontology · GraphRAG · Embeddings

After the knowledge graph.

Ontology-grounded LLMs, GraphRAG, and statistical discovery. The operational sequel.

Damien Huzard, PhD · Neuronautix · 27 May 2026 · 12 min

A knowledge graph is not an endpoint.

Where we left off

May 18: structure first, generate last.

Today: you have the graph. What earns its place on top of it — and what does not?

Definitions

Four objects. Often one word.

Graph database

Storage + query. Neo4j, Memgraph, Kùzu, Jena Fuseki, Oxigraph.

Knowledge graph

Identifiable entities and assertions, with provenance and controlled semantics.

Ontology

Classes, relations, constraints. RDF/OWL/SKOS + SHACL validation.

GraphRAG index

Application retrieval structure: entities, communities, summaries. Not a validated ontology.

The common failure

LLM-extracted graph, stored as canonical knowledge.

Duplicated entities, no normalisation

Invalid relation types, no validation

No provenance, no status field, no audit

LLM output is a candidate proposal — not ground truth
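
One way to treat LLM output as a proposal is to gate it at intake. A minimal sketch, assuming a hypothetical alias table, a hypothetical set of allowed relation types, and placeholder identifiers (none of these come from the talk):

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary: surface form -> canonical identifier.
ALIASES = {
    "aspirin": "EX:0001",
    "acetylsalicylic acid": "EX:0001",
    "headache": "EX:0002",
}
# Hypothetical set of relation types the ontology permits.
ALLOWED_RELATIONS = {"treats", "causes"}


@dataclass(frozen=True)
class Candidate:
    subject: str
    relation: str
    obj: str
    source: str              # passage-level provenance (placeholder format)
    status: str = "candidate"


def intake(raw_triples):
    """Split raw LLM triples into deduplicated candidates and rejects with reasons."""
    accepted, rejected = set(), []
    for subj, rel, obj, source in raw_triples:
        s = ALIASES.get(subj.strip().lower())
        o = ALIASES.get(obj.strip().lower())
        if s is None or o is None:
            rejected.append(((subj, rel, obj), "unknown entity"))
        elif rel not in ALLOWED_RELATIONS:
            rejected.append(((subj, rel, obj), "invalid relation type"))
        elif not source:
            rejected.append(((subj, rel, obj), "missing provenance"))
        else:
            # Normalised identifiers make duplicates collapse in the set.
            accepted.add(Candidate(s, rel, o, source))
    return accepted, rejected


raw = [
    ("Aspirin", "treats", "Headache", "doi:placeholder#p3"),
    ("acetylsalicylic acid", "treats", "headache", "doi:placeholder#p3"),
    ("aspirin", "cures", "headache", "doi:placeholder#p4"),
    ("aspirin", "treats", "headache", ""),
]
accepted, rejected = intake(raw)
```

The two spellings of the same entity collapse into one candidate; the invalid relation and the triple without provenance are kept aside with a reason instead of being stored.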

Architecture

Three layers. Don't merge them.

Evidence

Documents, passages, datasets. Immutable identifiers (DOI/PMID), passage-level provenance, evidence grading, human review.

Ontology

Classes, relations, constraints. RDF/OWL/SKOS + SHACL validation, controlled vocabularies, versioning.

Application

GraphRAG summaries, embeddings, link prediction. Benchmarking, confidence reporting, audit trail.

candidate | validated | rejected | disputed

The single most useful field in the whole graph.
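
A minimal sketch of what the status field could look like in code. The four values come from the slide; the class names, the audit tuple, and the `canonical_view` helper are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    CANDIDATE = "candidate"
    VALIDATED = "validated"
    REJECTED = "rejected"
    DISPUTED = "disputed"


@dataclass
class Assertion:
    subject: str
    relation: str
    obj: str
    source: str                                  # placeholder provenance string
    status: Status = Status.CANDIDATE            # model output starts here
    audit: list = field(default_factory=list)    # (reviewer, old status, new status)

    def review(self, reviewer, new_status):
        """Change the status and record who changed it."""
        self.audit.append((reviewer, self.status, new_status))
        self.status = new_status


def canonical_view(assertions):
    """Expose only validated assertions as knowledge."""
    return [a for a in assertions if a.status is Status.VALIDATED]


graph = [
    Assertion("EX:0001", "treats", "EX:0002", "doi:placeholder#p3"),
    Assertion("EX:0001", "causes", "EX:0003", "doi:placeholder#p9"),
]
graph[0].review("curator-1", Status.VALIDATED)
trusted = canonical_view(graph)
```

Application-layer tools can still read candidates; the canonical layer only ever sees what `canonical_view` returns.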

GraphRAG · Calibration

Complementary. Not superior.

Vanilla RAG tends to win

Single-hop factual retrieval. Detailed information lookup. When the answer is one well-indexed passage.

GraphRAG tends to win

Multi-hop reasoning. Corpus-wide sensemaking. When the answer requires traversing relationships.

The honest read

Recent benchmarks find that graph-based retrieval can underperform vanilla RAG on real-world tasks. Graph use has to be justified empirically, not assumed.

The family

Four patterns. Different jobs.

Microsoft GraphRAG

Entity graph + community summaries. Global / Local / DRIFT search. Sensemaking over large corpora.

LightRAG

Dual-level graph + vector retrieval. Incremental updates. Suited to fast-evolving corpora.

HippoRAG

KG-like memory + Personalized PageRank. Multi-hop associative retrieval.

OG-RAG

Ontology-grounded retrieval. Minimal, conceptually grounded context. Preprint stage.
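
To make the Personalized PageRank idea behind HippoRAG-style retrieval concrete, here is a minimal power-iteration sketch in plain Python. It illustrates the general algorithm only, not HippoRAG's implementation; the toy graph, node names, damping factor, and iteration count are assumptions.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration where random restarts jump back to the seed nodes only."""
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: hand its mass back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[node] * restart[n]
                continue
            share = damping * rank[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        rank = nxt
    return rank


# Toy adjacency list; every neighbour also appears as a key.
toy = {
    "query_entity": ["a", "b"],
    "a": ["query_entity", "c"],
    "b": ["query_entity"],
    "c": ["a", "far"],
    "far": ["c"],
}
scores = personalized_pagerank(toy, seeds={"query_entity"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

Nodes close to the query entity score higher than distant ones, which is what lets a retriever follow associations across several hops from the entities named in a question.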

Clinical case study · DR.KNOWS

KG paths can help. When path quality is good.

UMLS-derived KG paths improved diagnostic prediction with LLMs. Irrelevant or contradictory paths impaired it. Path ranking and provenance are first-class requirements — not afterthoughts.

Generalising the lesson

Any "retrieve then reason" architecture inherits this property. Bad retrieval is worse than no retrieval, because it confidently misroutes the model.
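
A small sketch of what "path ranking and provenance as first-class requirements" might mean at the code level: rank paths, require a source, and return nothing rather than pad the prompt with weak paths. The relevance scores are assumed to come from some hypothetical ranker; the threshold, `top_k`, and all identifiers are placeholders.

```python
from dataclasses import dataclass


@dataclass
class KGPath:
    hops: tuple        # (head, relation, tail), placeholder names
    relevance: float   # score from a hypothetical ranker, in [0, 1]
    source: str        # provenance of the edges used; empty means unknown


def select_context(paths, min_relevance=0.6, top_k=3):
    """Keep the best-scoring paths that carry provenance; may return an empty list."""
    usable = [p for p in paths if p.source and p.relevance >= min_relevance]
    usable.sort(key=lambda p: p.relevance, reverse=True)
    return usable[:top_k]


paths = [
    KGPath(("finding_1", "associated_with", "condition_a"), 0.82, "src:placeholder-1"),
    KGPath(("finding_1", "co_occurs_with", "condition_b"), 0.35, "src:placeholder-2"),
    KGPath(("finding_2", "associated_with", "condition_a"), 0.91, ""),
]
context = select_context(paths)          # one path survives
no_context = select_context(paths[1:])   # nothing usable: the model gets no paths
```

The empty result is deliberate: under this sketch the caller falls back to answering without graph context instead of passing along a low-quality path.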

Pick by question, not by trend

No universal graph database.

Neo4j + GraphRAG

Interactive apps, hybrid retrieval, graph-assisted LLM systems. Formal ontology semantics via neosemantics bridge.

Jena Fuseki / Oxigraph

Standards-native RDF/SPARQL. Canonical scientific ontology + FAIR linked-data layer.

Kùzu

Embedded property graph. Desktop or single-project analytical prototypes.

Memgraph / Apache AGE

Memgraph: real-time operational graph. Apache AGE: graph extension for PostgreSQL-centric systems.

Embeddings · Link prediction

Predicted edges are ranked hypotheses.

KGE baselines

TransE, ComplEx, RotatE. Bilinear / rotational scoring. PyKEEN, DGL-KE for reproducible experiments.

Text + graph fusion

FuseLinker: LLM-derived text embeddings fused with GNN graph embeddings for link prediction. Reported to outperform text-only or graph-only inputs for biomedical KG completion.

The consumer

Literature retrieval. Experimental review. Curator triage. Not the canonical evidence graph.
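
To show why predicted edges read as ranked hypotheses, here is a toy TransE scorer in plain Python. The embeddings are set by hand for illustration and are not trained; entity and relation names are hypothetical. A real experiment would train and evaluate with a library such as PyKEEN.

```python
import math

# Toy 2-D embeddings, set by hand for illustration (not trained).
ENTITY = {
    "drug_a": (0.0, 0.0),
    "target_x": (1.0, 0.0),
    "target_y": (1.1, 0.2),
    "target_z": (-1.0, 2.0),
}
RELATION = {"binds": (1.0, 0.0)}


def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between h + r and t."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    translated = [hi + ri for hi, ri in zip(h, r)]
    return -math.dist(translated, t)


def rank_tails(head, relation, known=()):
    """Rank unseen tails for (head, relation, ?), best first."""
    candidates = [e for e in ENTITY if e != head and e not in known]
    scored = [(e, transe_score(head, relation, e)) for e in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


hypotheses = rank_tails("drug_a", "binds", known={"target_x"})
```

The output is an ordered list of candidates with scores, suitable for curator triage; nothing in it says an edge is true.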

Ontology evolution

Controlled loop. Not free-form generation.

LLM proposes

Candidate mappings with rationale.

Ontology filters

Structural impossibilities removed.

Embedding ranks

OWL2Vec*, OntoAligner, LLMs4OM.

Human validates

High-impact mappings only.

Rule reused

Accepted mapping joins the library.

Why precision over recall

For ontology maintenance, one invalid mapping contaminates downstream queries. High top-rank precision matters more than high recall.
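
The loop and its metric can be sketched in a few functions. Everything here is a stand-in: the class-compatibility table, the proposal records, the scores (assumed to come from an embedding model), and the use of a fixed review budget as a proxy for "high-impact mappings only".

```python
# Hypothetical ontology constraint: which class pairs may be aligned.
COMPATIBLE = {("Behaviour", "Behaviour"), ("Assay", "Assay")}
rule_library = {}  # accepted mappings, available for reuse on later runs


def filter_impossible(proposals):
    """Drop proposals whose source and target classes cannot be aligned."""
    return [p for p in proposals if (p["src_class"], p["tgt_class"]) in COMPATIBLE]


def rank(proposals):
    """Order by a similarity score assumed to come from an embedding model."""
    return sorted(proposals, key=lambda p: p["score"], reverse=True)


def run_loop(proposals, human_accepts, review_budget=2):
    """Filter, rank, send the top of the list to a human, store what is accepted."""
    survivors = rank(filter_impossible(proposals))
    for p in survivors[:review_budget]:
        if human_accepts(p):
            rule_library[p["src"]] = p["tgt"]
    return survivors


def precision_at_k(ranked, gold, k):
    """Fraction of the top-k proposals that appear in the curated gold set."""
    top = ranked[:k]
    return sum((p["src"], p["tgt"]) in gold for p in top) / len(top) if top else 0.0


proposals = [
    {"src": "a:rearing", "tgt": "b:rearing_behaviour",
     "src_class": "Behaviour", "tgt_class": "Behaviour", "score": 0.91},
    {"src": "a:open_field", "tgt": "b:grooming",
     "src_class": "Assay", "tgt_class": "Behaviour", "score": 0.95},
    {"src": "a:grooming", "tgt": "b:self_grooming",
     "src_class": "Behaviour", "tgt_class": "Behaviour", "score": 0.80},
    {"src": "a:freezing", "tgt": "b:rearing_behaviour",
     "src_class": "Behaviour", "tgt_class": "Behaviour", "score": 0.40},
]
gold = {("a:rearing", "b:rearing_behaviour"), ("a:grooming", "b:self_grooming")}
ranked = run_loop(proposals, human_accepts=lambda p: (p["src"], p["tgt"]) in gold)
```

Note that the highest-scoring proposal is removed by the ontology filter before any human sees it, and that precision is measured at the top of the list, where curator time is spent.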

Before any custom model

Four experiments. In this order.

1. Retrieval

Vector RAG vs ontology-filtered vs graph traversal vs hybrid GraphRAG. 50–100 competency questions. Single-hop vs multi-hop separated.

2. Extraction

Unconstrained LLM triples vs ontology-constrained + SHACL. Measure relation precision, SHACL violation rate, curator burden.

3. Embeddings

Text-only vs KGE-only vs fused vs ontology-aware fused. Temporal split for discovery tasks.

4. Ontology evolution

DeepOnto / LLMs4OM / OntoAligner on curated mappings + hard negatives. Top-rank precision, not recall.
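
For experiment 1, the key design point is reporting single-hop and multi-hop questions separately. A minimal harness sketch, assuming each retriever is a callable that returns a ranked list of passage ids; the stub retrievers and question records below are toy stand-ins, and their numbers mean nothing.

```python
from collections import defaultdict


def hit_rate_by_hops(retriever, questions, k=5):
    """Share of questions whose gold passage is in the top-k results, per hop class."""
    hits, totals = defaultdict(int), defaultdict(int)
    for q in questions:
        totals[q["hops"]] += 1
        if q["gold"] in retriever(q["text"])[:k]:
            hits[q["hops"]] += 1
    return {hops: hits[hops] / totals[hops] for hops in totals}


# Toy competency questions, labelled by hop class.
questions = [
    {"text": "q1", "hops": "single", "gold": "p1"},
    {"text": "q2", "hops": "single", "gold": "p2"},
    {"text": "q3", "hops": "multi", "gold": "p3"},
    {"text": "q4", "hops": "multi", "gold": "p4"},
]

# Stub retrievers: lookup tables standing in for real systems.
stubs = {
    "vector": {"q1": ["p1"], "q2": ["p2"], "q3": ["p9"], "q4": ["p4"]},
    "graph": {"q1": ["p1"], "q2": ["p8"], "q3": ["p3"], "q4": ["p4"]},
}
report = {
    name: hit_rate_by_hops(lambda text, table=table: table.get(text, []), questions)
    for name, table in stubs.items()
}
```

Keeping the two hop classes apart prevents an average from hiding the case where one retriever wins on lookup and the other on traversal.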

Temporal split. Always.

Random splits leak future information into model development and exaggerate real-world performance.
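
A minimal temporal split, assuming each edge carries a publication date. The field names, the cutoff, and the toy edges are illustrative assumptions. Test triples already present in the training period are dropped so that a restated fact does not count as a discovery.

```python
from datetime import date


def temporal_split(edges, cutoff):
    """Train on edges published before the cutoff; test on new triples from the cutoff onward."""
    train = [e for e in edges if e["published"] < cutoff]
    seen = {e["triple"] for e in train}
    test = [e for e in edges if e["published"] >= cutoff and e["triple"] not in seen]
    return train, test


edges = [
    {"triple": ("a", "r", "b"), "published": date(2019, 5, 1)},
    {"triple": ("a", "r", "c"), "published": date(2021, 3, 1)},
    {"triple": ("a", "r", "b"), "published": date(2023, 1, 10)},  # restated later
    {"triple": ("b", "r", "d"), "published": date(2023, 6, 2)},
]
train, test = temporal_split(edges, cutoff=date(2022, 1, 1))
```

Every training edge predates every test edge, so the model is evaluated on links it could not have seen at training time.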

It depends on the project

Two legitimate priority orders.

Scientific ontology (HCMO / MBO / NAMO)

Ontology and evidence quality first. Interoperable semantic representation. Validated retrieval. GraphRAG and prediction last. Canonical layer in RDF/OWL/SHACL, Neo4j as application projection.

Discovery corpus (founder docs, org intel)

Rapid corpus integration. Traceable exploratory retrieval. Graph-based synthesis. Ontology consolidation over time. Neo4j as principal operational graph, RDF/SHACL export later.

The takeaway

The KG is the substrate. Not the answer.

Distinguish DB, KG, ontology, and GraphRAG

Separate evidence, ontology, and application

Keep model output behind a status field

Calibrate GraphRAG against vanilla RAG

Read predicted edges as ranked hypotheses

Benchmark before you build custom models

Structure first. Generate last.

Then prove the graph earned its place.

Damien Huzard, PhD · Neuronautix · LIRMM ontology-constrained LLM navigation workstream
Companion note: neuronautix.com/notes/2026-05-after-the-knowledge-graph
neuronautix.com/contact · metadatapp.net