LLMs · Knowledge graphs · FAIR metadata

Why scientific LLM workflows need metadata, ontologies, and graphs.

Not just longer context.

Damien Huzard, PhD · Neuronautix · 18 May 2026 · 12 min

More context. Less understanding.

The anti-pattern

When the LLM underperforms — give it more documents.

The universal-LLM-reader reflex. More tokens, more retrieval, more context — and still the same brittle output.

Long context

Long context is not understanding.

Relevant content lost in the middle
Few models hold accuracy past 64k tokens
RAG retains a clear cost advantage
"Just send everything" is technically weak

Retrieval architecture

RAG retrieves chunks. Not relationships.

Chunk-only retrieval

Top-k passages by similarity. Isolated fragments. No entity relationships. Reassembly is the LLM's job.

Graph-guided retrieval

Entities, relations, paths. Coherent multi-hop context. Relationships preserved. An architecture, not a single tool.
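
A minimal sketch of the contrast above, assuming a toy in-memory store of typed triples: instead of returning isolated passages, the retriever walks relations breadth-first and returns the connecting path. The entity names, the triples, and the `find_path` helper are hypothetical and only illustrate the idea; they do not come from any particular graph database.

```python
from collections import deque

# Hypothetical typed triples (subject, relation, object), for illustration only.
TRIPLES = [
    ("compound_A", "inhibits", "protein_B"),
    ("protein_B", "expressed_in", "tissue_C"),
    ("tissue_C", "part_of", "organ_D"),
    ("compound_A", "studied_in", "paper_1"),
]

def neighbours(entity):
    """Yield (relation, object) pairs for the outgoing edges of an entity."""
    for subj, rel, obj in TRIPLES:
        if subj == entity:
            yield rel, obj

def find_path(start, goal, max_hops=3):
    """Breadth-first search; returns the shortest list of triples from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop budget
        for rel, obj in neighbours(node):
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, path + [(node, rel, obj)]))
    return None  # no path within max_hops

# The returned path is the "coherent multi-hop context" handed to the LLM.
path = find_path("compound_A", "organ_D")
```

The path keeps the relations between entities, so the model receives three linked statements rather than three unrelated chunks.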

Ontologies = semantic compression.

Ontology-grounded retrieval

Four reasons it wins.

Concept
Fixed identifiers replace synonyms and free text. The model resolves to a known term, not a guess.
Relation
Typed edges replace prose like "is associated with". Graph queries become possible.
Constraint
Units, ranges, required fields become executable. Validation is deterministic, not generative.
Provenance
Every term has a source and a definition. Every claim has a path back to evidence.
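
To make the Concept and Provenance points concrete, here is a small sketch that resolves free-text synonyms to one fixed identifier and keeps the term's source alongside it. The synonym table and the `resolve` helper are assumptions for illustration; a real system would load labels and synonyms from the ontology release itself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Term:
    curie: str   # fixed identifier
    label: str   # preferred label
    source: str  # provenance: where the term is defined

MOUSE = Term("NCBITaxon:10090", "Mus musculus", "NCBI Taxonomy")
RAT = Term("NCBITaxon:10116", "Rattus norvegicus", "NCBI Taxonomy")

# Hypothetical synonym table, hand-written for this example.
SYNONYMS = {
    "mouse": MOUSE,
    "mice": MOUSE,
    "mus musculus": MOUSE,
    "rat": RAT,
    "rattus norvegicus": RAT,
}

def resolve(text: str) -> Optional[Term]:
    """Map free text to a known term, or return None instead of guessing."""
    return SYNONYMS.get(text.strip().lower())

term = resolve("Mice")
```

Returning `None` for unknown input is the point of the design: an unresolved string is flagged for curation rather than silently passed to the model.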

Hybrid architecture

The pipeline. Five steps.

Schema

Required fields, types, units — machine-actionable.

Capture

Structured forms, importers, ontology-based suggestions.

Graph

Entities, relations, provenance packages.

Retrieval

Ontology-grounded, KG-guided. Minimal grounded context.

LLM

Synthesis only at this step. Last, not first.
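
As a sketch of what "machine-actionable" can mean at the Schema step, the snippet below checks required fields, types, units, and ranges with plain deterministic code and no model in the loop. The field names, units, and ranges in `SCHEMA` are made-up placeholders, not a recommended metadata set.

```python
# Hypothetical schema: every rule here is a placeholder for illustration.
SCHEMA = {
    "species": {"type": str, "required": True},
    "body_weight": {"type": float, "required": True, "unit": "g", "range": (5.0, 100.0)},
    "age_weeks": {"type": int, "required": False, "range": (0, 200)},
}

def validate(record):
    """Return a list of error strings; an empty list means the record is valid."""
    errors = []
    for name, rule in SCHEMA.items():
        if name not in record:
            if rule["required"]:
                errors.append(f"{name}: missing required field")
            continue
        entry = record[name]
        value = entry["value"]
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue  # skip unit and range checks on a wrongly typed value
        if "unit" in rule and entry.get("unit") != rule["unit"]:
            errors.append(f"{name}: expected unit {rule['unit']}")
        if "range" in rule:
            low, high = rule["range"]
            if not low <= value <= high:
                errors.append(f"{name}: {value} outside [{low}, {high}]")
    return errors

good = {
    "species": {"value": "NCBITaxon:10090"},
    "body_weight": {"value": 24.5, "unit": "g"},
}
bad = {"body_weight": {"value": 24.5, "unit": "kg"}}

good_errors = validate(good)
bad_errors = validate(bad)
```

The same record always produces the same errors, which is what lets validation run at capture time without human or LLM review.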

Reframe

Metadata is not paperwork.

It is machine-actionable infrastructure. FAIR requires rich, domain-specific, machine-readable templates — not narrative documentation.

Three patterns

Constraints reduce burden. They do not increase it.

Templates

CEDAR Embeddable Editor — author once, publish everywhere. Templates live inside the platform that needs them.

Packages

RO-Crate — research artefacts travel with JSON-LD metadata, identifiers, provenance, relations, annotations.

Recommendations

Ontology-based field suggestions accelerate authoring and improve accuracy at data-entry time.
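
For the Packages pattern, a minimal sketch of an `ro-crate-metadata.json` document built in Python, assuming the RO-Crate 1.1 layout (a metadata descriptor, a root dataset, and one file entity). The dataset name, description, date, licence, and file path are hypothetical placeholders.

```python
import json

crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {   # metadata descriptor: points at the root dataset
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # root dataset: all values below are placeholders
            "@id": "./",
            "@type": "Dataset",
            "name": "Example behavioural dataset",
            "description": "Placeholder description for illustration.",
            "datePublished": "2026-01-01",
            "license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
            "hasPart": [{"@id": "data/measurements.csv"}],
        },
        {   # one data file that travels with the crate
            "@id": "data/measurements.csv",
            "@type": "File",
            "name": "Raw measurements",
            "encodingFormat": "text/csv",
        },
    ],
}

def entity(crate, entity_id):
    """Look up an entity in the flat @graph by its @id."""
    for item in crate["@graph"]:
        if item["@id"] == entity_id:
            return item
    return None

metadata_json = json.dumps(crate, indent=2)
```

Because the graph is flat JSON-LD, every reference such as `hasPart` can be resolved by identifier, which is what makes the package queryable rather than narrative.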

Biomedical KGs · Today

This is not theoretical.

MedGraph

PubMed entities · MeSH terms · citations · grants · authors → semantic biomedical retrieval

PubMed KG 2.0

Papers · patents · clinical trials · biomedical entities · author networks · project metadata

Life-sciences KG ecosystem

Ontologies + heterogeneous biomedical data → AI-powered research substrate

Real-world data graphs

Graph data models for heterogeneous clinical and research data — new analyses become tractable

Inference economics

Token economy is energy economy.

Output length drives energy
Inference energy correlates with output token length and response time.
Reasoning depth has a cost
Emissions scale with model size and reasoning behaviour across 14 LLMs.
Generality is expensive
For many tasks, general-purpose generative AI can use orders of magnitude more energy than task-specific systems.
Inference compounds
Cumulative inference cost can become comparable to or exceed training cost.

Routing

Route the work. Don't generate it all.

Validators do validation
Graph queries do retrieval
Smaller models do mapping
LLMs do synthesis — only when generative reasoning is required
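
The four routing rules above can be sketched as a lookup table. The task kinds and the `route` function are hypothetical names chosen for this example; raising an error for unknown work instead of falling back to the LLM is a design assumption, not something the slides prescribe.

```python
from enum import Enum

class Route(Enum):
    VALIDATOR = "validator"
    GRAPH_QUERY = "graph_query"
    SMALL_MODEL = "small_model"
    LLM = "llm"

# Hypothetical task kinds mapped to the component that should handle them.
ROUTES = {
    "validate": Route.VALIDATOR,
    "retrieve": Route.GRAPH_QUERY,
    "map_terms": Route.SMALL_MODEL,
    "synthesise": Route.LLM,
}

def route(task_kind: str) -> Route:
    """Pick a handler; unknown task kinds raise rather than default to the LLM."""
    try:
        return ROUTES[task_kind]
    except KeyError:
        raise ValueError(f"no route defined for task kind {task_kind!r}") from None

chosen = route("validate")
```

Failing loudly keeps the LLM from becoming the implicit default, which is the habit the routing slide argues against.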

Energy-per-token as a benchmark

Energy-per-token should complement accuracy benchmarks. Model selection and reasoning depth become routing decisions, not defaults.
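
One way to read this as a routing rule: compute energy per token as measured energy divided by output tokens, then pick the candidate with the lowest expected energy among those that meet the accuracy bar. The sketch below assumes such measurements exist; every number and model name in it is an invented placeholder, not a benchmark result.

```python
from dataclasses import dataclass

def energy_per_token(total_joules: float, output_tokens: int) -> float:
    """Energy per token = measured energy / number of output tokens."""
    if output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    return total_joules / output_tokens

@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float          # fraction correct on the task benchmark
    joules_per_token: float  # from energy_per_token()
    expected_tokens: int     # reasoning depth shows up as longer outputs

    @property
    def expected_joules(self) -> float:
        return self.joules_per_token * self.expected_tokens

def pick(candidates, min_accuracy):
    """Cheapest candidate (by expected energy) that meets the accuracy bar."""
    eligible = [c for c in candidates if c.accuracy >= min_accuracy]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.expected_joules)

# Placeholder values for illustration only; not measurements.
CANDIDATES = [
    Candidate("small", 0.82, 0.5, 200),
    Candidate("large", 0.90, 2.0, 200),
    Candidate("large_reasoning", 0.95, 2.0, 2000),
]

choice = pick(CANDIDATES, min_accuracy=0.85)
```

With these placeholders, raising the accuracy bar is what moves the choice to a larger or deeper-reasoning model, so the expensive option is selected by requirement rather than by default.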

Calibration

What graphs do not solve.

Hallucination
Graphs and ontologies reduce risk by grounding retrieval and constraining valid relations. They do not eliminate hallucination on their own.
Cost
GraphRAG is not always cheaper than long-context LLMs. Graph construction, maintenance, and query design also have costs.
Long context
Long-context models are not always wrong. Sometimes the right routing decision is to use them.
Token prices
Unit prices fluctuate. What scales poorly is total tokens, inference calls, energy, latency, and review burden.

Human-in-the-loop

Place humans upstream and selectively.

Where humans add value

Define concepts and constraints. Validate ontology extensions. Approve high-impact KG changes. Resolve ambiguity at validation gates.

Where humans become a bottleneck

Manually correcting every output. Reviewing routine extractions. Acting as the only validator for deterministic checks. Re-typing what the schema already captures.

The right architecture is hybrid.

Reference pipeline

Five technical steps. One governance lane.

Schema — community-defined, machine-actionable templates
Capture — structured forms, importers, ontology-based suggestions
Graph — entities, relations, provenance packages
Retrieval — ontology-grounded, KG-guided, minimal context
Synthesis — LLM at the last step, routed by energy and accuracy
Review — humans at ontology, validation, and approval gates
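
The governance lane can be sketched as a gate in front of the graph: deterministic checks reject invalid changes, routine changes pass automatically, and only high-impact ones reach a person. The change kinds, the `IMPACT_THRESHOLD` value, and the `gate` function are hypothetical choices made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Change:
    kind: str            # e.g. "add_edge", "new_ontology_term"
    passes_checks: bool  # outcome of deterministic validation
    affected_nodes: int  # rough measure of impact on the graph

# Hypothetical policy values; a real deployment would set these deliberately.
HUMAN_KINDS = {"new_ontology_term", "delete_node"}
IMPACT_THRESHOLD = 50

def gate(change: Change) -> str:
    """Decide who handles a proposed knowledge-graph change."""
    if not change.passes_checks:
        return "reject"        # validators decide, no reviewer needed
    if change.kind in HUMAN_KINDS or change.affected_nodes >= IMPACT_THRESHOLD:
        return "human_review"  # ontology extensions and high-impact edits
    return "auto_approve"      # routine, already validated

decision = gate(Change("add_edge", True, 2))
```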

Preclinical · NAM evidence

What this means for the bench.

Schema-first
Minimal mandatory metadata set, anchored in ARRIVE 2.0. Enforced at source, not after the study.
Ontology-grounded
NCBITaxon, UBERON, OBI, ChEBI. Controlled vocabularies are how cross-lab comparison becomes possible.
Provenance-packaged
RO-Crate, FAIRSCAPE, BioCompute. Datasets travel with their context and computational history.
Hybrid synthesis
Schema-first agents bounded by deterministic validation. LLMs accelerate curation; they do not decide what is recorded.
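
A small sketch of enforcing ontology grounding at the bench: each coded field must hold an identifier from an allowed vocabulary, otherwise it is flagged before the record is saved. The field names and the `check_curie` helper are assumptions; `CHEBI` is used as the identifier prefix for ChEBI terms, and the pattern assumes the simple `PREFIX:digits` form.

```python
import re

# Vocabularies named on the slide, written as identifier prefixes.
ALLOWED_PREFIXES = {"NCBITaxon", "UBERON", "OBI", "CHEBI"}

# Assumes the simple PREFIX:digits form for identifiers.
CURIE = re.compile(r"^([A-Za-z]+):(\d+)$")

def check_curie(value: str) -> bool:
    """True if value looks like PREFIX:digits with an allowed prefix."""
    match = CURIE.match(value)
    return bool(match) and match.group(1) in ALLOWED_PREFIXES

# Hypothetical record; the field names are placeholders.
record = {
    "species": "NCBITaxon:10090",  # Mus musculus
    "tissue": "UBERON:0000955",    # brain
    "compound": "water",           # free text, should be flagged
}

ungrounded = [field for field, value in record.items() if not check_curie(value)]
```

This checks only the shape and prefix of an identifier; confirming that the term exists in the ontology would need a lookup against the ontology release.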

The takeaway

Use LLMs where they help. Use structure where they hurt.

Long context is not understanding
RAG without relationships is weak
Ontologies are semantic compression
Metadata is machine-actionable infrastructure
Token economy is energy economy
Humans go upstream — at gates, not on outputs

Structure first. Generate last.

Make the schema explicit.
Make the graph queryable.
Make the human review valuable.

Damien Huzard, PhD · Neuronautix
neuronautix.com/contact  ·  metadatapp.net