Metadata · FAIR · NAMs · 27 May 2026

Metadata as a NAM. From data context to virtual control groups.

Damien Huzard, PhD · Neuronautix · 33 min

The library is on the floor.

Act 1 — The problem

Data without structure is not knowledge

A scattered library contains the same information as a catalogued one — yet the knowledge is unreachable. Preclinical datasets face this problem at every scale, from a single lab to a multi-site consortium.

Accessibility requires structure

Even if the answer exists, it remains inaccessible without organisation

The data is there. The result was recorded. But without a way to find it, sort it, or compare it — it cannot be used. Structure is not presentation; it is access.

Metadata examples — a book

Four fields that make a book findable

Title: the name, which makes the item identifiable in a search
Author: provenance (who produced it, and when)
Publication date: temporal context (when it was created, which edition)
ISBN: persistent identifier (globally unique, machine-resolvable)

Animal data analogy

Animals on shelves. Grouped by what makes them comparable.

Unsorted cohort

Mixed strains, undocumented

Housing conditions unrecorded

Procedures undated

Operator identity missing

Metadata-sorted cohort

Strain + sex + age at treatment

Housing: cage type, density, rack position

Procedure: SOP version + date

Operator ID documented

Metadata makes evidence reusable

The fields that define comparability

Animal: species, strain, sex, age, weight, genotype
Cage: cage type, housing density, rack position, bedding
Assay: protocol name, SOP version, apparatus model, parameters
Operator: experimenter identity, training record, blinding status
Date: procedure date, time of day, light phase, season

Outcome + context = evidence.

Act 2 — The turn

Missing context makes interpretation fragile

A behavioural result without documented conditions (experimenter, protocol version, housing, time of year) cannot be replicated, compared, or included in a meta-analysis. The number is real. The evidence is not.

The same result — with and without provenance

Context + DOI = defensible evidence

Result without context

Latency to find platform: 42 s

Group difference: p = 0.03

n = 10 per group

No strain, no operator, no date

Result with provenance

C57BL/6J, male, 12 weeks — DOB documented

Morris water maze (MWM) v3.2, 25 °C water, 60 s ceiling

Experimenter A, blinded; tested at zeitgeber time ZT7–10

DOI: 10.xxxx/study.2026 · CC BY 4.0
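
A minimal sketch of how the result above could travel with its provenance as one machine-readable record. The class and field names (`Outcome`, `Provenance`, `Evidence`, `zeitgeber_window`, and so on) are illustrative assumptions, not a published schema; the values are the ones on the slide, and the DOI is kept as the placeholder shown there.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Outcome:
    measure: str
    value: float
    unit: str


@dataclass(frozen=True)
class Provenance:
    # Hypothetical field names chosen for this sketch.
    strain: str
    sex: str
    age_weeks: int
    protocol: str
    water_temp_c: float
    trial_ceiling_s: int
    experimenter: str
    blinded: bool
    zeitgeber_window: str
    doi: str
    license: str


@dataclass(frozen=True)
class Evidence:
    """Outcome plus context, kept together in a single record."""
    outcome: Outcome
    provenance: Provenance

    def to_json(self) -> str:
        # asdict() converts nested dataclasses recursively.
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


evidence = Evidence(
    outcome=Outcome(measure="latency_to_platform", value=42.0, unit="s"),
    provenance=Provenance(
        strain="C57BL/6J", sex="male", age_weeks=12,
        protocol="MWM v3.2", water_temp_c=25.0, trial_ceiling_s=60,
        experimenter="A", blinded=True, zeitgeber_window="ZT7–10",
        doi="10.xxxx/study.2026",  # placeholder, exactly as on the slide
        license="CC BY 4.0",
    ),
)
print(evidence.to_json())
```

Freezing the dataclasses is a design choice for this sketch: once recorded, the context cannot be silently edited after the fact.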

Data versus metadata

The object and its descriptive map

Data

The measurement: values, signals, waveforms

Raw sensor readings: 2.4 mg/mL, 142 BPM, 18 cm/s

Images, video frames, time-series arrays

Without context: a number with no address

Metadata

What the data is about — and under what conditions

Who collected it, how, when, on whom, with what

Strain, SOP, device, version, provenance, license

With metadata: evidence that travels and accumulates

Wilkinson et al. — Scientific Data 2016

The FAIR Guiding Principles

The four FAIR principles are now central to funding-body requirements, journal data policies, and reproducibility initiatives in preclinical research. Originally articulated for research data management in general, they have become the reference standard for any dataset intended to outlast its originating experiment.

doi:10.1038/sdata.2016.18

FAIR components

Four properties. One goal.

Findable: globally unique persistent identifier; machine-readable metadata; indexed in a searchable repository
Accessible: retrievable using a standard protocol; access conditions defined and documented
Interoperable: uses shared vocabularies and ontologies, enabling integration with other datasets
Reusable: clear provenance, a license, and sufficient documentation for independent reuse without contacting the original team
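
The four properties above can be turned into a rough self-check on a dataset description. This is only a sketch: the descriptor keys (`persistent_id`, `repository`, `access_protocol`, and so on) are assumptions made for illustration, and a presence check like this is far weaker than a formal FAIR assessment.

```python
# Hypothetical descriptor keys, grouped by the FAIR property they support.
FAIR_CHECKS = {
    "findable": ("persistent_id", "repository"),
    "accessible": ("access_protocol", "access_conditions"),
    "interoperable": ("vocabulary",),
    "reusable": ("provenance", "license"),
}


def fair_gaps(record: dict) -> dict:
    """Return, per property, the descriptor keys that are absent or empty."""
    return {
        prop: [key for key in keys if not record.get(key)]
        for prop, keys in FAIR_CHECKS.items()
    }


# Example values are invented for illustration only.
dataset = {
    "persistent_id": "10.xxxx/study.2026",  # placeholder DOI
    "repository": "",                       # not yet deposited
    "access_protocol": "https",
    "access_conditions": "open",
    "vocabulary": None,                     # no ontology terms assigned
    "provenance": "SOP v3.2, Experimenter A",
    "license": "CC BY 4.0",
}

for prop, missing in fair_gaps(dataset).items():
    print(prop, "ok" if not missing else f"missing: {missing}")
```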

The standard

FAIR. The framework for data that outlasts the experiment.

Act 3 — Evidence

Metadata must be planned before the experiment, not reconstructed after

Post-hoc curation fails. By the time a dataset is analysed, experimental context has been overwritten, forgotten, or lost. FAIR must be designed into the protocol — at the same time as the hypothesis.

Ambition levels

Reporting is the floor. Reuse is the ceiling.

Reporting — ARRIVE 2.0 minimum

Species, strain, sex, age, n per group

Housing and husbandry conditions

Procedure description and timeline

Sufficient to publish — insufficient to reuse

Reuse target — FAIR

Persistent identifiers + machine-readable schema

Controlled vocabularies and ontology terms

Full provenance chain, versioned protocols

License, DOI, deposited in searchable repository

Unusable data wastes animal lives.

Every experiment whose data cannot be reused required animals whose contribution is lost to science.

Introducing WellFAIR

From FAIR data to a WellFAIR research ecosystem

WellFAIR frames data stewardship as an ethical obligation, not a compliance requirement. Data welfare and animal welfare are the same problem — because data that cannot be reused means more animals must be used in the next study.

The ethical argument

Data welfare is animal welfare. FAIR principles overlap with the 3Rs.

The 3Rs

Replacement — avoid animal use where alternatives exist

Reduction — minimise animal numbers per study

Refinement — minimise suffering per animal

FAIR counterpart

Reusable historical data enables virtual replacement

Findable historical controls reduce repeat experiments

Interoperable evidence enables cross-study meta-analysis

Research article

WellFAIR — the paper

Petit-Demoulière & Huzard · Neuroscience Applied · 2026

"Data welfare is animal welfare: Building a WellFAIR research ecosystem." The paper argues that FAIR-by-design data stewardship is both an ethical obligation and a technically achievable standard for preclinical neuroscience.

WellFAIR workflow

Four stages that connect intention to reusable evidence

Planning: define the metadata schema before the first animal enters the study, not after the last data file is collected
Acquisition: capture context at source (instrument metadata, operator identity, timestamped procedures, cage conditions)
Federation: standardise across studies and sites using shared vocabularies and persistent identifiers
Outcomes: reusable evidence that is queryable, citable, eligible for virtual control groups (VCGs), AI-actionable, and defensible in regulatory review
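
The Planning and Acquisition stages above can be sketched as a schema that is written first and then enforced at data entry. Everything here is an assumption for illustration: the `SCHEMA` layout, the field names, and the short controlled vocabularies are hypothetical, not the WellFAIR or Metadatapp schema.

```python
# Planning: the schema exists before any record is captured.
# "allowed" is a controlled vocabulary; None means free text is accepted.
SCHEMA = {
    "strain": {"required": True, "allowed": {"C57BL/6J", "BALB/c"}},
    "sex": {"required": True, "allowed": {"male", "female"}},
    "sop_version": {"required": True, "allowed": None},
    "operator_id": {"required": True, "allowed": None},
    "bedding": {"required": False, "allowed": None},
}


def validate(record: dict, schema: dict = SCHEMA) -> list:
    """Acquisition: return a list of problems found in one captured record."""
    errors = []
    for name, rule in schema.items():
        value = record.get(name)
        if value is None:
            if rule["required"]:
                errors.append(f"{name}: missing")
            continue
        if rule["allowed"] is not None and value not in rule["allowed"]:
            errors.append(f"{name}: '{value}' not in controlled vocabulary")
    return errors


captured = {"strain": "C57BL/6J", "sex": "M", "sop_version": "v2.1"}
for problem in validate(captured):
    print(problem)
```

Rejecting `"M"` instead of silently mapping it to `"male"` is deliberate in this sketch: the vocabulary is fixed at planning time, so ambiguity is caught at the source and not during later curation.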

Reference — Home Cage Monitoring

Metadata requirements for HCM data reuse

Strain, sex, age, housing conditions, device model and firmware version, cohort ID, operator identity, cage position in rack, light cycle timing — the fields that enable comparability across studies and sites. Documented in the Home Cage Monitoring metadata literature and the Neuronautix knowledge base.

Metadata gaps → 3Rs impact

Missing metadata flows to wasted evidence

Common metadata gap → 3Rs consequence

Strain and genetic background undocumented → Replacement blocked: data not cross-comparable

Operator identity and blinding unrecorded → Reduction blocked: VCG eligibility fails matching

Cage position in rack missing → Refinement missed: rack confound uncontrolled

Housing density and bedding not logged → Refinement missed: welfare confound invisible

Device firmware version absent → Replacement blocked: reproducibility gap

Metadata is the infrastructure for NAMs.

Act 4 — What metadata protects

Method. Evidence. Animals.

Method

Versioned protocols and SOP records make the procedure reproducible — in the same lab, across sites, and years later.

Evidence

Provenance-tagged results travel with context — making them queryable, citable, and eligible for regulatory review.

Animals

Reusable evidence reduces repeat experiments. Every dataset that enables reuse means fewer animals in the next study.

Virtual control groups

Historical controls filtered by metadata into valid comparator pools

A virtual control group (VCG) is not simply historical data: it is historical data whose comparability has been verified through metadata completeness, contextual matching, and statistical correction. EMA (2023) positions VCGs as innovative new approach methodologies (NAMs) supporting the 3Rs.

Eligibility dimensions

Ten variables that define a valid comparator

Strain

Age at treatment

Sex

Room and rack position

Welfare indicators

Assay and protocol version

Diet and water source

Cage type and housing density

Time window — season, batch

Operator identity
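
A minimal sketch of how the ten variables above could gate entry into a comparator pool. The key names, the split between exact-match and recorded-only variables, and the one-week age tolerance are all assumptions made for illustration; real eligibility rules would be study-specific, and the statistical correction step is not covered here.

```python
# Hypothetical keys, one per eligibility variable listed above.
REQUIRED = (
    "strain", "age_weeks", "sex", "rack_position", "welfare_score",
    "protocol_version", "diet", "cage_type", "season", "operator_id",
)
# Assumption: these must equal the target study's values exactly.
EXACT_MATCH = ("strain", "sex", "protocol_version", "diet", "cage_type")


def is_eligible(control: dict, target: dict, max_age_gap_weeks: float = 1.0) -> bool:
    """True if a historical control is complete and matches the target study."""
    # Completeness: every eligibility variable must have been recorded.
    if any(control.get(key) is None for key in REQUIRED):
        return False
    # Contextual matching on categorical variables.
    if any(control[key] != target[key] for key in EXACT_MATCH):
        return False
    # Age is matched within a tolerance instead of exactly.
    return abs(control["age_weeks"] - target["age_weeks"]) <= max_age_gap_weeks


# Invented example records.
target = {"strain": "C57BL/6J", "sex": "male", "age_weeks": 12,
          "protocol_version": "v3.2", "diet": "standard chow", "cage_type": "IVC"}
base = {**target, "rack_position": "R3-B2", "welfare_score": 0,
        "season": "spring", "operator_id": "A"}
historical = [
    {"id": "H1", **base},
    {"id": "H2", **base, "operator_id": None},  # incomplete metadata
    {"id": "H3", **base, "strain": "BALB/c"},   # different strain
    {"id": "H4", **base, "age_weeks": 16},      # outside age tolerance
]

pool = [c["id"] for c in historical if is_eligible(c, target)]
print(pool)
```

Note that the incomplete record is rejected even though every value it does contain matches the target: an unrecorded variable cannot be shown to be comparable.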

The AI argument

AI cannot recover what was never captured

Language models and machine learning pipelines amplify available context; they cannot reconstruct experimental conditions that were never recorded. Missing metadata is an irreversible information loss. Garbage in, garbage out still applies.

DVC® — Tecniplast · Neuronautix

Continuous home-cage context from the cage hardware

The Digital Ventilated Cage integrates RFID tracking, load cells, and sensor arrays to capture locomotion, feeding, and individual presence data automatically, providing continuous, structured metadata alongside the behavioural record. Neuronautix is a listed scientific partner of Tecniplast.

A critical distinction

Raw signal must be contextualised into structured descriptors

Data — the raw signal

Sensor readings: load cell voltage, RFID pulses

Video frames, time-series arrays

Values without interpretation

Not reusable without the frame that defines them

Metadata — the structured context

Animal ID, strain, sex, age, cohort

Cage type, rack position, housing density

Device model, firmware, calibration date

Protocol SOP, experimenter, light phase

Close — Multi-site federation

Heterogeneous datasets, unified by metadata

Site A — CRO

C57BL/6J · Male · 10 weeks
DVC® rack 3 · SOP v2.1
Operator B · 2024-Q1

Site B — Academic lab

C57BL/6J · Male · 10 weeks
LMT system · SOP v2.1
Operator C · 2024-Q3

Site C — Pharma

C57BL/6J · Male · 10 weeks
DOME cage · SOP v2.1
Operator D · 2025-Q1

Shared metadata labels — strain, sex, age, SOP version — enable cross-site comparability and VCG pool construction despite heterogeneous instrumentation.
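
The three sites above can be pooled with a simple grouping on the shared labels. This is a sketch under stated assumptions: the dictionary keys and the `build_pools` helper are hypothetical, and the records only restate what is on the slide.

```python
from collections import defaultdict

# Labels shared by every site; instrumentation is deliberately excluded.
SHARED_KEYS = ("strain", "sex", "age_weeks", "sop_version")

records = [
    {"site": "A (CRO)", "strain": "C57BL/6J", "sex": "male", "age_weeks": 10,
     "device": "DVC rack 3", "sop_version": "v2.1", "operator": "B", "period": "2024-Q1"},
    {"site": "B (Academic lab)", "strain": "C57BL/6J", "sex": "male", "age_weeks": 10,
     "device": "LMT system", "sop_version": "v2.1", "operator": "C", "period": "2024-Q3"},
    {"site": "C (Pharma)", "strain": "C57BL/6J", "sex": "male", "age_weeks": 10,
     "device": "DOME cage", "sop_version": "v2.1", "operator": "D", "period": "2025-Q1"},
]


def build_pools(records: list, keys: tuple = SHARED_KEYS) -> dict:
    """Group records whose shared metadata labels are identical."""
    pools = defaultdict(list)
    for rec in records:
        pools[tuple(rec[k] for k in keys)].append(rec)
    return dict(pools)


for labels, members in build_pools(records).items():
    print(labels, "->", [m["site"] for m in members])
```

Device, operator, and period stay attached to each record, so they remain available as covariates even though they do not define the pool.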

Implementation — Metadatapp

A practical interface for structured metadata capture

Metadatapp provides the data-entry layer — schema-driven, controlled-vocabulary backed — that turns the WellFAIR workflow into daily practice for preclinical teams. Structured capture at source, from protocol design to file deposit.

Thanks.

Neuronautix · Tecniplast · Metadatapp · Anibio · eLabFTW · SoftMouse · LIRMM / Dr. Konstantin Todorov · Olden Labs · Metofico · Pistoia Alliance

Damien Huzard, PhD · Neuronautix · 27 May 2026
neuronautix.com/contact