Metadata · FAIR · NAMs · 27 May 2026
Damien Huzard, PhD · Neuronautix · 33 min
Act 1 — The problem
A scattered library contains the same information as a catalogued one — yet the knowledge is unreachable. Preclinical datasets face this problem at every scale, from a single lab to a multi-site consortium.
Accessibility requires structure
The data is there. The result was recorded. But without a way to find it, sort it, or compare it, it cannot be used. Structure is not presentation; it is access.
Metadata examples — a book
Animal data analogy
Without metadata:
Mixed strains, undocumented
Housing conditions unrecorded
Procedures undated
Operator identity missing
With metadata:
Strain + sex + age at treatment
Housing: cage type, density, rack position
Procedure: SOP version + date
Operator ID documented
Metadata makes evidence reusable
Act 2 — The turn
A behavioral result without documented conditions — experimenter, protocol version, housing, time of year — cannot be replicated, compared, or included in a meta-analysis. The number is real. The evidence is not.
The same result — with and without provenance
Without provenance:
Latency to find platform: 42 s
Group difference: p = 0.03
n = 10 per group
No strain, no operator, no date
With provenance:
C57BL/6J, male, 12 weeks, DOB documented
MWM v3.2, 25 °C water, 60 s ceiling
Experimenter A, blinded; ZT7–10
DOI: 10.xxxx/study.2026 · CC BY 4.0
Data versus metadata
Data:
The measurement: values, signals, waveforms
Raw sensor readings: 2.4 mg/mL, 142 BPM, 18 cm/s
Images, video frames, time-series arrays
Without context: a number with no address
Metadata:
What the data is about, and under what conditions
Who collected it, how, when, on whom, with what
Strain, SOP, device, version, provenance, license
With metadata: evidence that travels and accumulates
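The distinction can be sketched in code. This is a minimal illustration, not a schema from the talk: the class and field names are assumptions, and the example values are taken from the Morris water maze slide above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metadata:
    """Context that gives a value its 'address' (field names are illustrative)."""
    strain: str
    sex: str
    age_weeks: int
    protocol: str
    operator: str
    license: str


@dataclass(frozen=True)
class Measurement:
    """A recorded value, optionally carrying the metadata that describes it."""
    name: str
    value: float
    unit: str
    metadata: Optional[Metadata] = None

    def is_reusable(self) -> bool:
        # Hypothetical rule: a value counts as evidence only if context is attached.
        return self.metadata is not None


bare = Measurement("latency_to_platform", 42.0, "s")
documented = Measurement(
    "latency_to_platform", 42.0, "s",
    metadata=Metadata("C57BL/6J", "male", 12, "MWM v3.2", "Experimenter A", "CC BY 4.0"),
)

print(bare.is_reusable())        # False
print(documented.is_reusable())  # True
```

Both objects hold the same number; only the second can be compared with results from another lab.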
Wilkinson et al. — Scientific Data 2016
The four FAIR principles (Findable, Accessible, Interoperable, Reusable) are now central to funding body requirements, journal data policies, and reproducibility initiatives in preclinical research. Originally articulated for research data management broadly, they are now the standard for any dataset intended to outlast its originating experiment.
doi:10.1038/sdata.2016.18
FAIR components
The standard
Act 3 — Evidence
Post-hoc curation fails. By the time a dataset is analysed, experimental context has been overwritten, forgotten, or lost. FAIR must be designed into the protocol — at the same time as the hypothesis.
Ambition levels
Species, strain, sex, age, n per group
Housing and husbandry conditions
Procedure description and timeline
Sufficient to publish — insufficient to reuse
Persistent identifiers + machine-readable schema
Controlled vocabularies and ontology terms
Full provenance chain, versioned protocols
License, DOI, deposited in searchable repository
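The two ambition levels can be read as two nested sets of required fields. The sketch below assumes hypothetical field names and level labels ("publishable", "reusable"); neither comes from a published schema.

```python
# Fields for the first level (sufficient to publish). Names are illustrative.
MINIMAL_FIELDS = frozenset({
    "species", "strain", "sex", "age_weeks", "n_per_group",
    "housing", "procedure",
})

# The second level adds identifiers, vocabularies, provenance, and deposit info.
FAIR_FIELDS = MINIMAL_FIELDS | frozenset({
    "persistent_id", "schema", "ontology_terms",
    "provenance", "protocol_version", "license", "repository",
})


def ambition_level(record: dict) -> str:
    """Classify a metadata record by which set of required fields it fills."""
    present = {key for key, value in record.items() if value not in (None, "")}
    if FAIR_FIELDS <= present:
        return "reusable"
    if MINIMAL_FIELDS <= present:
        return "publishable"
    return "incomplete"


minimal_record = {
    "species": "mouse", "strain": "C57BL/6J", "sex": "male", "age_weeks": 12,
    "n_per_group": 10, "housing": "group housed", "procedure": "Morris water maze",
}
print(ambition_level(minimal_record))  # publishable
```

A check like this is only useful at data entry; run after the fact, it can report what is missing but cannot restore it.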
Every experiment whose data cannot be reused required animals whose contribution is now lost to science.
Introducing WellFAIR
WellFAIR frames data stewardship as an ethical obligation, not a compliance requirement. Data welfare and animal welfare are the same problem — because data that cannot be reused means more animals must be used in the next study.
The ethical argument
Replacement — avoid animal use where alternatives exist
Reduction — minimise animal numbers per study
Refinement — minimise suffering per animal
Reusable historical data enables virtual replacement
Findable historical controls reduce repeat experiments
Interoperable evidence enables cross-study meta-analysis
Research article
"Data welfare is animal welfare: Building a WellFAIR research ecosystem." The paper argues that FAIR-by-design data stewardship is both an ethical obligation and a technically achievable standard for preclinical neuroscience.
WellFAIR workflow
Reference — Home Cage Monitoring
Strain, sex, age, housing conditions, device model and firmware version, cohort ID, operator identity, cage position in rack, light cycle timing — the fields that enable comparability across studies and sites. Documented in the Home Cage Monitoring metadata literature and the Neuronautix knowledge base.
Metadata gaps → 3Rs impact
Strain and genetic background undocumented → Replacement blocked: data not cross-comparable
Operator identity and blinding unrecorded → Reduction blocked: virtual control group (VCG) eligibility fails matching
Cage position in rack missing → Refinement missed: rack confound uncontrolled
Housing density and bedding not logged → Refinement missed: welfare confound invisible
Device firmware version absent → Replacement blocked: reproducibility gap
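The gap-to-impact pairs above can be expressed as a lookup that flags which of the 3Rs a record currently blocks. This is a sketch: the dictionary keys are hypothetical field names, while the consequences are transcribed from the slide.

```python
# Missing field -> (affected R, consequence). Keys are illustrative field names.
GAP_IMPACT = {
    "strain": ("Replacement", "data not cross-comparable"),
    "operator": ("Reduction", "VCG eligibility fails matching"),
    "rack_position": ("Refinement", "rack confound uncontrolled"),
    "housing_density": ("Refinement", "welfare confound invisible"),
    "firmware_version": ("Replacement", "reproducibility gap"),
}


def blocked_impacts(record: dict) -> list:
    """Return (field, R, consequence) for every tracked field that is empty or absent."""
    return [
        (field, *GAP_IMPACT[field])
        for field in GAP_IMPACT
        if not record.get(field)
    ]


record = {
    "strain": "C57BL/6J",
    "operator": "B",
    "rack_position": None,          # recorded as unknown
    "housing_density": "4 per cage",
    # firmware_version was never captured
}

for field, r, consequence in blocked_impacts(record):
    print(f"{field}: {r} ({consequence})")
```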
Act 4 — What metadata protects
Versioned protocols and SOP records make the procedure reproducible — in the same lab, across sites, and years later.
Provenance-tagged results travel with context — making them queryable, citable, and eligible for regulatory review.
Reusable evidence reduces repeat experiments. Every dataset that enables reuse means fewer animals in the next study.
Virtual control groups
A virtual control group (VCG) is not simply historical data: it is historical data whose comparability has been verified through metadata completeness, contextual matching, and statistical correction. EMA (2023) positions VCGs as innovative new approach methodologies (NAMs) supporting the 3Rs.
Eligibility dimensions
Strain
Age at treatment
Sex
Room and rack position
Welfare indicators
Assay and protocol version
Diet and water source
Cage type and housing density
Time window — season, batch
Operator identity
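Contextual matching on these dimensions can be sketched as a simple filter. The field names, the subset of dimensions checked, and the one-week age tolerance are all assumptions made for illustration; a real VCG would also need the statistical correction step, which is not shown here.

```python
# Dimensions that must match exactly (a hypothetical subset of the list above).
EXACT_MATCH_FIELDS = ("strain", "sex", "assay", "protocol_version", "cage_type", "diet")
AGE_TOLERANCE_WEEKS = 1  # illustrative tolerance, not a recommended value


def is_eligible(candidate: dict, study: dict) -> bool:
    """True if a historical control matches the study on every checked dimension."""
    for field in EXACT_MATCH_FIELDS:
        value = candidate.get(field)
        # A missing value fails the match: completeness is a precondition.
        if value is None or value != study.get(field):
            return False
    cand_age, study_age = candidate.get("age_weeks"), study.get("age_weeks")
    if cand_age is None or study_age is None:
        return False
    return abs(cand_age - study_age) <= AGE_TOLERANCE_WEEKS


study = {
    "strain": "C57BL/6J", "sex": "male", "age_weeks": 10, "assay": "home cage activity",
    "protocol_version": "v2.1", "cage_type": "ventilated", "diet": "standard chow",
}
historical = dict(study, age_weeks=11)
undocumented = {k: v for k, v in study.items() if k != "diet"}

print(is_eligible(historical, study))    # True
print(is_eligible(undocumented, study))  # False
```

The second animal may well have been on the same diet; without the record, it cannot enter the pool.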
The AI argument
Language models and machine learning pipelines amplify the context they are given; they cannot recover experimental conditions that were never recorded. Missing metadata is an irreversible information loss. Garbage in, garbage out remains true.
DVC® — Tecniplast · Neuronautix
The Digital Ventilated Cage integrates RFID tracking, load cells, and sensor arrays to capture locomotion, feeding, and individual presence data automatically — providing continuous, structured metadata alongside the behavioral record. Neuronautix is a listed scientific partner of Tecniplast.
A critical distinction
Data:
Sensor readings: load cell voltage, RFID pulses
Video frames, time-series arrays
Values without interpretation
Not reusable without the frame that defines them
Metadata:
Animal ID, strain, sex, age, cohort
Cage type, rack position, housing density
Device model, firmware, calibration date
Protocol SOP, experimenter, light phase
Close — Multi-site federation
C57BL/6J · Male · 10 weeks · DVC® rack 3 · SOP v2.1 · Operator B · 2024-Q1
C57BL/6J · Male · 10 weeks · LMT system · SOP v2.1 · Operator C · 2024-Q3
C57BL/6J · Male · 10 weeks · DOME cage · SOP v2.1 · Operator D · 2025-Q1
Shared metadata labels — strain, sex, age, SOP version — enable cross-site comparability and VCG pool construction despite heterogeneous instrumentation.
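Pooling by shared labels can be sketched as a group-by over the four fields named above. The first three records mirror the site cards on this slide; the fourth record and all field names are hypothetical, added to show that a different SOP version lands in a separate pool.

```python
from collections import defaultdict

SHARED_LABELS = ("strain", "sex", "age_weeks", "sop_version")

records = [
    {"strain": "C57BL/6J", "sex": "male", "age_weeks": 10, "sop_version": "v2.1",
     "system": "DVC rack 3", "operator": "B", "period": "2024-Q1"},
    {"strain": "C57BL/6J", "sex": "male", "age_weeks": 10, "sop_version": "v2.1",
     "system": "LMT system", "operator": "C", "period": "2024-Q3"},
    {"strain": "C57BL/6J", "sex": "male", "age_weeks": 10, "sop_version": "v2.1",
     "system": "DOME cage", "operator": "D", "period": "2025-Q1"},
    # Hypothetical record with an older SOP, for contrast only.
    {"strain": "C57BL/6J", "sex": "male", "age_weeks": 10, "sop_version": "v1.0",
     "system": "DVC rack 1", "operator": "E", "period": "2023-Q2"},
]


def pool_by_shared_labels(rows: list) -> dict:
    """Group records whose shared labels are all present and identical."""
    pools = defaultdict(list)
    for row in rows:
        key = tuple(row.get(label) for label in SHARED_LABELS)
        if None in key:
            continue  # a record missing any shared label cannot join a pool
        pools[key].append(row)
    return dict(pools)


pools = pool_by_shared_labels(records)
for key, members in pools.items():
    print(key, [m["system"] for m in members])
```

Instrumentation differs within the first pool, which is the point: the shared labels, not the device, define comparability here. Whether device differences need correction is a separate statistical question.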
Implementation — Metadatapp
Metadatapp provides the data-entry layer — schema-driven, controlled-vocabulary backed — that turns the WellFAIR workflow into daily practice for preclinical teams. Structured capture at source, from protocol design to file deposit.
Neuronautix · Tecniplast · Metadatapp · Anibio · eLabFTW · SoftMouse · LIRMM / Dr. Konstantin Todorov · Olden Labs · Metofico · Pistoia Alliance
neuronautix.com/contact