FAIR metadata
FAIRSCAPE and BioCompute make NAM provenance reviewable
For NAM evidence, provenance is not just where the file came from. It is the chain from biological material and assay conditions to computation, endpoint, interpretation, and claim. FAIRSCAPE and BioCompute offer two practical patterns for making that chain reviewable.
NAM provenance is broader than file lineage
A NAM result often depends on biological material, device setup, protocol version, exposure timing, endpoint extraction, data transformation, and statistical or machine-learning workflow. If any part is missing, the result may be hard to reproduce or impossible to compare across sites. That is why provenance needs to be captured as a structured evidence chain rather than as a comment in a report.
This is especially important for AI-derived endpoints. A toxicity flag produced by an image model or transcriptomic classifier is only interpretable when the input data, model version, preprocessing, parameters, validation set, and uncertainty are visible.
FAIRSCAPE models the evidence graph
FAIRSCAPE extends FAIR principles to computational biomedical analytics by creating machine-interpretable provenance for datasets, software, computations, runtime parameters, environment, and personnel. Its evidence graph concept is useful for NAMs because it treats a result as something supported by a chain of data and computation rather than by an isolated output file.
For a NAM programme, that pattern maps well to questions reviewers and data scientists actually ask: Which raw data created this endpoint? Which script processed it? Which container or runtime was used? Which parameters changed? Which result supports which claim?
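The evidence-graph pattern can be sketched as linked records, where each result points back to the computation that generated it and each computation points to its inputs. The sketch below is illustrative only, not the FAIRSCAPE API; all identifiers, field names, and values are hypothetical.

```python
# Illustrative sketch (not the FAIRSCAPE API): an evidence graph as plain
# dictionaries, linking a raw dataset through a computation to an endpoint.
# All identifiers and field names here are hypothetical.

def make_node(node_id, node_type, **props):
    """Create a minimal provenance node with an id, a type, and properties."""
    return {"@id": node_id, "@type": node_type, **props}

raw_data = make_node("ark:/raw/plate-017", "Dataset",
                     description="Raw impedance traces, chip batch 17")
script = make_node("ark:/sw/extract-endpoint-v1.3", "Software", version="1.3")
computation = make_node("ark:/run/2024-06-02-endpoint", "Computation",
                        usedDataset=["ark:/raw/plate-017"],
                        usedSoftware=["ark:/sw/extract-endpoint-v1.3"],
                        parameters={"window_s": 300, "baseline": "pre-dose"})
endpoint = make_node("ark:/result/tc50-plate-017", "Dataset",
                     generatedBy="ark:/run/2024-06-02-endpoint")

evidence_graph = {n["@id"]: n for n in (raw_data, script, computation, endpoint)}

def trace(graph, result_id):
    """Walk back from a result to the computation and inputs that support it."""
    comp = graph[graph[result_id]["generatedBy"]]
    return {"computation": comp["@id"],
            "inputs": comp["usedDataset"],
            "software": comp["usedSoftware"]}
```

A reviewer's question such as "which raw data created this endpoint?" then becomes a graph traversal from the result node rather than a search through narrative documentation.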
BioCompute documents computational workflows
The BioCompute Object (BCO), standardized as IEEE 2791-2020, is a JSON-based format for documenting computational workflows with provenance, descriptive metadata, user attribution, inputs, outputs, execution details, and parameters. It was developed for bioinformatics and high-throughput sequencing contexts, but the structure is relevant to computational NAMs.
QSAR models, PBPK simulations, image-analysis pipelines, and omics-derived safety endpoints can all benefit from a BioCompute-like record. The aim is not to force every NAM into a genomics standard. The aim is to borrow a regulatory-facing pattern for making computational steps inspectable.
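To make the pattern concrete, here is a minimal BioCompute-style record for a hypothetical PBPK simulation. The domain names follow the IEEE 2791-2020 layout (provenance, usability, description, execution, io, parametric), but every value, script name, and step is illustrative, and this sketch is not a validated BCO.

```python
import json

# A minimal BioCompute-style record for a hypothetical NAM pipeline.
# Domain names follow the IEEE 2791-2020 layout, but all values are
# illustrative; this is a sketch, not a validated BCO.
bco_like = {
    "provenance_domain": {
        "name": "PBPK exposure simulation",  # hypothetical workflow name
        "version": "0.2.0",
        "contributors": [{"name": "A. Analyst", "contribution": ["authoredBy"]}],
    },
    "usability_domain": [
        "Estimates plasma Cmax for compound X under oral dosing"  # hypothetical
    ],
    "description_domain": {
        "pipeline_steps": [
            {"step_number": 1, "name": "parameterize",
             "description": "Load physiological parameters"},
            {"step_number": 2, "name": "simulate",
             "description": "Run ODE solver over the dosing window"},
        ]
    },
    "execution_domain": {
        "script": ["run_pbpk.py"],  # hypothetical script name
        "software_prerequisites": [{"name": "scipy", "version": "1.11"}],
    },
    "io_domain": {
        "input_subdomain": [{"uri": {"filename": "params.csv"}}],
        "output_subdomain": [{"uri": {"filename": "cmax.json"}}],
    },
    "parametric_domain": [
        {"param": "dose_mg_per_kg", "value": "10", "step": "2"}
    ],
}

record = json.dumps(bco_like, indent=2)  # serializable, shareable artifact
```

Because the record is plain JSON, it can be versioned alongside the pipeline code and inspected without running anything.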
A practical hybrid for NAMs
A NAM provenance package can use FAIRSCAPE-style evidence graphs at the evidence level and BioCompute-style records for computational steps. The experimental metadata should remain assay-specific: donor, cell type, chip architecture, flow, matrix, medium, exposure, controls, and endpoint definitions. The computational metadata should record software, model, parameters, data transformations, and execution context.
Together, these records make AI-ready NAM evidence less dependent on narrative reconstruction. The reviewer can inspect what happened; the data scientist can decide whether a dataset is trainable; the sponsor can reuse evidence without rediscovering its assumptions.
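The hybrid described above can be sketched as a single package that keeps assay-specific experimental metadata separate from the computational record while linking both to the claim they support. All field names and values below are hypothetical, chosen only to show the shape of such a package.

```python
# Sketch of a hybrid NAM provenance package: assay-specific experimental
# metadata alongside a BioCompute-style computational record, linked to
# the claim they support. All names and values are hypothetical.

experimental = {
    "donor": "anonymized-D42",
    "cell_type": "iPSC-derived cardiomyocytes",
    "chip": {"architecture": "two-channel", "flow_ul_min": 30},
    "exposure": {"compound": "compound-X", "conc_uM": [0.1, 1, 10], "hours": 24},
    "controls": ["vehicle", "positive:doxorubicin"],
    "endpoint": "beat-rate change vs vehicle",
}

computational = {
    "software": {"name": "beat-analysis", "version": "2.1"},  # hypothetical tool
    "parameters": {"smoothing_window_s": 5},
    "execution": {"container": "beat-analysis:2.1", "run_id": "run-0091"},
}

package = {
    "experimental_metadata": experimental,
    "computational_record": computational,
    "supports_claim": "No adverse beat-rate effect below 10 uM",
}
```

Keeping the two layers distinct lets a reviewer audit the assay conditions and a data scientist audit the computation without either having to untangle the other's metadata.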
Sources and further reading
- FAIRSCAPE: a Framework for FAIR and Reproducible Biomedical Analytics — Niestroy et al., 2022. FAIRSCAPE evidence graphs and machine-interpretable provenance.
- FAIRSCAPE — FAIRSCAPE project. AI-readiness, semantic provenance graphs, biomedical Datasheets, Croissant metadata, validation, API, and GUI.
- BioCompute Object Documentation — BioCompute / IEEE 2791-2020. Workflow documentation with provenance and metadata.
- About BioCompute — BioCompute. JSON domains for provenance, usability, description, execution, inputs/outputs, and parameters.
- Croissant Format Specification — MLCommons. Metadata format for machine-learning datasets, including provenance and responsible AI documentation.
Work with Neuronautix
Make NAM provenance audit-ready
Neuronautix helps teams map computational and experimental provenance into structured records that support review, reuse, and AI-readiness.