
fair.md: a one-file, honest FAIR self-declaration for any repository

8 June 2026 Damien Huzard, PhD

FAIR is widely endorsed — but there is no conventional, readable place where a project can state its own FAIR posture in five seconds, honestly, with gaps declared. A single Markdown file placed at the repository root, combining a YAML self-assessment with human-readable prose, can close that gap for any project in under an hour.

The gap: widely endorsed, unevenly declared

The FAIR Guiding Principles (Findable, Accessible, Interoperable, Reusable) were articulated in 2016 precisely because scientific data had become too fragmented and inconsistently described to support reliable reuse [1]. A decade on, FAIR has become a standard phrase in funding calls and data management plans, yet concrete, inspectable self-declarations of FAIR posture at the level of individual repositories remain rare: where one exists it is typically buried in a PDF data management plan, and otherwise it is absent entirely [1].

The operationally relevant failure is not a dispute about the principles. It is the absence of a conventional, zero-friction place where a project can say: here is exactly how FAIR this resource is right now, and here is what is still missing. Without that, FAIR endorsement remains a policy gesture rather than a verifiable baseline.

What fair.md is

fair.md borrows its ergonomics from llms.txt [6] — Jeremy Howard's 2024 proposal for placing a well-known Markdown file at the root of a site so machines can find context without scraping HTML. The idea is the same: one file, root level, readable by a human in under a minute and parseable by a crawler without any tooling.

The file is Markdown with a YAML front-matter block. The YAML carries the machine-readable payload; the prose section beneath it is for people. Required keys cover what the repository is (title, description), who maintains it (maintainers, with ORCID iDs), how it is licensed (license, with content and code declared separately), what data resources it contains (data_resources), and which controlled vocabularies it references (vocabularies). The structural heart is fair_assessment.

fair_assessment contains four sub-blocks — findable, accessible, interoperable, reusable — each with one key per canonical FAIR sub-principle (F1–F4, A1–A2, I1–I3, R1–R1.3). Every key takes one of five status values: yes, partial, planned, no, or n/a. The key names are prefixed with the canonical sub-principle identifier so that automated FAIR assessment tools can map them directly [1].
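As a rough illustration of the structure described above, the following Python sketch splits a fair.md into front matter and prose, then checks that every fair_assessment entry uses one of the five status values. The sample file, the key suffixes after each sub-principle prefix (for example F1_identifiers), and the small indentation-based reader are assumptions made for this example. The reader only understands nested `key: value` lines and is not a YAML parser.

```python
ALLOWED_STATUSES = {"yes", "partial", "planned", "no", "n/a"}
BLOCKS = ("findable", "accessible", "interoperable", "reusable")

# Hypothetical, abbreviated fair.md; key suffixes are illustrative only.
SAMPLE = """\
---
title: Example repository
fair_assessment:
  findable:
    F1_identifiers: partial
    F2_rich_metadata: yes
  accessible:
    A1_retrievable: yes
  interoperable:
    I1_knowledge_representation: planned
  reusable:
    R1.1_license: maybe
---
Human-readable notes go here.
"""


def split_front_matter(text):
    """Return (front_matter, body) for text that starts with a '---' fence."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no front-matter block found")
    end = lines.index("---", 1)  # position of the closing fence
    return "\n".join(lines[1:end]), "\n".join(lines[end + 1:])


def read_nested_mapping(front_matter):
    """Read indented 'key: value' lines into nested dicts (simplified subset)."""
    root = {}
    stack = [(-1, root)]  # (indent, mapping) pairs for the currently open blocks
    for raw in front_matter.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue
        indent = len(raw) - len(raw.lstrip())
        key, _, value = raw.strip().partition(":")
        while stack[-1][0] >= indent:  # close blocks at the same or deeper indent
            stack.pop()
        parent = stack[-1][1]
        if value.strip():
            parent[key] = value.strip()
        else:
            parent[key] = {}
            stack.append((indent, parent[key]))
    return root


def invalid_statuses(meta):
    """List (block, key, value) triples whose value is outside the status enum."""
    assessment = meta.get("fair_assessment", {})
    problems = []
    for block in BLOCKS:
        for key, value in assessment.get(block, {}).items():
            if value not in ALLOWED_STATUSES:
                problems.append((block, key, value))
    return problems


front, body = split_front_matter(SAMPLE)
meta = read_nested_mapping(front)
print(invalid_statuses(meta))  # [('reusable', 'R1.1_license', 'maybe')]
```

A real consumer would more likely use a YAML library. One thing to check in that case: loaders that follow YAML 1.1, such as PyYAML, read unquoted yes and no as booleans, so the status values may need quoting in the file or mapping back to strings in the consumer.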

Honesty as a feature: why partial and planned are the point

The most important design decision in fair.md is the status enum. A format that only permitted yes and no would push maintainers toward either inflated claims or non-adoption. partial and planned exist so that a truthful, improvable baseline is always preferable to a silent one [1].

This reflects the spirit of the FAIR principles themselves: they are intentionally framed as aspirational and incremental, not as a binary pass/fail certification. A repository that is genuinely partial on interoperability and says so is more useful to science than one that claims full compliance without the accompanying evidence [1].

Because the status values are a controlled enum and the sub-principle keys are named after canonical identifiers, a fair.md can be consumed and compared across repositories by automated tools — the same scripts that today scrape PDF data management plans could instead parse a well-formed YAML block [1].
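To make the comparison idea concrete, here is a minimal Python sketch that lines up two repositories' assessments by sub-principle prefix. The ranking of status values (no, planned, partial, yes), the flat dictionaries, and the key suffixes are assumptions for illustration; the text above only establishes that the enum is controlled and that keys start with the canonical identifier.

```python
# Assumed ordering for comparison; "n/a" is treated as not comparable.
STATUS_RANK = {"no": 0, "planned": 1, "partial": 2, "yes": 3}


def principle_id(key):
    """Extract the sub-principle prefix, e.g. 'F1' from 'F1_identifiers'."""
    return key.split("_", 1)[0]


def compare(assessment_a, assessment_b):
    """Map each shared sub-principle to 'a', 'b', or 'tie'."""
    a = {principle_id(k): v for k, v in assessment_a.items()}
    b = {principle_id(k): v for k, v in assessment_b.items()}
    result = {}
    for pid in sorted(a.keys() & b.keys()):
        if a[pid] not in STATUS_RANK or b[pid] not in STATUS_RANK:
            continue  # skip n/a or unrecognised values
        diff = STATUS_RANK[a[pid]] - STATUS_RANK[b[pid]]
        result[pid] = "tie" if diff == 0 else ("a" if diff > 0 else "b")
    return result


# Hypothetical flattened assessments from two repositories.
repo_a = {"F1_identifiers": "partial", "I1_representation": "planned", "R1.1_license": "yes"}
repo_b = {"F1_pids": "yes", "I1_formal_language": "planned", "R1.1_licence": "n/a"}
print(compare(repo_a, repo_b))  # {'F1': 'b', 'I1': 'tie'}
```

Because matching is done on the prefix alone, the two repositories can be compared even though their key suffixes differ.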

A front door, not a replacement

The ecosystem already has excellent heavy-weight companions. codemeta.json [2] provides rich machine-readable software metadata, including versioning, authorship, and dependencies. CITATION.cff [3] covers citation metadata in a format that GitHub, Zenodo, and citation managers can consume directly. Both do their job well, but they are verbose and rarely read by a human browsing a repository.

RO-Crate [4] and FAIR Signposting [5] operate at a higher level still: robust, specification-grade packaging and HTTP-level navigation for FAIR Digital Objects, aimed at institutional repositories and infrastructure providers. They are the right choice when you need a fully packaged FAIR Digital Object with all provenance, relations, and typing formalised.

fair.md sits in front of all of these: it is cheap to write, honest about gaps, and points to the companions via a companions block. A visitor to a repository — human or crawler — can open /fair.md, read the self-assessment in thirty seconds, and then follow the pointers to whatever depth of machine-readable metadata they need. The companions keys make the ecosystem legible without duplicating it [1][2][3][4][5].
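Following the pointers could look something like the Python sketch below, which turns the entries of a companions block into absolute URLs. Only the codemeta key is named in this note (see [2]); the other key names, the paths, and the convention that a missing companion is recorded as an empty value are assumptions for illustration.

```python
from urllib.parse import urljoin


def companion_urls(fair_md_url, companions):
    """Resolve declared companion paths against the URL of the fair.md file."""
    return {
        name: urljoin(fair_md_url, path)
        for name, path in companions.items()
        if path  # skip companions declared as missing (None or empty)
    }


# Hypothetical companions block, already parsed into a dict.
companions = {
    "codemeta": "/codemeta.json",
    "citation_cff": "CITATION.cff",
    "ro_crate": None,
}
print(companion_urls("https://example.org/fair.md", companions))
```

Resolving against the fair.md URL means relative and root-relative paths both work, so the file can stay short.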

DESIGN.md [7] — adopted by google-labs-code, among others — shows the same pattern applied to design documentation: one file, root level, no tooling required to read it. fair.md follows the same ergonomic principle, extending it to FAIR provenance.

The worked example: this site's own fair.md

The reference implementation lives at https://neuronautix.com/fair.md. It covers three data resources: the source-backed Markdown knowledge base (/knowledge/), 19 cited analytical notes (/notes/), and self-contained HTML presentations with source maps (/presentations/). The maintainer is recorded with an ORCID iD. Identifiers include the repository, homepage, and canonical URL. There is no DOI yet, and that is declared explicitly as null rather than omitted [1].

The self-assessment is honest about the full spread of the site's FAIR posture. On the Findable axis, F2 (rich metadata), F3 (metadata references data identifiers), and F4 (indexed via sitemap and robots.txt) are all yes. F1 (globally unique persistent identifiers) is partial: every page has a canonical HTTPS URL, but no DOIs have been minted yet. On the Accessible axis, A1 (HTTPS retrieval with no authentication) is straightforwardly yes; A2 (metadata persistence beyond data) is partial because git history provides some continuity but no formal tombstoning exists [1].

The current gaps are concentrated in Interoperability and Reusability. I1 (formal knowledge representation) is partial: content is published as HTML now, and JSON-LD embedding is planned. I2 (FAIR vocabularies) is partial: vocabularies such as schema.org and NAMO are referenced in content but not yet machine-embedded as structured data. I3 (qualified references) is yes via inline [n] citations throughout. On the Reusability axis, R1.1 (license) is yes, now declared as Apache-2.0; R1.2 (provenance) is yes via /trust.md; R1.3 (domain community standards) is partial [1].

This is the format working as intended. The partial and planned entries are not failures — they are a roadmap. Anyone reading /fair.md for this site can see exactly where the next improvements should come from: minting a DOI, embedding JSON-LD, adding CITATION.cff. The self-assessment compounds in usefulness over time because it is reviewable and versionable [1].
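The roadmap reading can be sketched in a few lines of Python. The statuses below are transcribed from the description in this section, not from the live file. Labelling the HTTPS entry as A1 and the license entry as R1.1 is this sketch's reading of the prose, and the bare identifiers stand in for the real key names, which are not reproduced here.

```python
# Statuses as described in the prose; bare IDs stand in for the real keys.
SITE_ASSESSMENT = {
    "findable": {"F1": "partial", "F2": "yes", "F3": "yes", "F4": "yes"},
    "accessible": {"A1": "yes", "A2": "partial"},  # A1 label is an assumption
    "interoperable": {"I1": "partial", "I2": "partial", "I3": "yes"},
    "reusable": {"R1.1": "yes", "R1.2": "yes", "R1.3": "partial"},  # R1.1 label assumed
}

OPEN_STATUSES = {"partial", "planned", "no"}


def roadmap(assessment):
    """Return the sub-principles that still have work outstanding, in file order."""
    return [
        pid
        for block in assessment.values()
        for pid, status in block.items()
        if status in OPEN_STATUSES
    ]


print(roadmap(SITE_ASSESSMENT))  # ['F1', 'A2', 'I1', 'I2', 'R1.3']
```

Running the same function on successive versions of the file would show the list shrinking as improvements land.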

How to adopt fair.md

Adoption is deliberately low-friction. The full workflow takes five steps:

1. Copy the reference file (or the template in the spec repository) to the root of your repository as fair.md.
2. Replace the YAML with your project's values. Fill in fair_assessment honestly: use partial and planned where they apply. The only wrong answer is an inflated yes [1].
3. Serve it at https://yourdomain/fair.md. Optionally redirect /.well-known/fair.md to /fair.md for programmatic discoverability by FAIR assessment crawlers [1].
4. Add the companions you have. CITATION.cff [3] is the highest-value first step: it takes an hour of effort, and GitHub renders it as a "Cite this repository" widget. codemeta.json [2] covers software metadata in detail. RO-Crate [4] fits when you need a packaged FAIR Digital Object, and FAIR Signposting [5] fits when you control server headers and want HTTP-level navigation [2][3][4][5].
5. Pair it with trust.md if your repository publishes knowledge, analysis, or AI-assisted content. fair.md answers: can you find and reuse this resource? trust.md answers: how much should you trust what it says? Together they provide both axes of provenance. A dedicated note on trust.md is forthcoming on 2026-06-10.
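From the consumer's side, the two locations named in the serving step can be derived from any URL on the site. The Python sketch below builds that list of candidates without touching the network. The function name and the order in which a crawler would try the two paths are assumptions, not part of the convention as described here.

```python
from urllib.parse import urlsplit, urlunsplit

# The two locations named in the steps above; the probing order is an assumption.
CANDIDATE_PATHS = ("/fair.md", "/.well-known/fair.md")


def candidate_urls(any_url):
    """Build the fair.md URLs a crawler could try for the site hosting any_url."""
    parts = urlsplit(any_url)
    return [
        urlunsplit((parts.scheme, parts.netloc, path, "", ""))
        for path in CANDIDATE_PATHS
    ]


print(candidate_urls("https://example.org/docs/page.html?x=1"))
# ['https://example.org/fair.md', 'https://example.org/.well-known/fair.md']
```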

Review the file periodically. The last_reviewed field carries an ISO date — a stale date is itself a signal. The convention is designed to be updated in minutes, not hours: the YAML block changes, the prose updates, and a commit records the version history [1].
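A consumer could turn the stale-date signal into a simple check, as in the Python sketch below. The 365-day threshold and the function name are assumptions; the note itself only says that last_reviewed holds an ISO date.

```python
from datetime import date


def is_stale(last_reviewed, today, max_age_days=365):
    """Return True if the ISO-formatted review date is older than max_age_days."""
    reviewed = date.fromisoformat(last_reviewed)
    return (today - reviewed).days > max_age_days


print(is_stale("2026-06-08", date(2026, 12, 1)))  # False
print(is_stale("2026-06-08", date(2027, 6, 9)))   # True
```

Passing today in explicitly keeps the check deterministic; a caller would normally supply date.today().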

Where fair.md fits in open science infrastructure

If fair.md achieved even a fraction of the adoption that llms.txt has, it would give FAIR assessment tools a structured, standard entry point that they currently lack across the long tail of research repositories that never reach a major data repository [1][6].

The structured YAML makes aggregation tractable. A harvest of /fair.md files across a community — a field, a funder's portfolio, a GitHub organisation — could produce a FAIR profile of that community without any manual annotation, because the status enum is controlled and the keys are standardised [1].
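A minimal sketch of such a harvest, in Python, assuming the fair.md files have already been fetched and flattened into one dictionary per repository. The sample data and key suffixes are invented for illustration; only the prefix convention and the status enum come from the text.

```python
from collections import Counter


def community_profile(assessments):
    """Count status values per sub-principle across many repositories."""
    profile = {}
    for assessment in assessments:
        for key, status in assessment.items():
            pid = key.split("_", 1)[0]  # 'F1_identifiers' -> 'F1'
            profile.setdefault(pid, Counter())[status] += 1
    return profile


# Hypothetical harvest of three repositories.
harvested = [
    {"F1_identifiers": "yes", "I1_representation": "partial"},
    {"F1_identifiers": "partial", "I1_representation": "partial"},
    {"F1_identifiers": "yes"},
]
profile = community_profile(harvested)
print(profile["F1"]["yes"], profile["F1"]["partial"], profile["I1"]["partial"])  # 2 1 2
```

The counts per sub-principle are the community profile; nothing in the loop needs manual annotation because every value comes from the same small enum.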

Even without any tooling or aggregation, the baseline value of fair.md is that it makes a declaration exist. A researcher or funder visiting a repository can open one file and see, in plain language, what the project claims about its own FAIR posture and where the gaps are. That is a low bar, and most repositories currently do not clear it [1].

References

[1] The FAIR Guiding Principles for scientific data management and stewardship — Wilkinson MD et al. Scientific Data. 2016. Founding statement of the Findable, Accessible, Interoperable, Reusable framework; the sub-principle identifiers F1–R1.3 that fair.md maps to directly.
[2] The CodeMeta Project. Machine-readable software metadata standard; companion to fair.md for software repositories, pointed to via the companions.codemeta field.
[3] Citation File Format (CITATION.cff). Machine-readable citation metadata; recommended as the highest-value first companion step after fair.md adoption.
[4] RO-Crate. Specification for packaging and describing FAIR Digital Objects; the full-weight companion to fair.md for packaged datasets and research objects.
[5] FAIR Signposting Profile. HTTP-level navigation for FAIR resources using typed link headers; the infrastructure-tier complement to the fair.md front door.
[6] llms.txt — Jeremy Howard, 2024. Convention for placing a well-known Markdown file at a site root for LLM consumption; the ergonomic model that fair.md directly inherits.
[7] DESIGN.md — google-labs-code. Convention for root-level design documentation in Markdown; an ergonomic precedent for placing structured declarations at the repository root.
