Three priorities to deliver a cohesive and interoperable data infrastructure

Gavin Starks
Oct 26, 2020

1. Design for search — the foundation for discovery and access

Data must be usable by machines, not just humans. Policies must mandate that data be machine-readable so that it can be collected and used efficiently.

Equally important is the ability to discover that the data exists, what it is, where it comes from, and how it may be used. This ‘metadata’ must be made available as a priority so that data can be found and information about it accessed. Policies must mandate the production of metadata that aids discovery.
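As a rough illustration of what a machine-readable metadata record might look like, the Python sketch below captures the four questions above (that the data exists, what it is, where it is from, how it may be used) and serialises them as JSON. The field names, the example values and the `matches` helper are assumptions made for illustration; they are not drawn from any particular metadata standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DatasetMetadata:
    # All field names are illustrative assumptions, not a published standard.
    title: str        # what the data is
    description: str  # a longer, searchable summary
    publisher: str    # where the data is from
    license: str      # how the data may be used
    access_url: str   # where the data can be requested or downloaded


def to_machine_readable(record: DatasetMetadata) -> str:
    """Serialise the record as JSON so that software can index and search it."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def matches(record: DatasetMetadata, term: str) -> bool:
    """Naive discovery check: does the term appear in the title or description?"""
    haystack = f"{record.title} {record.description}".lower()
    return term.lower() in haystack


# A made-up record, used only to exercise the sketch.
example = DatasetMetadata(
    title="Regional electricity demand",
    description="Half-hourly demand readings, aggregated by region.",
    publisher="Example Energy Agency",
    license="CC-BY-4.0",
    access_url="https://example.org/datasets/demand",
)
print(to_machine_readable(example))
```

The record says nothing about the internal structure of the dataset itself, so it can sit alongside whatever domain-specific schema the publisher already uses.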

This first priority is independent of the specifics of any taxonomy, ontology or other structural design. Such designs are numerous and domain-specific.

2. Address data licensing policies — the foundation of access and usage

Licensing determines how data may be used. To unlock the value of Priority 1, policies must mandate that metadata is published under an open license. This is essential for large-scale, many-to-many discovery of what data exists.

Policies should mandate the publishing of any non-sensitive data under an open license, mirroring the open-by-default policies of many countries.

Policies should mandate the publishing of sensitive data under a Shared Data infrastructure framework.
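One possible reading of these three licensing rules can be written as a small decision function. This is only a sketch: the `Route` enum and the two boolean flags are hypothetical, and how an organisation decides that data is sensitive is outside its scope.

```python
from enum import Enum


class Route(Enum):
    OPEN = "open license"
    SHARED = "shared data infrastructure framework"


def publishing_route(is_metadata: bool, is_sensitive: bool) -> Route:
    """Return the publishing route that the three rules above imply.

    Both flags are hypothetical inputs supplied by the publisher.
    """
    if is_metadata:
        # Metadata is published openly so the data can be discovered.
        return Route.OPEN
    if is_sensitive:
        # Sensitive data is shared under a governed framework instead.
        return Route.SHARED
    # Non-sensitive data is open by default.
    return Route.OPEN
```

Under this reading, the metadata for a sensitive dataset is still published openly, so the dataset remains discoverable even though access to the data itself is governed.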

3. Address data governance — the foundation of open markets

Data increases in value the more it is connected. A focus on systemic cohesion and interoperability reduces the burden of sharing by establishing common rules and frameworks that embody good data governance.

Good data governance ensures that data is used appropriately and for its intended purposes, addressing questions of security, liability and redress.

Related links

FAIR principles: https://www.nature.com/articles/sdata201618
