What is Shared Data?
(updated based on reader feedback on 2020–06–09; copy here)
A central area of focus for Icebreaker One is to increase access to useful Shared data. What does this mean?
We would all like to ensure there is a fair value exchange (‘reciprocity’) around data usage: clear rules that make sharing secure, that protect people, companies and countries from potential harms, while not stopping potential benefits.
Open data can be used by anyone for any purpose for free
(e.g. Creative Commons, Open Government Licence)
Shared data is data with a preemptive licence for a specific use
(e.g. ‘data as a service’ that can be used with certain restrictions)
Closed data requires a user-specific custom licence/contract for use
(e.g. ‘bilateral contract’ for a specific project; employment contract)
How can we better frame data-sharing around user needs?
When thinking about data-sharing, rather than saying just ‘open up all the data’, we recommend starting with the question ‘how am I allowed to use it?’. This links us to the ‘so what’ question: what problems are we trying to solve?
The phrase ‘open data’ can mean many different things to different people. We have spent many years as a community, as well as through organisations such as the Open Data Institute (of which I was CEO), the Open Knowledge Foundation (of which I was a non-executive director) and others, trying to ensure we all had one definition:
Open Data can be used by anyone for any purpose for free
Examples of open data include public data such as the human genome, a bus timetable or any of the 55,000 data sets here (you may be surprised that it took quite a bit of effort to get bus timetables to be open in the UK). We also defined that personal data is not open data — for many reasons — and that data is now covered by regulations such as GDPR.
Everyone thinks their data is valuable — it is. But how we measure and exchange value is something we need to explore.
We decide that certain data funded by the taxpayer should be open because, as taxpayers, we’ve already paid for it. That is the ‘value exchange’.
Reciprocity = fair value exchange
But value comes in many forms. Here, I’m going to talk about it in terms of reciprocity: do we feel as if we’ve had a fair value exchange? If I share some kind of data with you so that you can provide something back, we may choose to exchange it without a cash payment. Value, however, is still exchanged.
This is important: from an economic perspective, we can talk about moving the costs of certain value exchanges to what’s called ‘marginal cost’ (a cost we absorb as part of our usual business operations).
If there is reciprocity then we feel value flowing in both directions. Even if it’s not a direct 1:1 exchange, we can see the broader benefit to the market in which we are operating, and that can benefit our business through operational efficiency, risk management or opportunity generation.
Open banking is an example of market-wide reciprocity
Open Banking is a perfect example of this. Firstly, it mandates that banks publish their product information as Open data. This makes it easier to find and analyse products that might fit our needs. The value exchange (reciprocity) is that by making it easier for people to find products that suit their needs, banks will get a better fit of customers-to-products which can increase the likelihood of having a happy customer. This is a win-win.
Open Banking also mandates that personal financial data (e.g. bank statements) can be transferred between banks by the customer without a financial cost. But this data is not open and it’s not ‘free’. Firstly, it is either personal or commercially sensitive data, so it cannot be open. Secondly, it is not free as there is a material cost to provide that scale of data management.
However, all the banks have agreed to this because (aside from it being regulated) there is a mutual benefit. The market as a whole benefits and the costs all ‘balance themselves out’: there is reciprocity. Further, while some banks used to feel that holding on to their customers’ data was paramount, it’s not the point of customer value on which they should be competing. Moreover, with GDPR, the data is controlled by the end customer.
The rules that govern this data exchange are encoded into the Open Banking Standard. It covers everything from the rights surrounding the data to the liability transfer as data flows. It is a commercially-focussed framework that allows data-sharing. These rules are now both common and shared across the whole market. They effectively define the terms of sharing in advance.
If we frame this as a ‘licence’ (a set of rules) to share data then we can define Shared data as follows:
Shared data is preemptively licensed.
After this, what’s left? Either data that you don’t want to share outside of a specific group (e.g. people contracted to work for your company), or that is only shared using bilateral contracts, where each contract needs to be unique.
Closed Data requires a user-specific custom licence/contract for use
For example, a bilateral contract for a specific project, or access to information enabled via an employment contract.
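The three definitions above can be read as a simple decision rule. The sketch below is only an illustration of that rule: the names `DataClass` and `classify` and the four yes/no questions are assumptions made for this example, not part of any published standard.

```python
from enum import Enum


class DataClass(Enum):
    OPEN = "open"
    SHARED = "shared"
    CLOSED = "closed"


def classify(anyone: bool, any_purpose: bool, free: bool,
             licensed_in_advance: bool) -> DataClass:
    """Place a dataset on the spectrum using the article's three definitions.

    All parameters are hypothetical yes/no answers about the dataset's licence.
    """
    # Open: usable by anyone, for any purpose, for free.
    if anyone and any_purpose and free:
        return DataClass.OPEN
    # Shared: the rules for a specific use are published in advance.
    if licensed_in_advance:
        return DataClass.SHARED
    # Closed: each user needs a custom licence or contract.
    return DataClass.CLOSED


# A bus timetable under an open licence.
print(classify(True, True, True, True))
# Bank statements transferred under a market-wide standard.
print(classify(False, False, True, True))
# Internal records available only via an employment contract.
print(classify(False, False, False, False))
```

Note that price alone does not decide the category: in this sketch, data that is free to transfer but restricted to a specific use still counts as Shared.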
Data increases in usage & value the more it is connected
For example, Creative Commons defined a step-change in thinking. It enabled us all to say “it’s okay to use this image for free” in advance. As of May 2018, there were an estimated 1.4 billion works licensed using a CC licence.
So with Shared Data, if stakeholders publish their data descriptions and their licensing options per type of use (aka ‘preemptive licensing’), then other stakeholders can just access the data, in compliance with the respective licensing requirements. This can enable people to create different types of value exchange, including granular payment structures for different types of use.
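As a rough sketch of what ‘licensing options per type of use’ could look like in practice, the example below publishes a small table of options and lets a prospective user look up the terms and price for their intended use. Everything in it is hypothetical: the `LicenceOption` fields, the uses, the prices and the conditions are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenceOption:
    use: str                     # type of use the licence covers
    price_pence_per_record: int  # 0 means no cash payment
    conditions: str              # plain-language summary of the rules


# Hypothetical options a data publisher might declare in advance.
PUBLISHED_OPTIONS = {
    "research": LicenceOption("research", 0, "attribution required"),
    "commercial": LicenceOption("commercial", 2, "no onward resale"),
}


def quote_pence(use: str, records: int) -> int:
    """Return the cost in pence of accessing `records` records for `use`."""
    option = PUBLISHED_OPTIONS.get(use)
    if option is None:
        # No pre-agreed licence exists for this use, so the parties
        # would need to negotiate a bilateral contract instead.
        raise LookupError(f"no published licence for use: {use!r}")
    return option.price_pence_per_record * records


print(quote_pence("research", 500))
print(quote_pence("commercial", 500))
```

The point of the design is that nobody has to negotiate: the terms exist before the request arrives, and a use that is not listed falls back to the Closed case.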
We must also have clear Open Data descriptions of the Shared Data and how it might be used (how it is licensed).
Publishing open data that describes the shared data will enable search engines (and therefore you) to find it. If the licensing is clear, then the friction between discovery and usage is reduced.
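One way to publish such an open description is as machine-readable metadata. The sketch below builds a small JSON-LD record using the schema.org `Dataset` type, which dataset search tools can read. The choice of schema.org is an assumption made for this example rather than something the approach requires, and the function name, dataset details and example.org URLs are placeholders.

```python
import json


def describe_shared_dataset(name: str, description: str, licence_url: str,
                            landing_page: str, free: bool = False) -> str:
    """Return an open, machine-readable description of a shared dataset.

    Only the description is open; the data itself stays behind its licence.
    """
    descriptor = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "license": licence_url,        # where the pre-agreed rules are published
        "url": landing_page,           # where to request access
        "isAccessibleForFree": free,
    }
    return json.dumps(descriptor, indent=2)


print(describe_shared_dataset(
    name="Half-hourly meter readings (hypothetical)",
    description="Aggregated readings, available for research use.",
    licence_url="https://example.org/licences/research-v1",
    landing_page="https://example.org/data/meter-readings",
))
```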
Doing so will increase the size of the observable dataverse and help to unlock innovation, while protecting the interests of individuals, organisations and countries, so that the data can be used for both public and private good.
An interesting example is the UK Open Banking Standard, which preemptively defines and mandates ways to share personal and business data — it is now regulated with every UK high street bank engaged and over 300 fintech companies in its ecosystem.
You can read more about the evolution of the Data Spectrum here.
I write more about the implications of this approach below