Connected by data: Why this? Why now?

Jeni Tennison

Welcome to this new initiative: Connected by data!

It almost goes without saying that data and AI are not working for us at the moment. The past few decades have seen an explosion in the availability of data, driven both by the web and the ability to put sensors in everything. But we are still, as a society, getting to grips with what this means for us, and how to manage its impact so it helps rather than harms us.

I believe we need to put community at the centre of our data narratives, practices and policies to address the problems we’re seeing.

Why is community important?

There are several reasons why we need to focus on communities when thinking about data.

Firstly, it’s what modern data processing does. In the Facebook / Cambridge Analytica scandal, data about so many people was captured because their friends and family members (and Facebook!) permitted it. One person completing the quiz enabled access to their community’s data.

Some of the links between us are obvious, such as our friendships and follows on social media, or the DNA we share with our family. There are less obvious ones too, such as the way other people’s film preferences influence what Netflix recommends you watch, other people’s purchases change what you see on Amazon, or other people’s Googling influences your search results. Our data is processed together, so we need to make decisions about it together.

Secondly, there’s a moral imperative. The negative impacts of data processing fall most heavily on those who are already marginalised in our societies. They might experience it in their ability to participate in games, as racial slurs in search results, or as higher insurance prices.

Many times, these biases are missed because it’s impossible for any one person experiencing them to detect them – they only see one price or one search result. So it is essential to understand impact at the scale of community, and vital that the members of those communities get to have a say about them.

Finally, we have tried managing data use through individual choices. It has not worked and is not likely to. There is a narrative that blames the privacy paradox – the gap between our stated preferences and our continued use of privacy-invading services – on low data literacy, lack of information, or poor tooling. But these decisions are hard for other reasons: modern data processing is inherently complex; we often have no real choice about what services we use; and companies manipulate choice architectures to create the results that work for them.

Requiring individuals to be responsible for decisions about what data we share and how it is used is like requiring us to be responsible for not eating things that are going to poison us. There is certainly some room for personal choices – am I going to lick the bowl even though the cookie dough contains raw eggs? – but there is a larger system of regulation that saves us from eating plastic in our vegetable balls. We need similar systems of regulation at the societal level for data, that recognise the limits of individual consent.

What needs to be done?

The need to develop collective approaches to data governance is well recognised in the academic and data governance community, but isn’t part of the popular narrative about data, the common practice of organisations, nor embedded in our legal frameworks.

On the contrary, the current response to issues with data is to double down on individualised approaches. In the wake of the Facebook / Cambridge Analytica scandal, for example, will.i.am said “we need to own our data” and the Financial Times said “privacy rights require data ownership”.

These approaches lead to the diversion of effort and investment into the creation of personal data stores. They lead to services offering data monetisation, which only reinforce privacy as a luxury rather than a human right that cannot be sold away. They continue to diffuse the power we could have collectively, by making us act individually.

I think we’re lacking three things:

  1. Compelling narratives. “You own your data” is intuitively attractive and superficially empowering. Who can (or would want to) argue against organisations providing “transparency and control”? We need to find compelling ways to make the case for more collective approaches.

  2. Practical techniques. There is no one-size-fits-all approach for making collective decisions about data. We need a range of methods from equipping local and national governments to make decisions about data through representative democracy; through large-scale deliberative approaches such as citizen juries; to survey-style approaches that are more suited to cash-strapped SMEs. There is lots of experience to draw on, and, I suspect, a few new approaches that need to be tested, to create practical guides for organisations to adopt.

  3. Legal obligations. The weakness of a lot of participative approaches for data governance is that they lack teeth: an organisation may hold a citizen jury to understand what’s important to their community, but they are under no compulsion to adopt the results, especially if those results go against their other drivers. We need both changes to the law and effective regulatory institutions to encourage organisations to use these methods, and ensure that the results are meaningfully adopted.

So these are the things that Connected by data will be working on. This first year we’re planning to focus on getting the narrative right, engaging with the development of the UK’s new data protection legislation so that it encourages meaningful collective approaches, and building out the initiative so it can have a larger impact.

If you are interested in any of this, please do get in touch by email or on Discord, or just follow us on Twitter.