Effective Data Governance

Jeni Tennison

Our work at CONNECTED BY DATA is about improving data governance, so that organisations are more effective at making decisions about data towards justice, equity and sustainability. We believe this means that data governance needs to be collective, democratic, participatory, deliberative and powerful. This post unpicks what these terms mean in practice.

What is data governance?

Data governance is the process of making decisions about data.

Metaphors like “data is the new oil” or “data is like water” gloss over the fact that data isn’t a natural phenomenon. We design data. We make decisions about what data gets collected, how it’s modelled, how it gets used, who it gets shared with, when it gets deleted and so on.

There are two aspects of data governance that we need to consider:

  • The outcomes of data governance decisions – the way data is collected, used and shared – can be harmful or beneficial. They can lead to over-surveillance and privacy violations; expose people to risks of fraud; increase existing inequalities or create new ones; and draw unsustainably on natural resources. Data can also be used to discover better treatments; target help to those who need it; fight climate change; power global collaborations; support children to learn; and invent new things to make the world better.
  • The process of data governance can also be positive or negative. It can create a sense of collective purpose; build understanding and data literacy; and strengthen trusting relationships. Or it can leave communities feeling disenfranchised; fuel rumour and disinformation; undermine trust in institutions; and embed information and power asymmetries.

In practice, both the outcomes and process of data governance are often a mixed bag. The algorithms and AI that result from these decisions provide some benefits and cause some harms, and these fall differently on different people, communities and organisations. Different people and groups feel represented or included differently at different stages. This is where the values and priorities of those making data governance decisions are important. Who benefits, and where harms fall, matters. So does who is included in the data governance process, when and how.

We at CONNECTED BY DATA aim for social justice: a more just, equitable and sustainable world. So we favour outcomes and processes that benefit (and limit risks and harm to) marginalised communities, in preference to those that benefit (and limit risks and harm to) powerful governments and companies. But it’s also possible for everyone to benefit from data; for example, improved understanding and trust may increase voluntary data sharing, create new possibilities for research and innovation in the public good, and accelerate the development and adoption of new technologies that have wider benefits.

What does good data governance look like?

As we’ve explored both literature and practice around data governance, we’ve identified a number of mutually supporting characteristics that we consider central to good data governance.

Good data governance is collective, by which we mean that it recognises and considers collective interests in data, not just those of individual data subjects or of the organisations that hold and use data. This is important because modern data processing means that data about other people affects decisions about us (and vice versa), and because data has an impact on things we benefit from collectively such as a strong, equitable, healthy society; innovative economy; and diverse and sustainable environment. Data governance that factors in collective impacts is more likely both to avoid risks and harms and to achieve wider public benefits.

Good data governance is democratic, by which we mean that it is ultimately under the control of the people and communities that will be affected by it, rather than organisations that are data holders and users. This is important because otherwise data governance can suffer from a lack of legitimacy, where people question whether decisions made about data collection and use are really being made in their collective interests.

Good data governance is participatory, by which we mean that the people and communities who are affected by data processing are directly involved in the data governance process. There are many models for democratic decision making, including having elected representatives to make decisions for us, or having democratic institutions whose operation is held in check through transparency and accountability structures, through to more direct forms of democracy like referendums, and emerging practice rooted in mini-publics and sortition. Participatory data governance gets affected communities directly involved in decision making. You can have low levels of participation (e.g. by communities being given a small opportunity to object) and high levels of participation (e.g. by communities co-designing the outcome). We’d argue that more participation is generally better, but we also recognise it’s resource intensive and seldom perfect. Including those who are marginalised – the voices of those who wouldn’t otherwise be heard – is one of the most important aspects of meaningful participation.

Good data governance is deliberative, by which we mean that decisions are made based on thoughtful consideration of evidence and differing points of view, rather than being made based on gut reactions and identity allegiances. This is important because data and its impacts are complex and technical, needing space and time for consideration. Frequently the outcomes of data governance need to be nuanced and caveated, something that can be hard to achieve outside deliberative processes. As with participation, deliberation is important but resource intensive, so it’s necessary to match the scope and scale of deliberation with the importance and impact of the kinds of decisions that need to be made.

Finally, good data governance is powerful, by which we mean that it actually makes a difference to the way in which data gets collected, used, and shared. This might go without saying, but there have been many participatory and deliberative exercises around data whose conclusions were never adopted by data holders, and data advisory boards whose objections to particular uses of data were overruled. Good data governance is only ever going to be effective if it has teeth. That means having transparency about the decision making process and the decisions that result from it; accountability through active third-party scrutiny, whether that’s by civil society or by regulators; routes of appeal that enable those not intimately involved in the process to question it; and routes for redress when the data governance process goes wrong and causes harm, so there are real consequences for data holders and users.

To summarise, better data governance needs to take into account our collective interests, not just individual ones; be democratic and give power to the people who are affected by it; be as participatory as possible, so that those people are actually involved in the process; be deliberative so that proper thought is put into complex decisions; and be powerful, so that it actually makes a difference to how data gets collected, used and shared.

It’s only through improving data governance processes in these ways that organisations can build understanding and trust, and ensure that data governance is oriented towards bringing about a more just, equitable and sustainable world.