Connected Conversation: Collective data rights: do we need them and what should they look like?

Helena Hollis

On 27th September we held an online conversation about collective data rights, at which we shared the results of an analysis we commissioned from AWO of three concrete scenarios (in policing, surge pricing, and online content moderation) in which people harmed by automated decision making are not data subjects. After sharing this analysis, we had a wider discussion on collective versus individual rights, the scenarios raised, and broader issues of data governance.

The context

Jeni started the session off by providing some context for the commissioned work. In the UK context, one of the shortcomings of the proposed replacement for GDPR (DPIB) is that it assumes automated decision making is about data subjects, who are defined as identifiable individuals to whom personal data relates, and does not account for ways automated decision making can affect us without using such personally identifiable data at all. This prompted Jeni to think about what community or collective rights are needed, and how to show concretely where current and proposed legislation covers them or not.

To highlight how current individualised approaches to data rights, which centre on personal data, may be limited, three scenarios in which only non-personal data is used were developed:

  • Policing: The police could use a data-driven tool that takes fully anonymised statistics on past arrests and uses these to ‘predict’ where future offences may be most prevalent, in order to inform how officers will use their powers.

  • Transport pricing: railway companies could introduce ‘demand-led pricing’ systems that change ticket prices to make quiet routes and times cheaper, and busier ones more expensive. This would require only data about available routes and anonymised statistics on capacity.

  • Content moderation: A social media platform could use an AI tool to block potentially harmful content, which could be trained on previous content removal cases. A human moderator would only deal with appeals by posters who feel their content was unfairly blocked.

None of these cases involves personal data use, but there are potential harms that would occur at a collective, community or societal level. Communities who have in the past had higher rates of arrest in their neighbourhood due to bias by police officers could see an amplification in police using ‘stop and search’ powers on the assumption that more crime is likely to take place there. Commuters who have no choice over the time of day or route they take to work, and especially those who depend on public transport as their only means of travel, could face exorbitant fares. Social media users who are the target audience for legitimate but possibly controversial content, such as LGBTQ+ community members, could find it hard to find relevant, helpful and supportive content. In all of these cases, we could point to individuals who experience harm, but we can also see that the harm can be collectively distributed. We can also see how these data-driven forms of algorithmic decision making will shift our society overall in ways we may not be comfortable with.

The analysis

Alex from AWO gave a brief summary of the analysis his team conducted utilising these scenarios to interrogate where protections are available through existing legislation.

Firstly, he noted that while GDPR is engaged in more situations than we might assume (i.e. anywhere data subjects and their personal data are in play), even where GDPR is engaged, it is not straightforward to challenge algorithmic harms (as AWO’s related work with the Ada Lovelace Institute demonstrates).

Furthermore, when we think about algorithmic decision making using non-personal data, we tend to be looking at harms that are more contested, and involve more trade-offs, than in other areas (e.g. the policing scenario). Even in these scenarios, there are currently limited legal avenues for constraining harms.

Alex reviewed three legal avenues for addressing the kinds of harms caused by the algorithmic decision making in the scenarios:

  • Equality law is the most common approach here, but it is hobbled by a weak regulator, and it is generally unrealistic for an average person to bring claims under the Equality Act.

  • Public law is relevant here in theory, but it is in early development for dealing with these kinds of issues. It is currently also very difficult, risky, and expensive to bring public law claims.

  • Consumer and competition law offers the best level of protection. Here there is strong regulation and a regulator with strong powers. In fact, this law provides collective recourse, since an individual can bring proceedings on behalf of other people (who can choose to opt out).

Overall, a lack of transparency poses a significant challenge for GDPR, or any other legislation, that seeks to prevent the kinds of harms these scenarios illustrate. All avenues for challenge rely on us knowing what is going on, and being able to evidence how such algorithms are operating.

Out of the three scenarios, Alex noted that the worst position is in the case of online speech, and indeed the Online Safety Act may make these harms worse rather than better by encouraging platforms to err on the side of removing contentious content.

The discussion

Jeni kicked off the discussion with three prompts:

  • The kinds of decisions described in the scenarios get made anyway - what is it about automation that makes community rights necessary? Is there anything different about automated versus bureaucratic decision making?
  • Community consent has been advocated for, such as in the indigenous data rights literature. But community consent is difficult: how do you tell what the community as a whole wants? And under GDPR, individual consent is not always required in any case. Is community consent a helpful approach?
  • How should we approach collective rights when it comes to personal data? Are the rights we would want to hold collectively the same as those we would want individually?

International context

It was noted that while the analysis centred on UK and EU law, we are operating in an international context, and in international law many solutions have been proposed over the years but none seems to have been successfully implemented.

Furthermore, the concern was raised that regulation can be (and has been) utilised as a form of empire building, with the major global powers (US, EU, China) determining how data-driven technologies are developed across the globe.

The individual alongside the collective

While there was consensus that individual libertarian models of data governance don’t work, and that we need to explore alternative horizontal approaches, there was also a concern that collective data rights should not be a replacement for individual rights. Any proposed model of collective data rights ought to supplement an individual rights baseline, so that the baseline is sustained and collective rights add further protection. To this end, it was asked whether we can identify where we lack individual protections, and whether we can then ‘surgically’ apply additional rights to fill those gaps.

We discussed the relational effects of algorithmic decision making, referencing Salome Viljoen’s paper on data relationality. The erosion of social cohesion and trust, the erosion of institutions, shifts in the concentration of power, and impacts on equality were all discussed as societal impacts, in which we do not merely have individual interests. We touched upon the invisible social relations in data, and how such data can be used to affect society in a relational way. This led to a sentiment that while individual rights are essential, we need to find ways to put more emphasis on relationships between people in society in our approaches to data.

Issues with collectives and communities

We discussed a range of issues with defining and identifying communities or collectives, as well as issues in deciding which communities’ rights take precedence over others:

  • Who gets to define the collective, and how?
  • Who represents the collective? What does it mean to have representation? Who gets to speak for a community?
  • Collective rights can be seen as exclusionary by design, implying an in-group and an out-group. We can see this played out in human rights law, which has struggled to avoid giving rights to particular groups over others.

It was also noted that we may not ourselves know that we are part of a group, or self-identify in that way. There may therefore be an important area for further exploration into mechanisms for recognising how groups exist: self-identified communities versus communities defined by a collective impact we can trace.

The conversation also included ideas on how to practically apply our concerns within legal contexts. It was agreed that collective data rights offer some principles that need to be translated into legal frameworks. A further need for conversation on political theory, and on the institutional design questions that arise from collective interests, was also identified.

We also discussed whether we need specific updates to current legislation, or whether current protections are enough to work within. Here, it was noted that compelling cases for collective data rights can be found where existing apparatus for solving problems already exists (e.g. tribal communities in the USA already have mechanisms for identifying community; community interests are already acknowledged and can now be translated into the digital realm).

However, scope for preemptive action before harm takes place was identified as a gap in current legislation. A mechanism is needed for requiring that institutions pause and review before implementing systems they don’t fully understand.

Furthermore, greater emphasis on accountability and transparency points, which would enable collective forms of deliberation in tandem with impact assessments, would have important benefits. And let’s not forget an empowered regulator.

For future work, some broadening of the issue at hand was advocated. It may be that the problems we are seeking to address are not merely about data but about broader digital infrastructure. It was also suggested that reframing the focus from collective data rights towards collective codetermination of the use of technology in context could be helpful.

Is it data or decisions?

One of Jeni’s prompt questions proved especially thorny: is there anything different about algorithmic decision making compared with decision making by humans?

It was noted that the three scenarios entail ways of making decisions that would be problematic regardless of the technology. These scenarios raise issues beyond data (e.g. institutional racism, what do we want from our transport infrastructure, etc.). It is important to keep sight of that, as we ask whether automation and use of data in some way accelerates or changes the quality of the kinds of harms we are talking about.

There are no easy answers here. On the one hand, participants in our conversation advocated for separate legislation on algorithmic decision making that goes beyond the nature of the data used. On the other hand, it was clear in our discussion that solely automated decision making isn’t the only issue, and ‘human in the loop’ can also be problematic.

The next steps

Following this thought-provoking conversation, the report we’re producing discussing AWO’s analysis will be updated. This report will also form the basis of future work, including what we at Connected by Data advocate for in future iterations of the DPIB and AI legislation in the UK.

Connected by Data will also host a further connected conversation that picks up some of the themes raised here.
