We are connected by data – our data is about other people too

Modern data processing puts us in groups – other people’s data affects us

In the digital economy, data isn’t collected solely because of what it reveals about us as individuals. Rather, data is valuable primarily because of how it can be aggregated and processed to reveal things (and inform actions) about groups of people. Datafication, in other words, is a social process, not a personal one. Further, it is a process that operates through a set of relationships. Only by highlighting the collective and relational character of datafication can we understand how it works, and the particular injustices that it produces. This isn’t just a theoretical exercise—it goes to the heart of what’s wrong with our digital world, and what may make it right.

Safety in Numbers? Group Privacy and Big Data Analytics in the Developing World

This chapter argues that group privacy is a necessary element of a global perspective on privacy. Addressing the problem as a new epistemological phenomenon generated by big data analytics, it addresses three main questions: first, is this a privacy or a data protection problem, and what does this say about the way it may be addressed? Second, by resolving the problem of individual identifiability, do we resolve that of groups? And last, is a solution to this problem transferrable, or do different places need different approaches? Focusing on cases drawn mainly from low- and middle-income countries, this chapter uses the issues of human mobility, disease tracking and drone data to demonstrate the tendency of big data to flow across categories and uses, its long half-life as it is shared and reused, and how these characteristics pose particular problems with regard to analysis on the aggregate level.

Algorithms and AI use invisible similarities to make decisions about us

Recommendation systems collect customer data and automatically analyze it to generate customized recommendations for those customers. These systems rely on both implicit data, such as browsing history and purchases, and explicit data, such as ratings provided by the user.
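To make the mechanism concrete, here is a minimal sketch of one common approach, user-based collaborative filtering. It is an illustration under stated assumptions, not how any particular product works: the users, items, scores, and the rule for turning page views into a score are all hypothetical.

```python
from math import sqrt

# Hypothetical toy data. Explicit data: star ratings (1-5) given by the user.
explicit_ratings = {
    "ana": {"kettle": 5, "teapot": 4},
    "ben": {"kettle": 5, "teapot": 5, "tea": 4},
    "cat": {"drill": 5, "saw": 4},
}
# Implicit data: how many times each user viewed an item's page.
implicit_views = {
    "ana": {"mug": 3},
    "ben": {"mug": 2},
    "cat": {"gloves": 4},
}


def preference_vector(user):
    """Merge explicit ratings and implicit views into one score per item."""
    scores = dict(explicit_ratings.get(user, {}))
    for item, views in implicit_views.get(user, {}).items():
        # Assumption: one point per view, capped at 3, used only when
        # the user has not rated the item explicitly.
        scores.setdefault(item, min(views, 3))
    return scores


def cosine(a, b):
    """Cosine similarity between two sparse preference vectors."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, top_n=3):
    """Suggest items the user has no score for, weighted by similar users."""
    target = preference_vector(user)
    totals = {}
    for other in explicit_ratings.keys() | implicit_views.keys():
        if other == user:
            continue
        other_vector = preference_vector(other)
        similarity = cosine(target, other_vector)
        if similarity <= 0:
            continue  # users with nothing in common contribute nothing
        for item, score in other_vector.items():
            if item not in target:
                totals[item] = totals.get(item, 0.0) + similarity * score
    return sorted(totals, key=totals.get, reverse=True)[:top_n]


print(recommend("ana"))
```

In this toy data, Ana has never looked at or rated tea. Any suggestion she receives is derived entirely from the behaviour of another user whose tastes overlap with hers, which is the sense in which other people's data shapes what a system decides about us.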

Data governance law—the legal regime that regulates how data about people is collected, processed, and used—is a subject of lively theorizing and several proposed legislative reforms. Different theories advance different legal interests in information. Some seek to reassert individual control for data subjects over the terms of their datafication, while others aim to maximize data subject financial gain. But these proposals share a common conceptual flaw. Put simply, they miss the point of data production in a digital economy: to put people into population-based relations with one another. This relational aspect of data production drives much of the social value as well as the social harm of data production and use in a digital economy.

In response, this Article advances a theoretical account of data as social relations, constituted by both legal and technical systems. It shows how data relations result in supra-individual legal interests, and properly representing and adjudicating among these interests necessitates far more public and collective (i.e., democratic) forms of governing data production. This theoretical account offers two notable insights for data governance law. First, this account better reflects the realities of how and why data production produces economic value as well as social harm in a digital economy. The data collection practices of the most powerful technology companies are primarily aimed at deriving population-level insights from data subjects for population-level applicability, not individual-level insights specific to a data subject. The value derived from this activity drives data collection in the digital economy and results in some of the most pressing forms of social informational harm. Individualist data subject rights cannot represent, let alone address, these population-level effects. Second, this account offers an alternative (and it argues, more precise) normative argument for what makes datafication—the transformation of information about people into a commodity—wrongful. What makes datafication wrong is not (only) that it erodes the capacity for subject self-formation, but also that it materializes unjust social relations: data relations that enact or amplify social inequality. This egalitarian normative account indexes many of the most pressing forms of social informational harm that animate criticism of data extraction yet fall outside typical accounts of informational harm. This account also offers a positive theory for socially beneficial data production. 
To address the inegalitarian harms of datafication—and develop socially beneficial alternatives—will require democratizing data social relations: moving from individual data subject rights, to more democratic institutions of data governance.

Part One describes the stakes and the status quo of data governance. It documents the significance of data processing for the digital economy. It then evaluates how the predominant legal regimes that govern data collection and use — contract and privacy law — code data as an individual medium. This conceptualization is referred to throughout the Article as “data as individual medium” (DIM). DIM regimes apprehend data’s capacity to cause individual harm as the legally relevant feature of datafication; from this theory of harm follows the tendency of DIM regimes to subject data to private individual ordering. Part Two presents the core argument of the Article regarding the incentives and implications of data social relations within the data political economy. Data’s capacity to transmit social and relational meaning renders data production especially capable of benefitting and harming others beyond the data subject from whom data is collected. It also results in population-level interests in data production that are not reducible to the individual interests that generally feature in data governance. Part Three evaluates two prominent legal reform proposals that have emerged in response to concerns over datafication. Propertarian proposals respond to growing wealth inequality in the data economy by formalizing individual propertarian rights over data as a personal asset. Dignitarian reforms respond to how excessive data extraction can erode individual autonomy by granting fundamental rights protections to data as an extension of personal selfhood. While propertarian and dignitarian proposals differ on the theories of injustice underlying datafication (and therefore provide different solutions), both resolve to individualist claims and remedies that do not represent, let alone address, the relational nature of data collection and use. Part Four proposes an alternative approach: data as a democratic medium (DDM). 
This alternative conceptual approach apprehends data’s capacity to cause social harm as a fundamentally relevant feature of datafication; from this follows a commitment to collective institutional forms of governing data. Conceiving of data as a collective or public resource subject to democratic ordering accounts for the importance of population-based relationality in the digital economy. This recognizes a greater number of relevant interests in data production and recasts the subject of legal concern from interpersonal violation to the condition of population-level data relations under which data is produced and used. DDM therefore responds not only to salient forms of injustice identified by other data governance reforms, but also to significant forms of injustice missed by individualist accounts. In doing so, DDM also provides a theory of data governance from which to defend forms of socially beneficial data production that individualist accounts may foreclose. Part Four concludes by outlining some examples of what regimes that conceive of data as democratic could look like in practice.

Algorithms connect us to each other…

Our family and friends

People in our household

People in our neighbourhood

People who travel the same routes

People with the same friends or followers

People who share our interests and preferences

People our age, race, gender, or social class

What those other people choose to share affects what is known about us

Aleksandr Kogan, a data scientist at the University of Cambridge, was hired by Cambridge Analytica, an offshoot of SCL Group, to develop an app called “This Is Your Digital Life” (sometimes stylized as “thisisyourdigitallife”). Cambridge Analytica then arranged an informed consent process for research in which several hundred thousand Facebook users would agree to complete a paid survey whose results were to be used only for academic purposes. However, Facebook allowed this app to collect personal information not only from survey respondents but also from respondents’ Facebook friends. In this way, Cambridge Analytica acquired data from millions of Facebook users.

Data Relations

Recommendation Systems: Applications and Examples in 2022

A Relational Theory of Data Governance

Facebook–Cambridge Analytica data scandal
