Over the last three weeks of July 2022 I was an observer of the NHS AI Lab Public Dialogue on data stewardship: a process involving around 50 members of the public meeting for 12 hours (across four sessions) to share their ‘thoughts, aspirations, hopes and concerns’ about how access to healthcare data for AI purposes should be managed. A report of the dialogue was published by the organisers (Open Data Institute, Imperial College Health Partners and Ipsos), and the NHS AI Lab (who co-funded the dialogue along with Sciencewise) intend to use the findings to inform the Terms of Reference for a research competition titled ‘Participatory Fund for Patient-Driven AI Ethics Research’.
This write-up contains my notes as an independent observer of the dialogue, and as a member of the project’s Stakeholder group.
An observer in the corner
I attended three of the four dialogue sessions (Sessions 1, 3 and 4). These were all held via Zoom, with some time in plenary for expert presentations and a recap of insights from past sessions by the lead facilitator, and most of the time spent in small breakout groups of 5 to 6 participants. Each parallel group worked on the same set of tasks, though often in slightly different orders to address the potential impact of ordering on priming particular discussions. The make-up of the breakout groups changed between sessions. In each session, I observed the plenary, and then was part of a single breakout group in order to be able to follow full threads of discussion. Following the organisers’ guidance for observers, I did not have my camera on, and did not participate in the discussions in any way.
The first three-hour session introduced a number of key topics, and asked participants in breakout groups to introduce their initial perceptions and understandings of health data research, how health data might currently be used, what AI is, and who currently makes decisions about health data. An expert presentation provided input on current data governance for healthcare, which participants then explored further in breakout groups. Case studies of AI use in healthcare were then shared, and discussed in breakouts.
The second and third sessions introduced a number of models for data access, grouped under the headings of ‘Delegated’, ‘Collective’ and ‘Individual’. Using a set of detailed slides on variants of each model, the breakout sessions worked through participant views on each one.
The final session started with a presentation of points raised in prior sessions synthesised as a set of ‘principles’ for data access. The different models for data access were then reviewed in breakout groups against these principles, and after this, breakout groups revisited the wording of the principles to suggest any adaptations to refine them.
Group discussions: eight observations
I didn’t make exhaustive notes on all the points raised during discussions, but I did note down reflections on a number of issues that struck me as significant either for understanding public views on data governance, or for how we articulate different models of data governance in future, particularly collective models.
Many of these may seem ‘obvious’ (e.g. of course people value being engaged etc.) but (a) it may be notable that they were themes emerging from the discussions in this particular dialogue; and (b) not all are as obvious as they might seem (e.g. there might have been a strong view that having to discuss details of data governance was a burden that should not be placed on the public).
(1) People value being engaged in data governance discussions.
The participants in this dialogue were not recruited based on any particular interest in data, health or AI. But a number in the breakout groups I observed commented on both how interesting they had found it to learn more about how their data is being used, and how much they had valued the process of being engaged. When thinking about the potential for future collective models of data governance, participants reflected on how their own experience provided a good basis for thinking that other members of the public would and could engage productively in detailed discussions about data access and use.
(2) Learning about existing governance arrangements was reassuring.
But participants in the breakout group I observed were also generally sceptical about how far their data would be secure in practice, drawing on experiences of trying to manage their privacy or data in existing online environments. There was fairly strong evidence of ‘digital resignation’, evident in statements such as “the internet is not a safe place - but it’s a risk we have to take” or “your data is going to get breached at some point - you’ve just got to deal with it”.
(3) Trust is for institutions, not processes.
In an insight that echoes the finding Aidan Peppin has reported from past dialogues, it was clear that when thinking about how far to trust particular arrangements for data access, questions of which organisations were involved featured more strongly in people’s decision making than details of the particular governance process proposed. This raises some challenges, as the signals used to confer trust on institutions (e.g. which country they are from, which sector, whether the public have direct contact with them through trusted frontline workers like doctors and nurses) are not necessarily very reliable markers of how trustworthy an institution actually is. Overall, however, I was struck that none of the discussions I observed really questioned the motives _and incentives_ of different actors, or explored how, in practice, boundaries between public and private can, at times, be quite blurred.
(4) International context matters.
The wider world beyond the UK was an important factor in how participants were thinking about data governance, though not always in the same ways. For some, this surfaced in concerns about data flowing to companies located overseas, because of a sense they may not be operating with the same values, or under the same governance rules, as UK firms. For other participants, their experience linked to diaspora communities, or reflections on global responses to the COVID pandemic, led to points about how the benefits of research could, or should, be available worldwide. In yet another context, a participant wondered whether, with individual control, they would be able to take their data with them if moving to another country.
This theme wasn’t unpacked in any great depth by the dialogue, but given health research can be transnational, and in a context of growing emphasis in some camps on ‘data sovereignty’, it seems important to consider the kinds of normative, and geopolitical, global issues that a collective data governance process in the future might need to take into account.
(5) Collective fora need expertise, diversity and continual fresh perspectives.
As the breakout group I observed reflected on the detailed principles that could guide future experiments in data governance, they put significant focus on working out how to balance the need for groups of patients and the public to develop knowledge to be able to oversee data governance, with the need to avoid co-option and to keep a critical questioning approach. The discussion started to sketch out a quite sophisticated model of regular ongoing recruitment for patient/public panels, with a rolling refresh of membership, rather than replacing a panel en-masse after a set term.
(6) Whether panels are voluntary or paid might affect who participates.
In their detailed discussion of collective models, the group I observed also spent time exploring whether or not participants should be paid for their work: weighing the impacts of payment on inclusion, and on the likely effort that people may put into a process. After some deliberation, a fairly strong sense came across from the group that the role should be voluntary.
This was one point at which I found it particularly difficult to be just an observer, as I would have been keen to throw into the discussion a couple of models such as school governors (unpaid, voluntary), jury service (sometimes paid by employers, statutory amount that can be claimed), paying a flat rate, fully paying for lost earnings and so on, and to see how these landed with the group in terms of thinking about who they enable to participate.
(7) Individual data governance models raise particular issues of capacity.
There was strong enthusiasm for individual models of control over data when these were first introduced, linked both to the intuitive desirability of being in control, and a view that this might allow people to make decisions ‘at their convenience’ rather than on the timeline of institutions. Probing from the facilitator about potential problems of high levels of opt-out if individual data controls were implemented didn’t appear to resonate strongly with the groups I observed, but some issues were raised about how individual consent would work for children, which also pointed to the challenges of making consent work for others with limited capacity. It was also notable in discussions of individual control that even after talking positively about both delegated and collective models of data governance, participants still tended to express a fairly strong view on the primacy of consent: “It has to be shared only with consent.” This raises interesting questions about what it may take to develop meaningful public trust in systems that are based on opt-out, rather than opt-in, to research data sharing.
(8) People overall prefer a blended approach to data governance, incorporating collective elements.
When asked about their preference for Delegated, Collective or Individual models of data governance, the groups I observed tended towards asking for a blend of collective and individual, but with elements of expert input that also suggested some desire to delegate aspects of decision making. While the idea of individual control immediately resonated with everyone in the groups I observed, after discussing collective models, the groups were interested in these being part of how data was governed too.
Following the sessions I’ve reflected on whether ‘Delegated’ and ‘Collective’ are the best labels for these kinds of data governance, as in practice, the models were really talking about ‘Expert-led’ or ‘Group-based’ decision making or influencing processes. It may be that refining the terminology about what ‘collective’ governance models look like in practice can make them more immediately and intuitively understandable for the public, balancing out the intuitive appeal of individualism.
At the same time, I suspect there will always be a need to do more to communicate the idea of taking decisions together, rather than ‘taking individual control’, and to challenge the cognitive biases that mean we may think we want control of all the data decisions, even though in practice few people could meaningfully process the many data decisions that affect them every day.
A reader might feel that not all the ‘insights’ above are entirely consistent or compatible with one another. That’s quite possible. The dialogue process surfaced, but did not resolve, some of the tensions that any data governance system will need to be aware of. Depending on your perspective, this could be a strength of the method (revealing inherent tensions), or a weakness (missing the opportunity to see how public opinion might resolve those tensions).
In the section below I’ll briefly unpack a few more process observations and learning points - primarily intended for Connected by Data’s own thinking on future dialogue design, rather than intended in any way as a critique of this particular process.
Assorted observations and reflections on dialogue process
I have fairly strong views on participation design that I should put up front: generally the more concrete the issue or situation that discussion can focus on, and the more ‘moving parts’ of that issue/situation that can be made legible to participants, the more meaningful the discussion is likely to be. And the more that points made in a discussion can be grounded in relatable lived experiences, the more powerful the messages from a discussion are likely to be.
However, I have less experience with specifically public dialogue methods, which I understand come from a slightly different place: decision-shaping, rather than decision-making, public participation.
These biases stated, a number of things struck me as I was observing the NHS AI dialogue:
- Participants were introduced to imaging data, and a hypothetical database; however, data was mostly discussed in the abstract, whereas the upcoming research competition is focussed on imaging data. The authors of the report shed light on views towards imaging data when it was specifically referenced by participants (i.e., that it was assumed to be harder to manipulate and that it might be less identifiable compared to other data). Given different kinds of data have different implications, getting as specific as possible about the kinds of data being governed seems important. Indeed, the issues arising for the public about imaging data being shared might be quite different from those that relate to mental health records, for example.
- It seemed hard for the groups to keep in mind each of the nine different models of data governance presented. It struck me that these might have benefited from more sophisticated visual presentation (sketch of each, as opposed to simple ideograms) to help with ‘seeing’ the difference between them. I also wondered to what extent it would be useful to present more of the building blocks of different data governance approaches, rather than exemplar models, to allow groups to explore how different building blocks should fit together.
- Breaking into groups, and rotation between groups, was important to allow a diversity of voices to be heard, but I was left wondering if there needs to be a clearer theory on the role of groupwork in dialogue. Groups were often asked “What do ‘we’ think?” in ways that felt like they were setting up a presupposition of a group view, rather than allowing, eliciting or supporting productive disagreements about how data should be governed.
- It seemed that the sensitivity of personal health experiences meant that lived experience was treated very cautiously (no pressure to draw upon or share it), yet lived experience (and not just ‘patient’ lived experience) is a really important element of what public dialogue can bring in.
- Even by the end of the dialogue, participants appeared to have very limited understanding of what ‘AI’ was about (lots of personification of ‘The AI’ and confusion over where in the picture AI was being applied) and some commented on how ‘baffling’ AI was. I suspect there would have been value in having more of a worked example presentation from an AI practitioner about how they get, use and report on their use of data to build AI models.
- Equally, there was limited stimulus material on the concerns that academic work has raised around AI. Issues raised on presentation slides covered data breaches or rogue researchers, but didn’t talk about issues of problematic bias even in well-intentioned AI models, or the trade-off between investing in AI systems vs. another form of clinical support. My sense is that critical expert inputs around this might have led to a much richer set of discussions at points.
- The main output of the process appears likely to be a set of principles. I was left wondering how different the principles are from those that dialogue commissioners might have ‘guessed’ at in advance. This would provide insight into how far the dialogue has developed, or re-confirmed, approaches to data governance.
I should also note the really exemplary practice I observed of including participants, with the Ipsos facilitation team working incredibly hard to make sure everyone had materials, good Zoom access, and opportunities to speak up. A really high level of preparation and support was evident, and I certainly noted a lot of good practices to learn from.
This was, in a sense, a dialogue on dialogue: seeking public views on how public views should feature in future health data governance. As such, it’s been useful to see both public enthusiasm for collective data governance models, and some of the challenges that meaningful collective dialogue around detailed data topics needs to overcome.
As we continue to develop our case database of examples of collective data governance, we hope to be able to surface more on both the building blocks, and comprehensive models, that are being deployed in all sorts of contexts to give those affected and connected by data greater voice in how it is collected, shared and used.