What can other (non-Barometer) secondary datasets tell us about data values?
This is the second post in a series produced as part of the analysis for the Measuring Data Values Around the World project.
We previously scoped out how existing primary data collected for the Global Data Barometer (GDB) might map to the Data Values framework. As a multi-dimensional composite index, the GDB draws on both primary and secondary data sources.
In this post, we consider whether there are elements of data values measurement that could be addressed by drawing on existing secondary indicators or by incorporating additional secondary data sources. These could feed into future iterations of the GDB, or be used in Data Values measurement products, tools or analysis based partially on the GDB.
For inclusion in the GDB, secondary data had to pass a number of tests:
- Broad global coverage. Where secondary indicators do not cover every country covered by the Barometer, a reasonable method for estimating or imputing missing data should be available.
- Robust data collection method. There should be a clearly documented methodology for the indicator(s) and evidence to show how data quality is assured.
- Sensitivity to differences between countries. All indicators are re-scored on a 1 to 100 scale for inclusion in the GDB. There should be a clear approach to doing this, one that ensures real differences between countries are reflected by reasonable changes in the indicator score.
- Stability. We should be reasonably confident that the indicator will continue to be produced and updated in future years without disruptive methodological changes, so that it can support time-series analysis of the GDB.
- Available and openly licensed data. The indicator data should be easy to access, and should be provided under terms that permit its inclusion, and ideally redistribution, as part of the GDB.
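To make the coverage and sensitivity tests concrete, here is a minimal sketch of how a secondary indicator might be re-scored onto a 1 to 100 scale, with a simple fallback for countries the source does not cover. It is not the GDB's actual method: the function name `rescale_1_to_100`, the min-max approach, the median imputation and the placeholder country codes are all assumptions made for illustration.

```python
from statistics import median


def rescale_1_to_100(raw_scores):
    """Min-max rescale raw indicator values onto a 1-100 scale.

    raw_scores maps a country code to a raw value, or to None where the
    secondary source has no data for that country.
    """
    observed = [v for v in raw_scores.values() if v is not None]
    if not observed:
        raise ValueError("no observed values to rescale")

    lowest, highest = min(observed), max(observed)
    # Assumption: missing countries are imputed with the median of the
    # observed values. A real pipeline might use regional averages instead.
    fallback = median(observed)

    rescaled = {}
    for country, value in raw_scores.items():
        if value is None:
            value = fallback
        if highest == lowest:
            # Every country has the same raw value, so place all at the midpoint.
            rescaled[country] = 50.5
        else:
            rescaled[country] = 1 + 99 * (value - lowest) / (highest - lowest)
    return rescaled


# Hypothetical raw values for four placeholder countries.
scores = rescale_1_to_100({"AAA": 2, "BBB": 8, "CCC": None, "DDD": 5})
```

With plain min-max scaling, a single outlier country can compress everyone else into a narrow band of scores. That is one reason the sensitivity test asks for a clearly reasoned re-scoring approach rather than a default one.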
In the table below we’ve identified a number of potential secondary data sources and indicators that could either feed into specific Data Values country insight tools (DVT) that draw on the GDB, or be used to expand how far future editions (2ndEd) of the GDB address data values.
In this notebook we’ve tested our ability to gather and draw on this data.
Following assessments, we have made use of the following datasets in the working prototype of a data values country insight tool:
| Source | Description | Potential use |
| --- | --- | --- |
| SDG Indicator Database (Notebook) | Contains SDG data reported by countries, and can be used to identify the degree of disaggregation reported on key indicators | DVT & 2ndEd: Proxy to identify cases where countries may be collecting disaggregated, and intersectional, data (Manifesto 1 > Outcome metric) |
| Statistical Performance Indicators (Notebook) | Built from secondary indicators to address data infrastructure, sources, products, services and use. Certain valuable fields (e.g. funding and investment in data) are defined, but not populated due to current data gaps. Future editions of SPI may include this data. | DVT: Provides evidence on quality of statistical infrastructure (Manifesto 1 > Foundations) and availability of data services (Manifesto 4 > Outcomes) |
| Global State of Democracy Indices (Notebook) | Contains constructs based on the Varieties of Democracy survey that address consultation practice, and openness of government to consultation inputs. | DVT: Evidence of general (non data-specific) consultation frameworks. (Manifesto 1 > Foundations); 2ndEd: Governance or Capability pillar (considering ‘public participation around data’ as a capability to be measured) |
| Open Data Inventory (Notebook) | Based on assessment of availability of key datasets from national statistical offices, including measures of coverage and disaggregation (customised to relevant disaggregations or coverage criteria for each indicator) | DVT & 2ndEd: Proxy to identify cases where countries may be collecting disaggregated, and intersectional, data (Manifesto 1 > Outcome metric) |
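
As an illustration of the disaggregation proxy suggested for the SDG Indicator Database and the Open Data Inventory, the sketch below counts how many distinct disaggregation dimensions each country reports for each indicator. The `(country, indicator, dimension)` record layout, the function name `disaggregation_breadth` and the sample values are hypothetical. They do not reflect the real schema of either source, which would need to be mapped into this shape first.

```python
from collections import defaultdict


def disaggregation_breadth(records):
    """Count distinct disaggregation dimensions per country and indicator.

    records is an iterable of (country, indicator, dimension) tuples, where
    dimension is a label such as "sex" or "age", or None when the country
    reported only an aggregate figure for that indicator.
    """
    seen = defaultdict(lambda: defaultdict(set))
    for country, indicator, dimension in records:
        # Touching the entry registers the indicator even if only totals exist.
        dimensions = seen[country][indicator]
        if dimension is not None:
            dimensions.add(dimension)
    return {
        country: {ind: len(dims) for ind, dims in indicators.items()}
        for country, indicators in seen.items()
    }


# Hypothetical records: country AAA reports by sex and age, BBB only totals.
sample = [
    ("AAA", "1.1.1", "sex"),
    ("AAA", "1.1.1", "age"),
    ("AAA", "1.1.1", "sex"),
    ("BBB", "1.1.1", None),
]
breadth = disaggregation_breadth(sample)
```

A count like this only shows that disaggregated values were reported, not that the underlying data is intersectional or of good quality, which is why the table describes it as a proxy.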
We also explored potential further sources, as documented in the notebook, but these do not feature in the prototype or suggestions for future GDB secondary data.
The next post in this series will document the prototype tool we’ve created to explore use of this secondary data alongside data collected by the Barometer.