
Introduction to assessing administrative data quality

Purpose

This guide presents a framework for understanding how well different datasets meet their intended purpose, including their strengths and limitations. It also explains how to determine what effects these strengths and limitations may have on the quality of a statistical output that uses administrative data, survey data, or a combination of the two.

Quality assessments carried out using this error framework should help answer the questions arising from Statistics New Zealand's push to be an 'administrative data first' organisation: how do we decide which administrative data should be used for which purposes, and how can we be sure that direct surveying is not necessary?

Measuring data quality

No statistical dataset measures exactly what we want it to. At present we cannot provide a single generic measure to summarise data quality, but this guide's error framework can produce a comprehensive list of the strong and weak points of datasets and outputs. Instead of judging a dataset as 'good' or 'bad', the framework identifies the strengths and weaknesses of a dataset in an objective way, with reference to its original purpose. Such analysis can guide design decisions and ensure we collect the right amount of data to produce fit-for-purpose outputs.

The framework facilitates reusing both existing data and previous quality assessments.

Structure of this guide

The first part of the error framework focuses on how well a dataset meets its original, intended purpose – useful information when investigating whether the data can meet other needs. We hope the framework provides a common language for talking about data quality issues, and is a valuable decision-making resource for the organisation.

The second part addresses problems that can arise when combining datasets from different sources (eg transforming raw variables to match statistical needs and identifying and creating statistical units from integrated datasets). The outcome of such an assessment is useful to test different design options, or to identify quality risks that need to be mitigated or checked over time to ensure the consistency of the resulting statistics.
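To make this concrete, the sketch below illustrates one simple check of the kind described above: linking records from two hypothetical sources on a shared identifier and measuring the linkage rate. The record layout, field names, and identifiers are invented for illustration only; they are not part of the framework itself.

```python
# Hypothetical illustration: linking records from an administrative source
# and a survey on a shared identifier, then measuring the linkage rate.
# All field names and values here are invented for this sketch.

admin_records = [
    {"id": "A1", "income": 52000},
    {"id": "A2", "income": 61000},
    {"id": "A3", "income": 47000},
]
survey_records = [
    {"id": "A1", "hours_worked": 40},
    {"id": "A3", "hours_worked": 32},
    {"id": "A4", "hours_worked": 20},  # no matching admin record
]

# Index the administrative records by identifier for fast lookup.
admin_by_id = {rec["id"]: rec for rec in admin_records}

linked, unlinked = [], []
for rec in survey_records:
    match = admin_by_id.get(rec["id"])
    if match:
        linked.append({**match, **rec})  # combine fields from both sources
    else:
        unlinked.append(rec)

linkage_rate = len(linked) / len(survey_records)
print(f"Linked {len(linked)} of {len(survey_records)} survey records "
      f"({linkage_rate:.0%}); {len(unlinked)} unlinked")
```

A low linkage rate like the one flagged here is exactly the kind of quality risk the second part of the framework asks assessors to identify and monitor over time.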

This guide also supports measures and indicators to quantify key aspects and concerns of data quality in a detailed way. While these measures do not cover all situations, they give ideas for more detailed or technically complex measures that could be developed for a specific output.
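As a minimal sketch of what such indicators can look like, the example below computes two common, simple measures for a single variable: the missing-value rate and the invalid-code rate. The variable name and code list are assumptions made for this illustration, not indicators taken from the quality indicators files.

```python
# Hypothetical illustration of two simple data quality indicators for one
# variable: the missing-value rate and the invalid-code rate.
# The variable name ('region') and its valid codes are invented here.

VALID_REGION_CODES = {"01", "02", "03"}

records = [
    {"region": "01"},
    {"region": "02"},
    {"region": None},    # missing value
    {"region": "99"},    # code outside the valid set
    {"region": "03"},
]

n = len(records)
missing = sum(1 for r in records if r["region"] is None)
invalid = sum(1 for r in records
              if r["region"] is not None
              and r["region"] not in VALID_REGION_CODES)

missing_rate = missing / n
invalid_rate = invalid / n
print(f"Missing-value rate: {missing_rate:.0%}")
print(f"Invalid-code rate:  {invalid_rate:.0%}")
```

Indicators like these are deliberately coarse; the point is that they can be computed consistently across datasets and compared against the needs of a specific output.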

See Quality indicators for phase 1 [and 2] errors in the ‘Available files’ on this webpage.

The framework in this document cannot solve quality problems on its own, but it will highlight aspects of the datasets most in need of further work – so investigations can focus on the most crucial quality issues. 

Contents of this guide

The main parts of this guide are:

  • An explanation of the error framework. This details the framework and describes how to apply it to different datasets and outputs.
  • A practical example (see section 5) to help explain how to use the framework. The tables in this example provide useful templates for other assessments.
  • Our current plans for implementing the error framework, and future work.
  • The metadata information template. Use this Excel spreadsheet (see ‘Available files’) to capture the key information about datasets being assessed.
  • Detailed lists of quality measures. The two quality indicators files (see ‘Available files’) list indicators and measures categorised by error type. Select and use the ones most relevant or useful to assess a specific dataset.

A glossary at the end of this guide defines the terms we use. Use it alongside the other documents, and as a guide to which terms to use when writing up a quality assessment or reporting on work done using the error framework.
