
First Steps to Controlling Non-Sampling Error

Controlling non-sampling error

How do we control an error that cannot be measured directly?

Although we cannot directly calculate a numerical measure of its effect, we can take several steps to control non-sampling error. The first is to investigate the entire survey process and understand what the sources of error are and where in the survey process the errors occur. At Statistics New Zealand this is an ongoing process that continually builds on the understanding developed over time.

Some sources cause error directly (for example, data capture), while others cause error indirectly (for example, non-response). Although it is useful to know whether a problem causes error directly or indirectly, in these pages we will always refer to each simply as 'a source of error'. Once a source of error has been identified, the type of error it produces needs to be determined. This is useful because it guides the way the error is controlled and indicates the likely impact on data quality. Finally, it is useful to identify the potential impact of the source of error, and so determine how important it is, relative to the other known sources of error, to control it effectively.

We still need some way to tell whether an error persists or whether new procedures have reduced its effect. Although we cannot calculate a direct numerical measure of that effect, we can almost always obtain an indicator, such as the time taken to update a frame, the achieved response rate in a survey, or feedback from respondents. We can then compare the indicator over time to look for changes, for example a reduction in the time taken to update a frame or an increase in positive feedback from respondents. This process of comparison allows us to control the sources of non-sampling error effectively.
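The comparison of an indicator over time can be sketched in a few lines of Python. This is only an illustration: the function names and the figures are hypothetical, and the rule used here (every period must move in the desired direction) is one simple assumption about what counts as improvement.

```python
def changes_over_time(values):
    """Return the change between each consecutive pair of periods."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def is_improving(values, higher_is_better):
    """True if every period-to-period change moves in the desired direction."""
    changes = changes_over_time(values)
    if higher_is_better:
        return all(change > 0 for change in changes)
    return all(change < 0 for change in changes)


# Hypothetical indicator values for four consecutive survey periods.
response_rate_pct = [78, 80, 83, 85]   # higher is better
frame_update_days = [30, 26, 27, 21]   # lower is better

print(is_improving(response_rate_pct, higher_is_better=True))
print(is_improving(frame_update_days, higher_is_better=False))
```

In this made-up data the response rate rises in every period, while the frame update time rises once (from 26 to 27 days), so only the first indicator shows a steady improvement.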


What are the two major types of non-sampling error?

Non-sampling error can be grouped into two main types: systematic and random. Systematic error (or bias) makes survey results unrepresentative of the target population by distorting the survey estimates in one direction. Random error can distort the results in either direction but tends to balance out on average.
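The difference between the two types can be illustrated with a small simulation. The sketch below uses invented values: the 'true' values, the size of the noise and the size of the bias are all assumptions chosen only to make the contrast visible.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical true values for 10,000 units.
true_values = [random.uniform(50, 150) for _ in range(10_000)]

# Random error: zero-mean noise, so some values are too high and some too low.
with_random_error = [v + random.gauss(0, 10) for v in true_values]

# Systematic error: every value is pushed in the same direction.
with_systematic_error = [v + 5 for v in true_values]

random_shift = mean(with_random_error) - mean(true_values)
systematic_shift = mean(with_systematic_error) - mean(true_values)

print(f"shift in the mean from random error:     {random_shift:.3f}")
print(f"shift in the mean from systematic error: {systematic_shift:.3f}")
```

The random error leaves the mean close to its true value, while the systematic error moves the mean by the full amount of the bias.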

An example of systematic error

Non-response to a survey is a good example of something that can cause systematic error. Ideally, the businesses that did not respond would be a random subset of the sample. Suppose, however, that the businesses that did not respond were mostly smaller businesses in a particular manufacturing group. If we were estimating manufacturing sales, our estimate for this group may be biased toward a larger value than the true value.

The key point here is that there is a systematic non-sampling error that we cannot measure directly (although we may have an indirect indicator that points to the existence and likely impact of the error).

How do we control this systematic non-sampling error? We can try to increase the response rate. We still cannot measure the non-sampling error caused by those who do not respond but, because the non-responding group is now smaller, any bias that exists will have less effect. Note that in this case our indicator is the response rate: by increasing the percentage of businesses responding to the survey, we know that we are reducing the potential for this source of error to affect our results. We have therefore controlled for an error that we cannot measure directly.
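A small worked example shows why a higher response rate limits the bias. The population below is entirely hypothetical (60 small businesses with sales of 10 each and 40 large businesses with sales of 100 each), and the estimate is a simple unweighted mean of respondents with no non-response adjustment. It assumes that all large businesses respond and only some small ones do.

```python
def respondent_mean(small_responding, large_responding,
                    small_sales=10.0, large_sales=100.0):
    """Mean sales per business among those that responded."""
    total_sales = (small_responding * small_sales
                   + large_responding * large_sales)
    return total_sales / (small_responding + large_responding)


# Hypothetical population: 60 small and 40 large businesses.
true_mean = respondent_mean(60, 40)

# 60% response: only 20 of the 60 small businesses respond.
mean_low_response = respondent_mean(20, 40)

# 90% response: 50 of the 60 small businesses respond.
mean_high_response = respondent_mean(50, 40)

bias_low_response = mean_low_response - true_mean
bias_high_response = mean_high_response - true_mean

print(true_mean, mean_low_response, mean_high_response)
print(bias_low_response, bias_high_response)
```

In practice the true mean is unknown, so the bias itself cannot be computed. The response rate is the observable indicator, and in this sketch raising it from 60% to 90% shrinks the upward bias from 24 to 4.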


An example of random error

Errors in data capture are relatively uncommon, but they are a good example of something that can cause random error (a manual data capture error could occur systematically, but in most cases it will be random). We call it random error because it is unlikely to follow any pattern and would most likely average out to zero over many occurrences. In other words, the errors will most likely balance out, with values for some businesses recorded too high and others recorded too low. Again, we do not know the magnitude of the effect of this error. However, we can improve our processing system to reduce the likelihood of errors remaining undiscovered.

Typically Statistics New Zealand will have in place a data edit system. This will mean that as data is entered, the values are checked to see if they fall within expected ranges. These expectations might be set by analysing previous surveys or by common sense and subject matter knowledge. By double checking data that fails these edits we can reduce the number of random data capture errors that pass through to the final dataset.
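A range edit of this kind might look like the sketch below. It is not a description of Statistics New Zealand's actual edit system: the function name, field names, expected ranges and records are all hypothetical.

```python
def failed_edits(records, expected_ranges):
    """Return (record_id, field, value) for each value outside its expected range."""
    failures = []
    for record_id, values in records.items():
        for field, value in values.items():
            if field not in expected_ranges:
                continue  # no edit defined for this field
            low, high = expected_ranges[field]
            if not (low <= value <= high):
                failures.append((record_id, field, value))
    return failures


# Hypothetical expected ranges, e.g. set from previous surveys.
expected_ranges = {"sales": (0, 5_000_000), "employees": (0, 500)}

# Hypothetical captured records.
records = {
    "A01": {"sales": 1_200, "employees": 15},
    "A02": {"sales": 98_000_000, "employees": 12},  # possible keying error
    "A03": {"sales": 800, "employees": -3},         # impossible value
}

for failure in failed_edits(records, expected_ranges):
    print(failure)
```

Records that fail an edit are not necessarily wrong. They are flagged so that the values can be double checked before they reach the final dataset.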

We can use the number of failed edits as an indicator of the control we have over the data capture phase. By monitoring rates of edit failure we can target our efforts to improve quality in this area. Again, we have controlled for an error that we cannot measure directly.
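Monitoring the edit failure rate could be sketched as follows. The period labels, counts and the 5% review threshold are assumptions made for illustration only.

```python
def edit_failure_rate(n_failed, n_checked):
    """Proportion of checked values that failed an edit."""
    if n_checked <= 0:
        raise ValueError("n_checked must be positive")
    return n_failed / n_checked


# Hypothetical (failed, checked) counts per collection period.
counts_by_period = {
    "2023Q1": (42, 1200),
    "2023Q2": (90, 1250),
    "2023Q3": (30, 1180),
}

REVIEW_THRESHOLD = 0.05  # assumed cut-off for investigating a period

rates = {period: edit_failure_rate(failed, checked)
         for period, (failed, checked) in counts_by_period.items()}
flagged = [period for period, rate in rates.items() if rate > REVIEW_THRESHOLD]

for period, rate in rates.items():
    print(f"{period}: {rate:.1%}")
print("periods to review:", flagged)
```

A period that exceeds the threshold is a prompt to look at the data capture process for that period, not a measure of the error itself.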


Structure of the pages discussing the sources of non-sampling error

The sources of error discussed in the following detailed pages are grouped first by their place in the survey process and then by the area in which they occur. The survey process has been split into three major phases: the initial design phase, the measurement phase and the inference phase. Each phase has a brief introductory page, followed by a summary of the areas in that phase. The sources of error are then discussed under each main area. For example, questionnaire design is one area in the initial design phase, and a number of sources of error are presented and discussed on that page.
