Non-Sampling Error - Interviewing (Int) and Respondent Error (RE)


Interviewing error

Interviewers often have targets to meet to encourage responses, but there is a risk that an interviewer may paraphrase questions or rush the respondent in order to interview more people in a given time frame. This can lead to poor data quality if the respondent is rushed or the interviewer makes an error in their recording. If the interviewer has strong opinions and shares them with, or forces them on, the respondent, the respondent may change their answers to suit the interviewer. Similarly, if an interviewer asks a question in a slightly incorrect way, the respondent may answer differently than they would have on a paper form. The bias caused by this is most likely systematic, as all the respondents for a given interviewer would respond similarly. Ideally, we want the interviewer to avoid asking questions, whether intentionally or not, in a way that differs from the questionnaire. A respondent should answer the same way irrespective of which interviewer they get, and ultimately regardless of whether they filled in a paper form or were interviewed.

This source of error primarily occurs in face-to-face interviews, although in the current environment of economic surveys it may occur in telephone follow-ups or additional data collection over the phone. The interviewer could potentially direct respondents towards wrong information, or information that is not quite what is actually required. Currently this is addressed by having standardised procedures for dealing with respondents, and by providing interviewer training (both initial and ongoing) that ensures all staff meet an equally high standard. We will continue to be aware of this issue in the future, but in the current environment, in which our economic surveys are run primarily as postal surveys, this is not a major problem.


Respondent error

Information required not available to business

Tailoring the questions at the questionnaire design stage reduces the error created when a business does not have the desired information available, but where tailoring is not possible the problem may still occur. The more types of questionnaire there are for a given survey, the more complex and time consuming questionnaire management and processing become. Furthermore, introducing too much routing within a single questionnaire can add to its length, and even when questions do not need to be answered it can be distracting to have to read through them all. For these reasons we can end up asking questions that do not relate to all businesses.

If the information is not available to the business within its existing reporting systems, this can lead to item non-response, with the result that more imputation is required.

Overall response rates may also be lower if respondents are put off by questions they cannot answer, or by the added length of the questionnaire. Our current procedures focus on addressing the problem once it has occurred: answers are either imputed using similar survey data or supplied from another source, for example, goods and services tax (GST) data or pay as you earn (PAYE) data.
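The substitution step described above can be sketched as follows. This is a minimal illustration only: the records, field names, and the simple rule of copying the administrative value straight in are all hypothetical, not Statistics New Zealand's actual imputation method.

```python
# Illustrative sketch of filling item non-response from an administrative
# source (e.g. GST data) when a survey answer is missing. Data and the
# substitution rule are hypothetical.

# Each record: reported survey sales (None = item non-response) and GST sales.
records = [
    {"id": "B1", "survey_sales": 120_000, "gst_sales": 118_000},
    {"id": "B2", "survey_sales": None,    "gst_sales": 95_000},
    {"id": "B3", "survey_sales": 240_500, "gst_sales": 239_000},
]

def impute_from_admin(recs):
    """Substitute the administrative value where the survey item is missing."""
    imputed_ids = []
    for r in recs:
        if r["survey_sales"] is None and r["gst_sales"] is not None:
            r["survey_sales"] = r["gst_sales"]
            imputed_ids.append(r["id"])
    return imputed_ids

print(impute_from_admin(records))  # -> ['B2']
```

In practice the administrative value would usually be adjusted for known conceptual differences between the tax and survey definitions before being accepted.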

To address this problem more directly in the future, we are always looking for further sources of information and improvements in their quality. Although we may currently be able to use another source, in preference to imputation, to find the answer for a specific business, the source must be of consistently high quality before we can rely on it totally in place of the survey question (for more information see Data collected using other sources). In addition we will continue, when appropriate and when there is significant benefit, to use different questionnaire types for businesses with different characteristics.


Respondent misunderstanding

Respondent misunderstanding could potentially be one of the largest sources of non-sampling error in our surveys; however, it is very difficult to quantify accurately. Misunderstandings can occur for a number of reasons, but the primary cause is that we have not made the collection process as easy and clear as it needs to be.

The impact is quite serious, as misunderstanding can lead to poor data quality when a response is provided, or to lower response rates when a respondent does not respond at all. This affects the level estimates for the current period and potentially also the movements from period to period.

Currently, after the survey has been sent out, our Respondent Liaison Division is available to assist respondents with the questionnaire. Once the questionnaires have been collected we have a number of edit checks in place to look for specific areas of misunderstanding. Normally, if there is a problem, a pattern will show up for a specific question: either many responses will be missing or they will not be what we require. We expect to identify such problems in pilot tests (see Questionnaire testing), but if they still occur, we will note them for future improvement and most likely follow up respondents to elicit the required response.
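The pattern-spotting step above can be illustrated with a simple per-question missing-rate check. The threshold, data, and function names are invented for illustration; real edit systems use far more nuanced rules.

```python
# Hypothetical edit check: flag questions whose item non-response rate is
# unusually high, the kind of pattern that suggests respondents are
# misunderstanding a question. Threshold and data are illustrative.

responses = [
    {"q1": 10, "q2": None, "q3": 5},
    {"q1": 12, "q2": None, "q3": 7},
    {"q1": 9,  "q2": 4,    "q3": None},
]

def missing_rates(forms):
    """Proportion of returned forms leaving each question blank."""
    questions = forms[0].keys()
    n = len(forms)
    return {q: sum(1 for f in forms if f[q] is None) / n for q in questions}

def flag_suspect_questions(forms, threshold=0.5):
    """Return questions missing on at least `threshold` of returned forms."""
    return sorted(q for q, rate in missing_rates(forms).items()
                  if rate >= threshold)

print(flag_suspect_questions(responses))  # -> ['q2']
```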


Respondent misreporting

Misreporting occurs primarily when the respondent feels they are protecting their own interests or privacy, or when the questions are too sensitive or confidential. It may result from preconceptions or a lack of knowledge about Statistics New Zealand, its procedures, and its obligation to protect the privacy of all individuals and businesses. Misreporting may occur unintentionally if the respondent frequently answers the questionnaire and develops their own interpretations or shortcuts, which may not match the information needs. It may also occur unintentionally if the respondent is new, or not familiar with all sections of the business. Just as with respondent misunderstanding, the impact of misreporting is quite serious, as an inaccurate response can lead to poor data quality, and ultimately to lower response rates when respondents do not respond at all. This affects the level estimates for the current period and potentially also the movements from period to period.

Currently we rely on edit checks at the processing stage to pick up misreporting, for example, answers that are very unusual or do not match up with other answers on the form (for example, high income but very low expenditure). In the short term we can introduce input and output edit checks to determine whether answers are consistent, and re-contact respondents to confirm values that appear inaccurate. However, this is a limited solution and will not address all of the misreporting that occurs. We are always available to assist respondents with the questionnaire, but again, our goal is to increase the use of the help desks and our website.
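A consistency edit of the kind mentioned above (high income alongside very low expenditure) can be sketched like this. The ratio rule and cut-offs are purely illustrative assumptions, not an actual Statistics New Zealand edit.

```python
# Hypothetical consistency edit: flag forms whose answers do not match up,
# e.g. high income reported alongside very low expenditure. The ratio rule
# and cut-offs are invented for illustration.

def inconsistent_income_expenditure(form, min_income=1_000_000, max_ratio=0.05):
    """Flag a form reporting high income but expenditure below 5% of it."""
    income, expenditure = form["income"], form["expenditure"]
    return income >= min_income and expenditure < income * max_ratio

forms = [
    {"id": "B1", "income": 2_000_000, "expenditure": 40_000},     # suspicious
    {"id": "B2", "income": 2_000_000, "expenditure": 1_500_000},  # plausible
]

flagged = [f["id"] for f in forms if inconsistent_income_expenditure(f)]
print(flagged)  # -> ['B1']
```

Flagged forms would then be queued for re-contact rather than corrected automatically, since a genuine business may legitimately fail such a rule.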

Two major actions will improve control of this source of error. The first is to make more time available at the questionnaire testing stage to identify sensitive or confidential questions, making amendments as appropriate. In addition, unique reporting may be addressed by looking for differences in the data between similar establishments and by routine cognitive testing of existing respondents. The second is to find ways, possibly as part of other wider-scoped projects, to increase respondents' education and awareness of Statistics New Zealand, so they can learn about the critical importance we place on the security and confidentiality of all individuals' and businesses' information.


Respondent load

If a respondent is in too many surveys, or has been in a survey for a long time, they can feel overloaded and are less likely to respond. Some long-term respondents may still fill in the questionnaire but do so by picking the easiest route, for example, saying 'no change' in employment level, which in turn allows them to skip other questions. There is a trade-off between the length of the survey and the quality of the data received: if the survey is too long, the respondent's load may lead them to provide poor data.

The direct impact is that overloaded respondents are less likely to respond to parts of the survey, or will provide poor quality responses. They may not return the questionnaire at all, leading to lower response rates. This also has a cumulative impact across multiple surveys if the respondent is selected in more than one, or is selected into another at a later date.

Currently, our use of the random number line for businesses on our Business Frame means the overlap between surveys is minimised: we know what range of the line is selected for each survey and therefore try to keep these ranges mutually exclusive. However, large key businesses will still end up in multiple surveys due to their importance, and there will be other exceptions too. Using other sources of data, for example, using tax data for some businesses to update our Business Frame, means we do not have to send out an update questionnaire, clearly reducing respondent load. Tax data is also starting to be used for small businesses in some of our sub-annual surveys, such as the Quarterly Manufacturing Survey.
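The random number line idea can be sketched as follows: each business holds a fixed random number in [0, 1), and each survey samples a fixed slice of the line, so disjoint slices guarantee disjoint samples. The businesses and selection ranges below are invented for illustration and do not reflect actual survey parameters.

```python
# Sketch of a random number line keeping survey samples mutually exclusive.
# Each business keeps a permanent random number in [0, 1); each survey
# selects a fixed, non-overlapping slice of the line. All values invented.

import random

random.seed(42)  # fixed seed so the illustration is repeatable
businesses = {f"B{i}": random.random() for i in range(10)}

# Non-overlapping selection ranges, one per survey.
survey_ranges = {
    "survey_A": (0.0, 0.3),
    "survey_B": (0.3, 0.5),
}

def sample(survey):
    """Businesses whose permanent random number falls in the survey's slice."""
    lo, hi = survey_ranges[survey]
    return {b for b, prn in businesses.items() if lo <= prn < hi}

a, b = sample("survey_A"), sample("survey_B")
assert a.isdisjoint(b)  # disjoint slices => no business in both surveys
print(sorted(a), sorted(b))
```

A side benefit of permanent random numbers is that redesigns can shift a survey's slice along the line, rotating long-serving businesses out of the sample, which connects directly to the redesign point below.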

To reduce respondent load, regular redesigns of surveys are carried out that allow new ranges of the random number line to be selected, resulting in some businesses no longer being surveyed. A business may also be able to directly provide us with accounts/spreadsheets containing the required information.

In the future we intend to use the average time to complete the questionnaire as a formal indicator of respondent load, and to develop other indicators for predicting and anticipating the overall load on businesses. These could be based on, for example, how many surveys a business is in and the time taken to complete each questionnaire. We will also target questionnaires with high load (including raising their priority for a complete redesign), make greater use of other data sources, and research the value of implementing rotation and re-selection strategies in our surveys.
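One way such an indicator could combine the two signals mentioned (number of surveys and completion time) is a simple weighted score. The formula, weights, and figures below are entirely hypothetical, included only to make the idea concrete.

```python
# Hypothetical respondent-load indicator combining survey count and average
# completion time. Weights and scale are invented purely for illustration.

def load_score(n_surveys, avg_minutes_per_form, minutes_weight=0.1):
    """Higher score = heavier load; units are arbitrary."""
    return n_surveys * (1 + minutes_weight * avg_minutes_per_form)

businesses = {
    "small_retailer": load_score(n_surveys=1, avg_minutes_per_form=20),
    "large_key_firm": load_score(n_surveys=4, avg_minutes_per_form=45),
}

# Rank businesses by estimated load, heaviest first.
for name, score in sorted(businesses.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 1))
```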


Inconsistent answers given over time

Inconsistent answers given on questionnaires are inherently linked to respondent misunderstanding and misreporting, but there are also more specific causes. For example, if the staff member assigned to filling out the questionnaire changes, or the responses for any one period are given by a number of people, the responses can be quite inconsistent. This is most likely to occur when the answers provided are estimates rather than exact figures. The impact is that the values individual businesses report can show level shifts that do not reflect the actual situation. These shifts are likely to be random, as some will be increases and others decreases, but they could potentially result in, for example, an industry trend appearing less significant than it really is.

Our current procedures involve edit checks and manual checking, where processing staff may notice sizeable changes that have not been explained. Other initiatives, such as always having the same person process a particular industry (or sub-industry), mean that inconsistent answers are more easily picked up. Once identified, the unexplained responses are confirmed with the respondent. While we cannot control who completes the questionnaire, we can encourage more contact with our help desks, which will in turn encourage more interest in providing correct values. As noted above, incorrect values occur more frequently when answers are estimated.
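A basic movement edit of the kind processing staff apply can be sketched as a period-to-period change check. The 50 percent tolerance and the quarterly series are illustrative assumptions only.

```python
# Hypothetical movement edit: flag a business whose period-to-period change
# exceeds a tolerance, the kind of unexplained level shift processing staff
# look for. Tolerance and series are invented for illustration.

def flag_level_shifts(series, tolerance=0.5):
    """Return indices where the value moved more than `tolerance`
    (e.g. 0.5 = 50%) relative to the previous period."""
    flags = []
    for i in range(1, len(series)):
        prev, curr = series[i - 1], series[i]
        if prev != 0 and abs(curr - prev) / abs(prev) > tolerance:
            flags.append(i)
    return flags

# Quarterly sales: the jump at index 3 is the kind of unexplained level
# shift that would be confirmed with the respondent before acceptance.
quarterly_sales = [100, 104, 98, 210, 205]
print(flag_level_shifts(quarterly_sales))  # -> [3]
```

Note that such an edit flags genuine business changes as well as reporting errors, which is why flagged movements are confirmed with the respondent rather than adjusted automatically.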

Increased contact with businesses will help us to learn where the difficulties in reporting values are. We can then improve our questionnaires accordingly to make it easier to answer with the correct value (rather than an estimate). In these ways we should be able to further reduce the effects of inconsistent answers being given over time.


Respondent interruption

A respondent may try to fill in the questionnaire quickly, or fill it in over a period of time around other tasks and work requirements. This could mean they lose their place or have to re-calculate answers. The result may be mistakes, item non-response, or incorrect routing being followed, which increases imputation requirements and leads to poor data quality.

Our current procedures involve re-contacting respondents when incorrect routing has been followed or a mistake appears to have been made, but this will not identify all errors. Having shorter questionnaires will minimise the time required and reduce this source of error, but this is a difficult task that conflicts directly with many other goals, especially in terms of the data that we need to collect. However, in the future a respondent may be able to choose when and where they respond (see Incorporation of new technologies), and this should also contribute to a reduction in this source of error.


Mechanical errors made by respondents

Mechanical errors are often 'clerical' in the sense that they can simply be the result of someone making a mistake in a calculation. They can also occur from not including GST in a value, or from writing a value in exact dollars when the required value is in thousands of dollars. While this could be due to misinterpretation, it most probably occurs because the respondent has their value(s) and just writes them in as is. Although an error may be repeated over time for a given respondent, the effect across all the survey results should be random.

Currently we have procedures in place to re-contact a respondent if it appears that an error has occurred. We also have edits in our processing system to detect mechanical errors. For example, if all responses for a question were in NZ$ millions and one came through as NZ$ billions, the edit will pick this up and may even automatically fix it. Such an error might occur if the respondent gave the actual dollar value when the question asked for the value in thousands of dollars, resulting in three extra zeros being added to the value at initial processing.
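The thousand-fold unit edit described above can be sketched as follows. The typical-value comparison, the tolerance band, and the automatic correction are illustrative assumptions, not the actual processing rule.

```python
# Sketch of a magnitude edit: when a question asks for values in thousands
# of dollars, a respondent who writes exact dollars adds three extra zeros.
# Comparing against a typical value for the question can detect (and here,
# automatically correct) that thousand-fold error. Thresholds illustrative.

def fix_unit_error(value, typical, factor=1000, band=(0.2, 5.0)):
    """If `value` is roughly `factor` times a typical response, divide it
    back down; otherwise leave it as reported."""
    lo, hi = band
    if typical > 0 and lo <= value / (typical * factor) <= hi:
        return value / factor, True   # corrected
    return value, False               # left as reported

typical_sales = 250  # NZ$(000), i.e. a typical response of $250,000
print(fix_unit_error(240, typical_sales))      # -> (240, False)
print(fix_unit_error(240_000, typical_sales))  # -> (240.0, True)
```

An automatic fix like this is only safe where the error signature is unambiguous; borderline cases would still be referred back to the respondent.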
