Four Factors Determine Survey Statistical Confidence
- Size of the population. The “population” is the group of interest for the survey. The bigger the population, the smaller the percentage of it that needs to respond, but the relationship is not linear.
- Segmentation analysis desired. Typically, we analyze the data along some demographic segmentations, for example, region, annual sales volume, or support representative. If the critical decisions will focus on the analysis of these segments, then the statistical accuracy needs to be calculated for these smaller segments and not just for the population.
- Degree of variance in responses. This factor is the hardest to understand. If the responses tend to be very similar, then we don’t need to sample as much to get the same accuracy as we would if the responses range widely. Imagine you polled your office colleagues, and the first five people gave the same answer. Would you continue polling? Probably not. What if you got five different responses? You’d probably keep polling. More variability requires larger samples. But until we do some surveying, we don’t know anything about the variance! So, initially, we employ conservative assumptions about the variance.
- Tolerance for sampling error. How accurate do you need the results to be? If you’re going to make multi-million dollar business decisions, then you need better accuracy.
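To see how these factors interact, here is a minimal Python sketch of one common textbook sample-size formula: the one for a proportion, with a finite population correction. It is an illustration under stated assumptions, not necessarily the calculation behind any chart in this article. The function name `required_responses` is hypothetical, and `p=0.5` stands in for the "conservative assumption" about variance, since it is the value that maximizes variance for a proportion.

```python
import math
from statistics import NormalDist


def required_responses(population, margin, confidence=0.95, p=0.5):
    """Responses needed to reach a given margin of error on a proportion.

    population: size of the group of interest
    margin:     tolerance for sampling error, e.g. 0.10 for +/-10%
    p:          assumed proportion; 0.5 is the most conservative choice
    """
    # z-score for the chosen confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size if the population were effectively infinite
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction: small populations need fewer responses
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# The share of the population that must respond falls as the population
# grows, but not in a straight line.
for size in (100, 500, 1000, 10000):
    needed = required_responses(size, margin=0.10)
    print(f"population {size:>6}: {needed} responses ({needed / size:.0%})")
```

If the key decisions hinge on segments, the same function would be applied to each segment's population rather than to the whole population, which is why segmentation drives the required number of responses up.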
But What Survey Response Rate Do I Need?
The statistical equations here are a bit daunting. (They can be found in most any statistics textbook under “sample size.”) Years ago when preparing a conference presentation, I created the nearby chart. While still complicated, it’s a lot more understandable than the equations!
The horizontal axis shows the population. The vertical axis shows the percentage of the population from whom we have a response. (This is not the response rate. The response rate is the percentage of those receiving an invitation who respond, which is a critical distinction.)
The chart shows seven lines that depict seven levels of survey accuracy. The first one is the horizontal line at the top. If we survey the population and everyone responds, we are 100% certain that we are 100% accurate. We have population parameters. Of course, that will likely never happen.
Let’s say our questions use an interval rating scale that ranges from 0 to 5 (so there are 5 equal intervals in the scale). If we get enough responses so we’re on the +/-10% curve, then we’re pretty certain the “real” answer (the population mean) lies no more than 10% of the scale from the mean score we got (the sample mean), which is half an interval (0.5) on either side of the mean.
More properly, if we conducted this survey 20 times, 19 out of the 20 times (95%), we would expect the mean score to lie within +/-10% of the mean score found when we conducted the survey.
Take a deep breath and re-read the above…
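If the "19 out of 20" idea is still fuzzy, a small simulation can make it concrete. The Python sketch below invents a population of 1,000 ratings on a 0-to-5 scale, runs the same survey many times, and counts how often the sample mean lands within half an interval (0.5) of the true population mean. Everything in it is an assumption for illustration: the ratings are random, and the 45 responses per survey were picked so that this made-up population sits near a +/-10% level of accuracy.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 people, each with a rating from 0 to 5
population = [random.randint(0, 5) for _ in range(1000)]
true_mean = mean(population)

band = 0.5        # +/-10% of a 5-interval scale
responses = 45    # assumed number of responses per survey
surveys = 2000    # how many times we repeat the survey

within_band = 0
for _ in range(surveys):
    # Each survey draws respondents without replacement
    sample = random.sample(population, responses)
    if abs(mean(sample) - true_mean) <= band:
        within_band += 1

share_within_band = within_band / surveys
print(f"{share_within_band:.1%} of surveys landed within +/-{band} of the true mean")
```

With these assumptions the share should come out somewhere near 95%. Ratings that vary more or less than these would shift the number, which is exactly the variance effect described earlier.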
Say you have a population of 1000, and you sent a survey invitation to 500 people. Half of those responded. So, 25% of the population responded. Find the intersection of 1000 on the horizontal axis and 25% on the vertical axis. You would be approximately 95% certain of +/-5% accuracy in your survey results.
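For readers who would rather compute than read a chart, here is a rough Python check of that example. It uses the standard margin-of-error formula for a proportion with a finite population correction and the conservative `p=0.5` assumption. That is my assumption about a reasonable formula, not a statement of how the chart was built, and the function name `margin_of_error` is hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(population, responses, confidence=0.95, p=0.5):
    """Approximate +/- accuracy for a given number of responses."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of a proportion, ignoring population size
    standard_error = math.sqrt(p * (1 - p) / responses)
    # Finite population correction: shrinks the error as the share of
    # the population that responded grows
    correction = math.sqrt((population - responses) / (population - 1))
    return z * standard_error * correction


invited = 500
responded = invited // 2   # half of those invited responded
accuracy = margin_of_error(population=1000, responses=responded)
print(f"{responded} responses from 1000: about +/-{accuracy:.1%}")
```

Under these assumptions the result is a little over 5%, in line with reading roughly +/-5% off the chart.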
Conversely, if we have an accuracy goal for the survey project, we can use this chart to determine the number of responses needed. Say we have a population of 500 and want an accuracy of +/-10%. Then we would need about 18% of the population to respond, or 90 people. (Find those coordinates on the chart.)
By applying an estimate of our response rate, we can then determine the number of survey invitations we must send out, which is our sample size. If we estimated a 25% response rate, then we would need a sample size of 360. (360 x .25 = 90)
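The arithmetic in that last step is simple enough to sketch in a few lines of Python. The function name `invitations_needed` is hypothetical; the numbers are the ones from the example above.

```python
import math


def invitations_needed(responses_needed, expected_response_rate):
    """Sample size (invitations to send) for a target number of responses."""
    # Round up: we cannot send a fraction of an invitation
    return math.ceil(responses_needed / expected_response_rate)


population = 500
responses_needed = 90     # about 18% of the population, read from the chart
response_rate = 0.25      # our estimate, not a measured figure

sample_size = invitations_needed(responses_needed, response_rate)
print(f"Send {sample_size} invitations to a population of {population}")

# If the required sample size exceeds the population, the accuracy goal
# is out of reach at that response rate.
reachable = sample_size <= population
```

The closing check matters in practice: with a 15% response rate, the same goal would call for 600 invitations, more than the 500 people in the population.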
Take another deep breath and re-read the above…
When we actually conduct our survey and analyze the results, we will then know something about the variance in the responses. The confidence statistic incorporates the variance found in each survey question and can be calculated for each survey question.
The confidence statistic tells us the size of the band or interval in which the population mean most likely lies – with a 95% certainty. (Technically, the interval tells us where the mean of repeated survey samples would fall. With a 95% certainty, 19 of 20 survey samples drawn from the population of interest would lie within the confidence interval.)
Look at the above chart. It shows the calculation of the confidence statistic using Excel. Alpha is the likelihood of being wrong that we’re willing to accept. (.05 or 5% being wrong is the same as 95% certainty we’re correct.) The standard deviation is the square root of the variance, and can be calculated using Excel. Size is the number of responses.
In this example, the mean for the survey question was 3.5 on a 1 to 5 scale and the confidence statistic was 0.15. So, we’re 95% certain the true mean lies in a band defined by 3.5 +/-0.15. Our accuracy is 0.15 as a percentage of the size of the scale, which is 5-1=4. Thus, our accuracy is +/-3.75%.
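The same calculation can be sketched outside Excel. The three inputs described above (alpha, standard deviation, size) match the arguments of Excel's CONFIDENCE function, which is based on the normal distribution, so the Python below assumes that is the function in use. The standard deviation of 1.0 and the 171 responses are hypothetical values, chosen only because they produce a confidence statistic close to the 0.15 in the example. The article does not state the actual figures.

```python
import math
from statistics import NormalDist


def confidence_statistic(alpha, standard_dev, size):
    """Half-width of the confidence interval around a sample mean."""
    # z-score leaving alpha/2 in each tail (about 1.96 when alpha = 0.05)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)


sample_mean = 3.5          # from the example
scale_range = 5 - 1        # a 1 to 5 scale spans 4 points

# Hypothetical inputs, not taken from the article
half_width = confidence_statistic(alpha=0.05, standard_dev=1.0, size=171)

low, high = sample_mean - half_width, sample_mean + half_width
accuracy = half_width / scale_range

print(f"Confidence statistic: {half_width:.2f}")
print(f"95% band: {low:.2f} to {high:.2f}")
print(f"Accuracy: +/-{accuracy:.2%} of the scale")
```

Because the sample size sits under a square root, halving the width of the band takes roughly four times as many responses.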
Let’s close by dispelling the myths in the opening three quotes.
- The response to a direct mail campaign is completely and utterly irrelevant to the statistical accuracy of a survey. In other words, beware of whose advice you take! (People who know marketing don’t always know about surveying or statistics.)
- There is no national average response rate. A whole host of factors drive response rates.
- The relationship between response rates and statistical accuracy is not linear.
- Finally, 30 responses would provide acceptable accuracy only if a) you have a very small population, b) you have very little variance in the responses, or c) you are willing to accept very low accuracy. As a very rough rule of thumb, 200 responses will provide fairly good survey accuracy under most assumptions and parameters of a survey project. 100 responses are probably needed even for marginally acceptable accuracy.
Everyone conducting a survey is concerned about response rates and the accuracy of their survey results. This is one part of a survey project that does require some fundamental understanding of statistics. In our Survey Design Workshop, we spend considerable time on this topic.
You don’t need to be a statistician to do a survey, but you do need some understanding of statistics to be sure you’ve got data that sets a firm basis for decision making.