The Importance of Good Survey Question Wording: Even Pros Make Mistakes

Proper survey question wording is essential to generating valid, meaningful data for organizational decisions. Biased question wording will lead to misleading interpretations and bad decisions.

Good question phrasing is an art form, and even the pros can make mistakes. Here we’ll show an example of question wording bias from a survey done by the Pew Research Center, where ambiguity in the question wording likely led to incorrect data and conclusions. It provides some useful lessons for all survey designers.

Impact of Mobile Surveys — Tips for Best Practice

Summary: Mobile survey taking has shot up recently. When a survey invitation is sent via email with a link to a webform, up to a third of respondents take the survey on a smartphone or equivalent.  Survey designers must consider this when designing their surveys since some questions will not display properly on a mobile device.  This article presents the issue and some tips for good practice.

~ ~ ~

If you do any kind of design work that involves information displays on webpages, you know the challenges of designing when you don’t know the screen size that will be used. It makes you long for the days of green screens and VT 100s when the mouse was something that your cat killed and a trackpad was somewhere you ran.

Screen sizes and resolution always seem to be in flux as we’ve moved from desktops to laptops, netbooks, smartphones, tablets, and phablets.

As part of rewriting my Survey Guidebook, I have been talking with survey software vendors, and the dramatic rise of survey taking via smartphones is a big issue. Roughly one quarter to one third of survey submissions are coming via smartphone devices. Ouch!

The issue is: how does your survey render on a smartphone? Website designers have tackled this with responsive website design, and the same issues apply for surveyors. But the issue is perhaps more critical for surveyors.  While the webform might be responsive to the size of the screen on which the survey is displayed, the fact is some survey questions simply will not display well on a small screen.
For example, I love — or loved — to set up my questions in what’s called a “grid” format (sometimes called table or matrix format). It’s very space efficient and reduces respondent burden.  However, that question layout does not work well, if at all, on a phone screen, even a phablet.

Look at the nearby screenshot from a survey I took.  Notice how the anchors — the written descriptions — for the four response options crowd together. Essentially, any question where the response options are presented horizontally may not display well.

The webform rendering may have implications for the validity of the data collected.  If one person takes a survey with the questionnaire rendered for a 15-inch laptop screen while another person takes the “same” survey rendered for a 5-inch smartphone screen, are they taking the same survey?

We know that survey administration mode affects responses. Telephone surveys tend to have more scores towards the extremes of response scales. Will surveys taken by smartphones have some kind of response effects?  I am not aware of any research analyzing this, but I will be on the lookout for it or conduct that research myself.

So what are the implications for questionnaire design from smartphone survey taking?

First, we have to rethink the question design that we use. We may have to present questions that display the response options vertically as opposed to horizontally.  This is a major impact.  If you are going to use a horizontal display for an interval rating question, then you should use endpoint anchoring as opposed to having verbal descriptors for each scale point.  Endpoint anchoring may allow display of the response scale without cramped words. But make sure you have constant spacing or you’re compromising the scale’s interval properties!

Second, we have to simplify our questionnaires. We need to make them shorter. A survey that may be perfectly reasonable to take from a time perspective on a laptop with a table display will almost certainly feel longer to complete on a phone because of the amount of scrolling required.  While smartphone users are used to scrolling, there must be a limit to people’s tolerance.  A 30-question survey on a laptop might take 3 to 6 screens but take 30 screens’ worth of scrolling on a smartphone.  You might be tempted to put one question per screen to limit the scrolling.  However, the time to load each screen, even on a 4G network, may tax the respondent’s patience.

Third, beyond question design we should reconsider the question types we use, such as forced ranking and fixed sum.  Those are especially useful for trade-off analysis to judge what’s important to a respondent.  However, they would be very challenging to conduct on a small screen.  So, how do we conduct trade-off analysis on a smartphone?  I’m not sure.  Probably using a multiple choice question asking the respondent to choose the top two or three items.

Fourth, extraneous verbiage now becomes even more extraneous.  In my survey workshops I stress removing words that don’t add value. With smartphone rendering, it becomes absolutely essential. Introductions or instructions that would cover an entire screen on a laptop will simply drive away a smartphone respondent. Even the questions themselves, as well as the response options, should be as brief as possible.  The downside is that we may be so brief as to not be clear, introducing ambiguity.

Fifth, the data should be analyzed first by the device on which the survey was taken. Are the scores the same? (There are statistical procedures for answering that question.) If not, the difference could be due to response effects caused by the device or a difference in the type of people who use each device.  Young people who have grown up with personal electronic devices are more likely to take surveys on a mobile device.  So are differences in scores between devices a function of respondents’ age or a function of the device and how it displays the survey (response effects)?  Without some truly scientific research, we won’t know the answer.

Not analyzing the data separately assumes the smartphone survey mode has no response effects. That could be a bad assumption. We made that same assumption about telephone surveys, and we now know it is wrong. Could organizations be making incorrect decisions based upon data collected via smartphones? We don’t know, but it’s a good question to ask.
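To make that check concrete, here is a minimal sketch of the kind of statistical procedure mentioned above, assuming you can export respondent-level data with a device field and a 1-to-5 rating column. The file name, column names, and the choice of a rank-based test are my assumptions for illustration, not features of any particular survey tool.

```python
# Sketch: compare rating distributions by device type (assumed column names).
import pandas as pd
from scipy import stats

responses = pd.read_csv("survey_export.csv")  # hypothetical export

laptop = responses.loc[responses["device"] == "laptop", "q1_rating"].dropna()
phone = responses.loc[responses["device"] == "smartphone", "q1_rating"].dropna()

print("Laptop mean:", round(laptop.mean(), 2),
      "Smartphone mean:", round(phone.mean(), 2))

# Rating scales are ordinal, so a rank-based test is a reasonable choice here.
u_stat, p_value = stats.mannwhitneyu(laptop, phone, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.3f}")
```

Even a statistically significant difference only flags that something differs; as noted above, it cannot by itself separate a device (response) effect from a difference in the kinds of people who use each device.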

In summary, the smartphone is a less rich visual communication medium than a laptop, just as telephone interviews are less rich since they lack the visual presentation of information. If we’re allowing surveys to be taken on smartphones, we must write our surveys so that they “work” on mobile devices — basically the lowest common denominator.  Ideally, the survey tool vendors will develop ways to better render a survey webform for smartphone users, but there are clearly limits, and the above suggestions should be heeded.

Survey Question Design: Headlines or Meaningful Information?

Summary: Surveys can be designed to generate meaningful information or to manufacture a compelling “headline”. This article examines the forced-choice, binary-option survey question format and shows how an improper design can potentially misrepresent respondents’ views.

~ ~ ~

The Wall Street Journal and NBC News conduct periodic telephone surveys done by a collaboration of professional survey organizations. The poll results released on August 6, 2014 included a central question on the “Views of America” regarding individual opportunity. I say “central” since only a few of the 29 survey questions were reported in the paper. Here’s what they reported regarding one question:

A majority of those polled agreed with the statement that growing income inequality between the wealthy and everyone else “is undermining the idea that every American has the opportunity to move up to a better standard of living.”

That’s a pretty startling finding, and this is a bit of a dangerous article to write since it uses a public policy survey as the example. My purpose here is not to argue a point of view, but to show the impact of good question design — and bad. I’ll show how a mistake in survey question design can distort what your respondents actually feel. Or, to approach it from the opposite angle, the article shows how a survey can be used to generate a false headline with seemingly valid data.

~ ~ ~

Let me walk you through the progression of the survey so that you can properly experience this particular survey question. Question 14 asked, “I’d like to get your opinion about how things are going in some areas of our society today,” using a 5-point scale ranging from Very Satisfied to Very Dissatisfied, specifically:

  • “State of the US economy” — 64% were either Somewhat or Very Dissatisfied
  • “America’s role in the world” — 62% were either Somewhat or Very Dissatisfied
  • “The political system” — 79% were either Somewhat or Very Dissatisfied

The next two questions were each asked of half the respondents. “And, thinking about some other issues…”

Q15 Do you think America is in a state of decline, or do you feel that this is not the case?

— 60% believe the US is in a state of decline

Q16 Do you feel confident or not confident that life for our children’s generation will be better than it has been for us?

 — Only 21% “feel confident,” significantly lower than the trend line shown in the report

The next question presents two statements about America with the statements rotated as to which one is presented first to the respondents.

Q17 Which of the following statements about America comes closer to your view? (ROTATE)

Statement A: The United States is a country where anyone, regardless of their background, can work hard, succeed and be comfortable financially.

Let me pause here and ask you to think about that statement. It’s a motherhood-and-apple-pie statement about America. If you disagree with any of that statement, you are likely predisposed to agree with the subsequent statement. And note that the previous questions have elicited quite strong negative opinions. I’ll come back to that effect.

To continue with Question 17…

…or…

Statement B: The widening gap between the incomes of the wealthy and everyone else is undermining the idea that every American has the opportunity to move up to a better standard of living.

Results from Question 17 on Views of America. 54% agree that “Widening gap undermining opportunity”.  (That’s how the pollsters described it in their PDF summary.) 44% agree that “Anyone can succeed”. 1% of respondents volunteered that both statements were true, and 1% volunteered that neither was true. Those options were not offered overtly. For those readers with good addition skills, 2% of the respondents are unaccounted for. Perhaps that’s rounding, but the pollsters don’t say.

Sequence Effects upon Question 17. Unfortunately, the pollsters do not report the splits depending upon which statement was presented first. That split could be very enlightening about the sequence effect in that question’s design. That is, are people more likely to choose Statement B if it’s asked first or second?

We also have a sequence effect in play from the previous questions. Would the results be different if Question 17 had been asked before Question 14? I suspect so since the previous three questions put respondents into a response mode to answer negatively. Rotating the order of the questions might have made sense here if the goal is to get responses that truly reflect the views of the respondent. They also don’t report the splits for Question 17 based upon whether Question 15 or 16 was asked of the respondent immediately prior.
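If the pollsters ever release the splits by rotation order, a simple test of independence would answer that question. Here is a minimal sketch; the counts are invented placeholders, not the poll’s actual splits.

```python
# Sketch: does the Q17 choice depend on which statement was read first?
# The counts below are hypothetical placeholders.
from scipy.stats import chi2_contingency

#                 chose A  chose B
counts = [
    [230, 270],  # Statement A read first (hypothetical)
    [200, 300],  # Statement B read first (hypothetical)
]

chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
# A small p-value would indicate a sequence (rotation) effect on the choice.
```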


The Design of Survey Question 17. This question is an example of a binary or dichotomous choice question.  You present the respondents with contrasting statements and ask which one best matches their views. This format is also called a forced-choice question.

The power of the binary choice question lies in, well, the forcing of choice A or B. The surveyor is virtually guaranteed a majority viewpoint for one of the two choices!! Pretty slick, eh? Look again at what the Wall Street Journal reported:

A majority of those polled agreed with the statement that growing income inequality between the wealthy and everyone else “is undermining the idea that every American has the opportunity to move up to a better standard of living.”

What an impressive finding! Did they tell the reader that respondents were given only two choices? No. Not presenting the findings in the context of the question structure is at best sloppy reporting, at worst disingenuous, distortive reporting. In a moment we’ll consider other ways to measure opportunity in America in a survey, but first let’s look at the improper design of this binary question.

Proper Design of Binary Choice Questions. When using a binary choice question, the two options should be polar opposites of the phenomenon being measured. This is a critically important design consideration. In this question, the phenomenon is — supposedly — the state of opportunity in America today.

But note the difference in construction of the two statements.

  • Statement A says the American Dream is alive and vibrant.
  • Statement B says the American Dream has been “undermined” and presents a cause for the decline.

So, if you feel that opportunity has been lessened, you’re likely to choose Statement B even if you don’t feel income inequality is the cause. In the end you’re agreeing not only that opportunity has been reduced, but also to the cause of it.

The astute reader may say the question doesn’t directly assert a cause-and-effect relationship between income inequality and personal opportunity. It says “the idea” of equal opportunity has been “undermined.” This question wording is purposeful. It softens the assertion of a cause-and-effect relationship, making it easier to agree with the statement. Will those reading the findings, including the media, see that nuance? No. The fine distinction in the actual question will get lost in the headline. Just look at the shorthand description that the pollsters used: “Widening gap undermining opportunity”.

Alternative Survey Question Designs to Research Opportunity in America. The issue could have been researched differently. The pollsters could have posed each statement on a 1-to-5 or 1-to-10 Agreement scale and then compared the average scores. Those findings arguably would have been more meaningful since respondents wouldn’t be forced to choose just one. But would the findings have had the same pizzazz as saying “A majority of those polled…”?
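As a rough sketch of that alternative design, suppose each respondent rated both statements on a 1-to-5 agreement scale. The ratings below are invented for illustration, and a paired test is one reasonable way to compare the two sets of scores.

```python
# Sketch: rate both statements on a 1-to-5 agreement scale and compare means
# instead of forcing a single choice. Ratings are invented examples.
from statistics import mean
from scipy import stats

statement_a = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]  # hypothetical agreement ratings
statement_b = [3, 4, 4, 5, 3, 2, 4, 4, 3, 4]  # same respondents, same order

print("Mean agreement, Statement A:", mean(statement_a))
print("Mean agreement, Statement B:", mean(statement_b))

# Each respondent rated both statements, so a paired test is appropriate.
t_stat, p_value = stats.ttest_rel(statement_a, statement_b)
print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")
```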

Another design choice would be to present more than two options from which to choose. What if the statement options had been:

  • Statement A: The United States is a country where anyone, regardless of their background, can work hard, succeed and be comfortable financially.
  • Statement B: The widening gap between the incomes of the wealthy and everyone else is undermining the idea that every American has the opportunity to move up to a better standard of living.
  • Statement C: The country’s continued economic weakness is denying opportunity for a better standard of living even to those who apply themselves and work hard.

Without much trouble, we could develop even more statements that present equally valid and divergent views of opportunity in America. However, presenting more choices does cause a problem in survey administration.  For telephone surveys we are asking a lot, cognitively, of our respondents. They must listen to, memorize, and select from multiple statements. Each additional option increases the respondent burden, but could the telephone respondents have handled three choices? Probably, if the choices truly represent very different opinions as just presented.

A key advantage of paper or webform surveys is that the visual presentation of the questions allows for multiple options to be presented to the respondent without undue burden and with less likelihood of a primacy or recency effect — the tendency to choose the first or last option.

Something else happens when we present more than 2 choices. We’re much less likely to get a compelling headline that “A majority of those polled agreed…”

The Importance of Clear Research Objectives. Any survey project should start with a clear understanding of its research objectives. For this particular question the research objectives could be:

  • To see if people feel opportunity in America is weakening.
  • To identify the cause of weakening opportunity.

For this survey, that question’s research objective is really both — and neither.

Income inequality may be a legitimate topic for discussion, but is it the primary cause of the loss of opportunity to the extent that no other possible cause would be offered to respondents? I think you’d have to be a dyed-in-the-wool Occupy-Wall-Streeter to view income inequality as THE cause of any loss of individual opportunity. In fact, the Wall Street Journal‘s liberal columnist, William Galston, discussed “secular stagnation” in an article on August 26, 2014 without ever mentioning income inequality as a cause of our sideways economy.

How much different — and more useful — would the poll findings have been if the question had been presented this way?

Q17 Which of the following statements about America comes closer to your view? (ROTATE)

Statement A: The United States is a country where anyone, regardless of their background, can work hard, succeed and be comfortable financially.

…or…

Statement B: The opportunity for every American, regardless of their background, to move up to a better standard of living through their own hard work has weakened over the years.

(If respondent chooses Statement B. Check appropriate statement from list below.)
What do you see as the two primary reasons for the drop in individual opportunity?

Income inequality
Weak economy means fewer jobs available
Weak economy means fewer career advancement opportunities
Increase in part-time jobs
Poor educational system
Good jobs are outsourced overseas
Government regulations deter business growth and thus job growth
Can’t get business loans to start or expand a business
Opportunity goes to those with the right connections
Discrimination in career advancement whether racial, gender…
Etc.

(Note that in a telephone survey, we generally do not read a checklist to the respondent. Instead, we pose an open-ended question, and the interviewer then checks the voiced items. Reading a comprehensive list would be tiresome. A webform survey could present a checklist, but the list must be comprehensive, balanced, and unbiased.)

Do you think a majority — or even a plurality — of respondents who chose Statement B would say income inequality was a primary cause? I very much doubt it.

Wouldn’t this proposed question structure present a better picture of what Americans see as limiting opportunity, more fully answering the research objectives listed above?

But would the headline be as compelling? Pardon my cynicism, but the headline is probably the real research objective behind the question design.

~ ~ ~

Flawed Question Design in Organizational Surveys. Now you know why the forced choice, binary format is liked by those who want to generate data to argue a point of view. Is this restricted to political polls? Of course not. In an organizational survey measuring views of customers, employees, members or suppliers, we could accidentally — or purposely — word the choices to direct the respondent to one choice.

For example, imagine this binary-choice question for an internal survey about a company’s degree of customer focus.

Statement A: All our employees treat our customers in a manner to create and foster greater loyalty.

Statement B: Recent cuts in staff mean our customers are no longer treated by our employees in a way that will increase their loyalty.

Or consider a survey on member preferences for an association:

Statement A: The association’s programs provide real value to our organization.

Statement B: The content offered in the association’s programs doesn’t meet our requirements.

Notice how the question structure here parallels the question in the poll. A novice in survey design could stumble into the error in the above questions’ design. Or the person could want to “prove” that staff cuts are endangering customer loyalty or that content is the problem for members’ disaffection.

Any survey designer worth his salt can prove any point you want through a “professionally” designed and executed survey. And someone designing surveys for the first time needs to be aware of the importance of proper survey question design to generating meaningful, useful data.

Have You “Met” Mayor Menino? Lots have!

Sometimes even well-honed survey questions don’t measure the underlying attribute that the survey designers want. A recent poll by the Boston Globe about Mayor Menino’s popularity shows a clear example of this. The question attempted to objectively measure how many people the Mayor has met, but the results — when thought through to their logical conclusion — show that the question was really measuring something else.

Survey Design Tips — SwissCom Pocket Connect Survey

When my wife and I spent the Christmas 2012 holiday in Switzerland, we rented a pocket “MiFi” unit, SwissCom’s Pocket Connect, which gave us internet access wherever we traveled in Switzerland. (The product has since been discontinued.) Given the price it was a wonderful value, except that it did stop working. Support was lacking — though it was Christmas Day when the device stopped functioning — but in the end SwissCom did a good job of service recovery. I received a survey asking about my experiences, so I was interested in seeing if I’d have a chance to voice all my feelings, pro and con.

Below is my spontaneous, top-of-the-mind review of the Pocket Connect survey.

You can view the video directly on YouTube.

Comcast Chat Survey Review — Tips for Survey Design

I recently had a chat interaction with Comcast as I was trying to find the rates for international calls made from my office phone, which uses Comcast Voice. After the chat, I was asked to take a survey. In the YouTube video link below you will hear my review of the Comcast Chat survey.

You can view the video directly on YouTube.

HHS Hospital Quality Survey

Summary: The US Department of Health and Human Services (HHS) conducts a survey of patient experiences. The survey results are used to determine, in part, reimbursement by government to specific hospitals based upon the quality of care. But does the survey truly measure hospital quality? This article examines some of the administration biases and instrumentation biases that are present in the survey — to my surprise. In fact, the most important question has the most serious design shortcomings.

~ ~ ~

The front page article of the Wall Street Journal on October 15, 2012 reported on the role of a patient satisfaction survey used by the US Department of Health and Human Services (HHS) to measure hospital quality — “U.S. Ties Hospital Payments to Making Patients Happy.” This is not just a “nice to know” survey. The results are applied in a formula that determines the amount of reimbursement that hospitals get from HHS. Read here for a description of the survey.

In fact, the survey results comprise 30% of the evaluation. Not too surprisingly, hospitals whose reimbursements have been negatively impacted by the survey have questioned the appropriateness of using patient feedback to determine reimbursements. They have also questioned the methodology of the survey.

Given its impact, HHS treats the survey methodology with some care. Online, I found academic articles discussing the adjustments made to the results based upon the mode of survey administration — phone vs. web form. But upon looking at the survey instrument itself, I was surprised at some of the biases that appear to be part of the instrument and the administration process.

Let me first address administration biases.

  • Recall bias. The follow-up reminder occurs 3 weeks after the initial invitation. That’s a long time lag. Even those who respond initially to a postal mail survey will have some recall bias, but those who respond based on the reminder note will have serious recall bias. It’s very likely that impressions of a hospital stay will change over time. Were adjustments made for when people responded? I didn’t see that.
  • Lack of confidentiality. Very surprisingly to me, the cover letter contains no statement of anonymity or confidentiality. Indirectly, they do tell respondents that their submissions are not anonymous. The survey, and optionally the cover letter, states: “You may notice a number on the survey. This number is used to let us know if you returned your survey so we don’t have to send you reminders.”


Clearly, there’s no anonymity. I say “optionally” because the vendors approved to conduct the survey on behalf of hospitals apparently have some leeway in what is included in the cover letter. That fact struck me as odd since it affects the respondent’s mental frame when approaching the survey.

But more importantly to me, there’s no statement of confidentiality. We are told, “Your participation is voluntary and will not affect your health benefits” and “Your answers may be shared with the hospital for purposes of quality improvement.” Those are pretty thin statements of confidentiality. Who will get access to individual submissions? This is never fully spelled out, nor whether “your answers” that are “shared with the hospital” will include the respondent’s — the patient’s! — name. Yikes!

In the age of HIPAA (the Health Insurance Portability and Accountability Act), I found this most odd. I do not believe I would complete this survey due to the response bias that this lack of confidentiality would create. While my “participation” may “not affect [my] health benefits,” it might affect how the hospital and staff treat me the next time I’m at the hospital!

Consider the rather vulnerable position in which patients find themselves at a hospital. Basically, you’re being asked to be a whistleblower — assuming you have negative comments — with no protection. Granted, it might be far-fetched to think that a hospital employee who got called out from a survey would mistreat the patient who gave a bad review, but if you’re in a vulnerable health situation, would you take the chance?

This response bias may lead people to not participate, and it may also color how they respond if they do complete the survey.

~ ~ ~

Now let’s turn to the instrumentation biases in the survey instrument itself.

The most puzzling question to me is the question that probably gets most used by HHS — Question 21. The last section for feedback is about your “Overall Rating of Hospital.” The instructions are quite explicit.

Do not include any other hospital stays in your answer.

Since this is a transactional survey, I understand the need to have the respondent focused on this one event. But Question 21 then asks for a comparative answer.

Using a number from 0 to 10, where 0 is the worst hospital possible and 10 is the best hospital possible, what number would you use to rate this hospital during your stay?

Please tell me how you can answer that question without thinking about “other hospital stays.” You can’t. The question is asking for a comparison, and the instructions tell you that you shouldn’t compare!

I am frankly stunned that this type of wording error could be in a survey question that is so critical to so many organizations in our nation.

Further, there’s no option to indicate you cannot make a comparison, while other less critical questions allow the respondent to indicate the question is not appropriate or applicable to them.

Consider my situation. I have not had an overnight hospital stay since I had my tonsils out when I was 5 — many, many decades ago. I have been fortunate that any medical issues have only required outpatient procedures.

How do I answer the comparative question? I can’t without simply fabricating a benchmark based on general perceptions. How valid are my data? How would you like to be a hospital president whose reimbursement is reduced based upon such data?

Even if I had had other hospital stays, wouldn’t it be analytically useful to know how many stays I’ve had in the past X years that form my comparison group? In fact, shouldn’t the comparison group be hospital stays in the past few years? (Maybe they have all that data in some database on all of us — a scary thought, but not out of the question in these days of “command and control” central authorities.)

For a question that is so critical to the overall assessment, it has serious shortcomings.

The survey instrument has other design aspects that don’t thrill me.

  • Almost all the questions are posed on a frequency scale ranging from Never to Always. Use of this scale presumes that the characteristic being measured happens — or could have happened — numerous times.
  • Questions 12 to 14 ask about controlling pain with medication. Question 13 again uses the “how often” phrasing — “how often was your pain well controlled?” — implying pain is a discrete event, not a continuous one. Question 14 asks whether “the hospital staff [did] everything they could to help you with your pain?” Yet, as noted in the Wall Street Journal article, proper pain management does not mean providing any and all medication to relieve pain. The question conflicts with proper medical practice.
  • Questions 1 and 5 ask if the nurses and doctors, respectively, treated you “with courtesy and respect.” Those two attributes are very similar, but there is a difference, which is somewhat rooted in the patient’s cultural background. To some extent this is a double-barreled question, which clouds the ability to answer the question and to interpret the results.
  • Question 4 asks, “after you pressed the call button, how often did you get help as soon as you wanted it?” (my emphasis) First, the question assumes that everyone knows what the “call button” is, which is probably safe, but “call button” is hospital jargon. We should avoid industry jargon in surveys since it can introduce ambiguity. More importantly, there’s an implication in the wording that the call button system has the ability to differentiate the priority of the patient’s issue. Every call has to be answered as if it’s critical, but less critical calls could impede the ability to respond to more critical ones. They could have asked for a quasi-objective answer — “after you pressed the call button, how long did it take for someone to respond,” recognizing that people always overestimate wait times, especially in urgent situations. But instead they have introduced real subjectivity with the phrasing.
  • Question 15 asks if you were “given any medicine that you had not taken before?” The response options are Yes and No. The missing option is Not Sure. Nor does the question ask how many new medicines were taken. This distinction is important for Question 16, which asks on that frequency scale, “how often did hospital staff tell you what the medicine was for.” What if you took only one new medicine? The response then has to be either Never or Always. Again, for analytical purposes the count of new medicines is important.
  • Question 8 asks, “how often were your room and bathroom kept clean?” The use of the lead-in phrase “how often” implies discrete events, which is more appropriate for this alternative question wording: “how often were your room and bathroom cleaned?”

Admittedly, some of these are minor issues, but many are not. In such an important survey, shouldn’t the instrument have as clean a design as HHS expects of patients’ “rooms and bathrooms”?

~ ~ ~

Finally, in a delightful bit of bureaucratic irony, the survey must include a long paragraph to fulfill the requirements of the “OMB Paperwork Reduction Act of 1995”. This verbiage may require the printing of an extra page, increasing the paperwork!

In fact, the designers of the survey violated the goals of this Act with their choice of instrument layout, which increases the physical length of the survey. Had the designers used a “table” format, somewhat as HHS did here in describing the survey, the survey’s physical length could easily have been cut in half or more and the survey could have been completed more quickly. Now that’s paperwork reduction.

But more importantly to the validity of the survey results, isn’t the language of this required statement a bit intimidating? This “collection of information” instrument must have a “valid OMB control number.” If you’re unfamiliar with OMB and bureaucratic language, this phrasing could scare people off from taking the survey. I’d be curious to know if the impact of the statement upon the survey responses has been tested.

Oh, yes. “What’s OMB?” you ask. Office of Management and Budget, of course. The acronym is never written out for the respondents, introducing yet more ambiguity, and maybe intimidation. Apparently, we’re all supposed to know the government’s acronym dictionary. How silly. How presumptuous. How damaging.

Money Grows on Trees — If You Believe the Polls

Summary: Political polls — as well as organizational surveys — many times present conflicting results within a poll. The reason is that the surveys have not been designed to force respondents to engage in trade-offs among conflicting options. We see this in the New York Times, CBS News poll of swing states released on August 23, 2012 where the poll indicates that respondents want to keep Medicare as we know it yet spend less on it. Clearly, something is amiss.

~ ~ ~

The NY Times, CBS News and Quinnipiac University released a poll of swing states (FL, OH, WI) on August 23, 2012. The key finding from the headline, “Obama Is Given Trust Over Medicare,” was summarized as:

Roughly 6 in 10 likely voters in each state want Medicare to continue providing health insurance to older Americans the way it does today; fewer than a third of those polled said Medicare should be changed in the future to a system in which the government gives the elderly fixed amounts of money to buy health insurance or Medicare insurance, as Mr. Romney has proposed. And Medicare is widely seen as a good value: about three-quarters of the likely voters in each state said the benefits of Medicare are worth the cost to taxpayers.

But here’s the question posed from the details of the survey results, which thankfully the NY Times does publish:

35. Which of these two descriptions comes closer to your view of what Medicare should look like for people who are now under 55 who would be eligible for Medicare coverage in about ten years? Medicare should continue as it is today, with the government providing seniors with health insurance, OR, Medicare should be changed to a system in which the government would provide seniors with a fixed amount of money toward purchasing private health insurance or Medicare insurance. (Answer choices rotated)

Just over 60% wanted to continue Medicare as is, and about 30% said they supported changing the system.

Now, look at the results for the next question:

36. To reduce the federal budget deficit, would you support major reductions, minor reductions, or no reductions to spending on Medicare?

 Almost 60% of respondents supported major or minor reductions in Medicare (roughly 11% Major, 48% Minor).

The Times inexplicably doesn’t report this latter finding from their survey. In fact, the headline for the article could easily have been, “Strong Majority Favor Reductions in Medicare Spending.”

But how can 60% support keeping Medicare as is yet the same percentage support spending reductions? The survey design did not force respondents to make trade-offs among competing alternatives, and these conflicting results show why forcing respondents to make trade-offs is so important. Forced trade-offs eliminate the money-grows-on-trees responses we see here. When reviewing poll findings, I frequently find such conflicting results — and only selected results are reported in the write-up.
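A respondent-level cross-tabulation makes this kind of contradiction visible. The sketch below uses invented data and column names; the interesting cells are the respondents who want Medicare kept as is yet also support spending reductions.

```python
# Sketch: cross-tabulate "keep Medicare as is" (Q35) against "reduce Medicare
# spending" (Q36). The data frame and column names are hypothetical.
import pandas as pd

responses = pd.DataFrame({
    "q35_medicare": ["keep", "keep", "change", "keep", "keep", "change"],
    "q36_spending": ["minor cuts", "no cuts", "major cuts",
                     "minor cuts", "no cuts", "minor cuts"],
})

# Row percentages show how many "keep as is" respondents also want cuts.
table = pd.crosstab(responses["q35_medicare"], responses["q36_spending"],
                    normalize="index") * 100
print(table.round(1))
```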

Perhaps more puzzling is that the question as phrased is not grounded in how normal people think, that is, people who live outside of the Washington DC beltway. No one is proposing that Medicare spending should be reduced. At issue is the rate of growth in Medicare spending. David Wessel of the Wall Street Journal, in summarizing the Congressional Budget Office analysis, says that Ryan proposes Medicare be 3.5% of our Gross Domestic Product (GDP), which is the total output of our economy, in 10 years versus 4% of GDP if the program stays as is. Currently, Medicare consumes 3.25% of GDP. With the expected growth in GDP, even under a Ryan plan Medicare spending is increasing.

Reducing spending on Medicare could be interpreted as:

  • Reducing per capita spending on each Medicare recipient
  • Reducing the overall spending on Medicare, that is, the total spent each year
  • Reducing Medicare spending as a percentage of GDP
  • and maybe some I’m not thinking of!

How did you interpret the phrasing in Question 36? Since the leading phrase in the question was “to reduce the federal budget deficit” my educated guess is that the second option above is what most people were thinking. That’s the only option that would actually “reduce” the deficit — as opposed to slowing the growth of the deficit.

Regardless, with such ambiguous phrasing, it’s near impossible to interpret the results except that 60% support some kind of reduction, a position that is incompatible with keeping Medicare “as it is today.”

My conclusion is that this phrasing shows how rooted the poll designers are in Washingtonian logic. Only in Washington is a slowing of growth rates in spending, even on a per capita basis, considered a “reduction.” Imagine the polling results if they had presented it accurately.

~ ~ ~

Another interesting element in the questionnaire design can be found in the question immediately preceding the Medicare change question:

34. Overall, do you think the benefits from Medicare are worth the cost of the program for taxpayers, or are they not worth the cost?

The poll found roughly consistent results for the three states, with respondents feeling by a 75%-16% margin that Medicare is worth the cost. That question primes the respondent to see Medicare as we know it as a good thing going into the next question about making changes to the program.

We should also note that Question 35 does not present the proper choices to the respondent. Congressman Ryan’s 2011 plan did call for offering only premium support to those currently under 55 when they reach Medicare eligibility. However, the 2012 Ryan plan offers the choice of premium support or staying in traditional Medicare. In other words, the poll did not test the actual choice offered between the two campaigns, even though that is how the Times has pitched the results of the poll.

Further, while the headline is that “Obama Is Given Trust Over Medicare,” the poll has mixed results. While Obama is trusted more to handle Medicare by a 51%-42% margin, more people strongly disapprove of ObamaCare than strongly approve.

Perhaps the most startling result in the poll — and not reported by the Times — was the seismic shift in the Florida senatorial race. In the Times‘ late July poll, Democrat Bill Nelson led Republican Connie Mack 47%-40% while in this poll, Mack led 50%-41%.

An Example of the Impact of Question Sequencing

Summary: The New York Times and CBS News released a nationwide poll on July 19, 2012 that conveniently ignores the impact of question sequencing and presents questionable interpretations of the data. The example shows why consumers of survey data should always examine the methodology of the survey, especially the design of the survey instrument.

~ ~ ~

In a related article I looked at some polling done by the New York Times, CBS News, and Quinnipiac University. In this article, I’ll turn to a nationwide poll that the Times and CBS News released on July 19, 2012. It shares many of the questions that the state-focused polls do, and it’s a horribly long survey at about 100 questions. My focus here is on the impact of question sequencing and how the reporters summarized and presented the findings. Again we see why you should always examine the survey instrument and the methodology of the surveyor before believing the survey’s findings — especially as presented.

About two-thirds of the way through this long survey, after a series of issue questions, Question 41 asked:

41. Looking back, how much do you think the economic policies of George W. Bush contributed to the nation’s economic downturn — a lot, some, not much, or not at all?

I ask you, the reader, to think about your “mental frame” as you consider that question. In other words, what are you thinking about? To achieve a valid questionnaire, every respondent should have the same interpretation of the survey questions. So, for this question to be valid we should all have similar interpretations — and the person who summarizes the results should also share that interpretation.

I think it’s fair to say that most people would be thinking about how much they blame the recession of 2008-09 on the Bush policies. That’s when the “economic downturn” occurred, and the authors of the survey have asked you, the respondent, to “look back.”

The results of that question were:

a lot — 48%
some — 33%
not much — 12%
not at all — 6%
don’t know — 2%

Here is how those results were presented in the New York Times article; it was the closing thought of the piece.

Nearly half of voters say the current economic plight stems from the policies of Mr. Obama’s predecessor, George W. Bush, which most voters expect Mr. Romney would return to. (emphasis added)

Question 41 did not ask about the “current economic plight.” When you read “the nation’s economic downturn” in question 41 were you thinking of the “current economic plight?” I doubt it. (Economic growth is miserably anemic as I write this in August 2012, and the economic tea leaves are not pointing up, but currently available data do not have us in a “downturn.”) Granted, the question does not have a specific timeframe, so the authors can get away with this interpretation. I guess.

Question 42 repeated the previous question but asked about President Obama.

42. Looking back, how much do you think the economic policies of Barack Obama contributed to the nation’s economic downturn — a lot, some, not much, or not at all?

The results of that question were:

a lot — 34%
some — 30%
not much — 23%
not at all — 12%
don’t know — 1%

The reporters didn’t see fit to report these results in the article. More interesting to me as a survey designer is that Questions 41 and 42 were rotated. I would love to see the results broken out based upon which question was asked first, but the Times does not provide that detail.

Clearly, there is a sequencing effect in play.

If you were asked first about Obama’s impact on the “economic downturn,” you are certainly thinking more near term. It is doubtful that people were considering the 2008-09 recession as Obama’s fault (except maybe for those real political wonks who know of Senator Obama’s votes protecting Fannie Mae and Freddie Mac from proposed deeper regulatory oversight, but even then the impact would be minimal).

So hearing the question about Obama’s impact on the “economic downturn” has set a more near-term mental frame. Now you are asked about Bush’s impact on the “economic downturn.” Are you thinking about the 2008-09 recession? Certainly not as much as if the Bush question were asked first. I think it’s fair to say that people blame Bush far less for today’s economy than the economy of 2008-09.

To summarize, I am sure the scores for questions 41 and 42 varied significantly depending upon which one was asked first. If we were only told the splits…
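If the splits were published, a check as simple as the following sketch would show the sequencing effect directly; the respondent-level data frame and its column names are invented placeholders.

```python
# Sketch: compare the share answering "a lot" on Q41 by which question was
# asked first. Data and column names are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "asked_first":    ["bush", "bush", "obama", "obama", "bush", "obama"],
    "q41_bush_a_lot": [1, 1, 0, 1, 0, 0],  # 1 = answered "a lot"
})

# Percentage answering "a lot" within each ordering group.
split = df.groupby("asked_first")["q41_bush_a_lot"].mean() * 100
print(split.round(1))
# A large gap between the two groups would confirm the sequencing effect.
```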

The proper, unbiased phrasing for the question would be,

Thinking about the current state of the economy, to what extent do you consider [Bush/Obama] to blame for the economic problems our country currently faces?

That in fact is how the writers of the article in the Times present the question, but that’s not the question that was asked. Far from it.

~ ~ ~

Now let’s look at the last phrase of the Times summary.

Nearly half of voters say the current economic plight stems from the policies of Mr. Obama’s predecessor, George W. Bush, which most voters expect Mr. Romney would return to.

According to the polling data, do “most voters expect Mr. Romney would return to” President Bush’s policies? This finding is based on question 57:

57. If elected, how closely do you think Mitt Romney would follow the economic policies of George W. Bush — very closely, somewhat closely, or not too closely or not at all closely?

The results were:

very closely — 19%
somewhat closely — 46%
not too closely — 18%
not at all closely — 7%
don’t know — %

We can debate until the cows come home and the keg runs dry about the interpretation of “somewhat closely.” But perhaps more importantly, the survey treats “economic policies” with one broad brush. Some of those policies led to the “economic downturn,” but other policies most assuredly did not.

Further, some of the respondents who believe Mr. Romney “would return to” Bush policies may not have responded in Question 41 that they thought those policies “contributed to the economic downturn.” You cannot legitimately make the statement that the authors did linking the results of Questions 41 and 57 without segmenting the data and analyzing it properly. But they did.
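To legitimately link the two questions, you would need the joint figure from respondent-level data, something like this sketch; the file, column names, and response codes are my own placeholders.

```python
# Sketch: what share of respondents BOTH blame Bush's policies "a lot" (Q41)
# AND expect Romney to follow those policies closely (Q57)?
# The file and column names are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("poll_respondent_level.csv")

blames_bush = df["q41_bush_contributed"] == "a lot"
expects_return = df["q57_romney_follow_bush"].isin(["very closely",
                                                    "somewhat closely"])

share = (blames_bush & expects_return).mean() * 100
print(f"{share:.1f}% hold both views")
# Only a joint figure like this, not two separate marginals, supports the
# article's closing claim.
```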

~ ~ ~

Bottom line. The closing statement of the New York Times article distorts what the survey data actually said, due to sequencing effects and a convenient reinterpretation of the question. The Times is making it sound as if the polling supports the contention that voters still hold Bush responsible for the current weak economy. That may be true, but these polling data, properly analyzed, do not support that contention.

Caveat Survey Dolor: “Show Me the Questionnaire”

Summary: “Show me the Carfax” is one of those lines from a TV ad that frankly gets annoying after a while. My version of it is “Show me the survey instrument.” I annoy some organizations when I ask to see the survey instrument before I’ll even contemplate the findings derived from the survey. To most people, examining the instrument would seem an unnecessary annoyance. In this article I will show you why you should always go to the source and verify the validity of the data generated by the survey instrument.

In fact, I had a long string of emails with a local-to-me company that published some survey findings that got national attention. I wanted to see how they presented certain terminology to respondents that I suspected would bias how people took the survey. They declined to show me the instrument with a very lame excuse. I even told them I would help them with future survey projects in exchange for the publicity. But I guess their reasoning is: why let sound research get in the way of a good headline?

~ ~ ~

We’re in the political silly season in this summer of 2012 with polls coming out almost daily. Should you believe the summaries presented by newscasters or newspaper writers are true to the data collected? Should you believe the data collected are accurate? We see major differences across polls, so these are legitimate questions. While we can’t do a full audit of the polling processes, we can look, perhaps, at the survey instruments used.

In this article I examine a poll conducted by the New York Times, CBS News, and Quinnipiac University. Let me state right up front that I am pointing out the shortcomings of a survey done by two liberal news outlets. (Yes, my dear Pollyanna, the New York Times has a liberal bias. Shocking, I know.) I suspect if I dug into a conservative news outlet’s survey, I would find questionable distortions, though in ones I have examined, I have not seen validity issues with questions like the ones below.

On August 1 and 8, 2012 the New York Times published polls of six battleground states for the November election: Florida, Ohio, Pennsylvania, Virginia, Colorado, and Wisconsin. To their credit, the paper does provide access to the actual survey script used for the telephone survey and summary results by question. Most of the major polls make their survey language available. Those that don’t are probably hiding sloppy instrument designs — or worse.

The survey scripts appear identical for questions posed at the national level. However, they did change their definition of the relevant population or sampling frame from the first batch of surveys to the second batch. For Florida, Ohio, and Pennsylvania they only report results for “likely voters,” whereas the Virginia, Colorado, and Wisconsin surveys reported results for some questions that included registered but not likely voters and some that included non-registered respondents. See why it can be hard to do comparisons across surveys — and these surveys were done by the same organizations!

Much has been made of the fact that these pollsters oversampled Democrats. (That is, the self-reported affiliation of respondents as Republicans, Democrats, and independents had Democrats in greater proportions than in the registered voter base.) We can also look at the sequencing of questions and ask whether it creates a predisposition to answer subsequent questions a certain way. But here I want to focus on two questions that clearly show how the pollsters’ world views affected the questions they asked.

Question 19 reads as follows:

19. From what you have read or heard, does Mitt Romney have the right kind of business experience to get the economy creating jobs again or is Romney’s kind of business experience too focused on making profits?

The pollsters present the false dichotomy of business experience as focusing on either jobs or profits — a favorite theme of some. Businesses do not choose either jobs or profits. Jobs result from profitably run businesses. The question displays an incredible lack of understanding of how businesses function — or perhaps it was purposeful. In a similar vein, we have heard that corporations are sitting on a pile of cash and are “refusing” to hire people.

~ ~ ~

The next question in the survey is:

20. Which comes closest to your view of Barack Obama’s economic policies:

   1. They are improving the economy now, and will probably continue to do so,
   2. They have not improved the economy yet, but will if given more time, OR
   3. They are not improving the economy and probably never will.

Notice what’s missing? 1% of respondents in Florida and Colorado did. The pollsters didn’t offer choice 4: “Obama’s economic policies are hurting the economy.” 1% in Florida and Colorado apparently took the initiative to voice that option, and to the pollsters’ credit they captured it.

Isn’t it legitimate for some people to believe that the president’s economic policies are hurting the economy? Apparently, not to these pollsters. They only think that Obama’s economic policies can help the economy or be benign. Yet, rational people can certainly feel that regulations, promised tax policies, and the uncertainty of Obama’s temporary fiscal and economic policies are hurting the economy.

The pollsters only provided neutral to positive response options with no negative options. A basic requirement of a well-designed question is that it provides the respondent a reasonably balanced set of response options. This is not a mistake a seasoned survey designer would make.

Another problem with the question is that “economic policies” covers a very broad area that is open to multiple interpretations by respondents — and manipulation by the writer of the findings. The pollsters would have generated more valuable, interesting, and valid data if they had structured their question as:

Consider each of the following areas of Barack Obama’s economic policies. What impact do you feel each has had upon the economy now and in the future? Greatly helped, Helped somewhat, No impact yet but will, No impact now or in the future, Hurt somewhat, Greatly hurt.

— Policy 1
— Policy 2, etc.

~ ~ ~

Is the purpose of the polls performed by major news organizations

  1. to understand the feelings of the populace
  2. to drive those opinions, or
  3. to generate data that certain, preferred candidates can use to their advantage in the campaign?

Looking at these two questions — as well as phrasing in a July 19 poll — it’s hard to say the first, which should be the goal of responsible, unbiased researchers.

In summary, these two questions show that these pollsters bring bias to their polling. Always look at the survey instrument to sense if there’s bias in the wording and fairness in interpreting the data before accepting the findings. This caveat applies to political polls as well as organizational surveys.

~ ~ ~

So why does a business hire (or lay off) someone?

A business hires someone if they feel the long-run value delivered to the organization will exceed the fully loaded cost of employing the person. It’s really that fundamental. While it’s unlikely a company can measure the direct value to the bottom line of a single employee or even a group — except perhaps for the sales force — that is what companies decide in the budgeting process. If the cost of employment exceeds the benefit, bottom line profit decreases. Why would a company hire people if the value they bring doesn’t exceed their cost?

The counterargument may be made that companies fire people to increase profits. It is true that laying off people may increase bottom-line profit, at least in the short run. (Google, not a politically conservative company at all, laid off many at its Motorola Mobility acquisition.) If the people being laid off had costs that exceeded their benefit, yes, profit will increase. But keeping people on the payroll just for the sake of “employment” can hurt those who deliver positive value to the company.

I worked for Digital Equipment Corporation in the 1980s. The company was on top of the world in 1987 when it employed more than 120,000 people worldwide. When senior management missed the changes in the competitive market, the company still resisted layoffs until its financial health was threatened. Within a decade Digital no longer existed, with tens of thousands of job losses that greatly affected the Boston technology beltway for years to come.

More recently, look at the US car companies that employed people who literally did nothing in their “job banks.” Did that lack of focus on profit advance the bankruptcies? Most certainly.

No one’s business experience is focused on creating jobs. Entrepreneurs and business people want to build sustainable businesses by creating products and services people choose to buy. Jobs are a by-product, albeit a very important by-product, of a successful, profitable company.

Go to a thousand company websites and read their mission statements, preferably small growing companies that may not yet be profitable but are our job-creation engines. How many companies say their primary mission is to create jobs? I doubt you’ll find one.

Here’s the empirical proof that we all know. Ever heard of an established, unprofitable company that is hiring lots of people?

I recognize that was a bit of a rant, but as a business school professor I believe this idea of “jobs versus profits” needs to be challenged for the misrepresentation that it is, and it is disturbing to find it in a survey done by a professional organization.