Collecting customer feedback in the form of surveys is the bread and butter of every company. But often the analysis of it leaves a lot to be desired. From not preparing data correctly to jumping to conclusions based on statistically insignificant data, a lot can go wrong. Thankfully, there’s lots you can do to make a big jump in the quality of your survey analysis.
This post shows you how to improve your survey analysis in four ways:
1) Ask the right (open-ended) questions to get valid and useful data
2) Properly prepare your data for analysis
3) Dig further into your data with techniques like Sentiment Analysis or Driver Analysis
4) Visualize your data for even more clarity
Ask Open-ended Questions
For decades, market researchers have faced a quandary. On the one hand, it’s important to maximise the response and completion rates of your survey and to ask as many questions as possible. On the other hand, customers and users (specifically, 80% of them) will jump ship if your survey is too long and complicated. The two can be a pretty tough balancing act.
That’s why so many researchers have come to rely almost exclusively on the Net Promoter Score (NPS). However, questions about the likelihood to revisit, repeat purchases, or even product reuse can be just as good (if not better) predictors of customer loyalty, depending on your product or industry. So, what do you do?
Ladder your questions to build a more comprehensive picture.
It’s simple – do not, under any circumstances, rely on asking only a single closed question that delivers a numeric score. It sounds obvious, right? But you’d be surprised at how many companies build their insights on a single number like their NPS.
After all, one-question surveys like the NPS are tempting. They’re easy to ask, even easier to calculate a score from, and they have high completion rates. For frontline managers, they deliver a metric they can act on fast, rather than waiting months for customer feedback that is outdated by the time it gets to them. But relying on a single metric would be a mistake.
Instead, always ladder your questions to build context around a score. Laddering accomplishes a neat little psychological trick – because the respondent is asked to elaborate on a question they’ve already answered, the follow-up doesn’t feel as burdensome as being tasked with two separate questions.
It can be as easy as asking a second closed question that dives a bit deeper into specifics. Consider the following example based on a fictional telecommunications company, QBC:
1) How would you rate your satisfaction with QBC?
2) How would you rate the following from QBC:
– Phone reception;
– Internet connection and speed;
– Cost;
– Customer service.
This would give you the data to help determine which aspects of QBC’s service influence overall satisfaction.
However, even more important is asking an open-ended question after a closed question to provide context for any survey question that produces a rating or a score. The purpose is to determine the “why” behind the score.
For example, if a customer gives an NPS of 6, you may ask as your follow-up, “What prompted you to give us a 6?” or “What is the most important reason for your score?” There’s also always the underrated “What else would you like us to know?” to finish off a survey.
Once you’ve supplemented your initial score with follow-up answers, you can clean and code the data to make your survey analysis far more powerful.
Prepare your data for analysis
One of the most important components of data processing is quality assurance. Without clean data, your results will be invalid.
Data preparation has two main components: data cleaning and data coding.
Cleaning messy data involves:
– Identifying outliers;
– Deleting duplicate records;
– Identifying contradictory, invalid, or dodgy responses.
Two types of respondents often mess up your data: speedsters and flatliners. They can be especially problematic when there is a reward for completing your survey.
Speedsters. Speedsters are respondents who complete the survey in a fraction of the time it should have taken them, so it’s highly unlikely that they read or answered the questions properly. Identifying speedsters is a relatively simple affair. Set a time you would expect a respondent to take to complete the survey (or even a section of it), and then remove any respondents who deviate significantly from that time. Nowadays, it is generally an industry standard to remove any respondent who completes the survey in less than one-third of the median time.
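In practice, that one-third-of-median rule takes only a few lines of code. Here is a minimal Python sketch; the `duration_secs` field name is a hypothetical stand-in for whatever your survey platform records:

```python
from statistics import median

def remove_speedsters(responses, time_key="duration_secs", cutoff_ratio=1 / 3):
    """Drop respondents who finished faster than cutoff_ratio times the
    median completion time (the one-third-of-median rule of thumb)."""
    threshold = median(r[time_key] for r in responses) * cutoff_ratio
    return [r for r in responses if r[time_key] >= threshold]
```

You can run the same check per section rather than for the whole survey by passing in section-level timings instead.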
Flatliners. Flatlining, sometimes called straight-lining, happens when a respondent picks the exact same answer for every item in a series of rating or grid questions. A series of rating questions looks like, “On a scale of 1–5, where 1 means ‘not satisfied’ and 5 means ‘extremely satisfied,’ how would you rate each of the following options?” Flatliners mess up survey data by marking the same response for every single item.
You may want to design your survey to try to catch flatliners. You can do so by asking contradictory questions or including a form of front/back validation, which asks the same question twice but reverses the order of the options.
For example, you may ask respondents to select their age bracket twice but place the age brackets in a different order for consecutive questions. Doing so, however, will make your survey longer and more confusing to actually attentive respondents, so beware of the risk to your completion rates.
Removing flatliners can be a judgement call. How important was the series of questions they flatlined on? Did it determine what other questions they were asked? If they flatline for three or four consecutive responses, are their other responses still valid?
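If you decide to screen for straight-lining in the data itself, the check is simple. A minimal sketch, assuming each respondent’s grid answers are collected in a list:

```python
def is_flatliner(grid_answers, min_items=4):
    """Flag a respondent who gave the identical rating to every item in a
    grid of at least min_items questions; shorter grids are too easy to
    answer identically by honest accident."""
    return len(grid_answers) >= min_items and len(set(grid_answers)) == 1
```

Whether a flagged respondent is actually removed remains the judgement call described above.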
Practical considerations can help you decide whether to keep or remove respondents. For example, if you intend to survey 1000 people but end up 50 over quota, you can be more aggressive in removing respondents.
There are other quality-control measures you can implement in your survey design or check during data cleaning.
- Make open-ended questions mandatory. If a respondent provides gibberish answers (random letters or numbers, etc.), you can review their other answers to decide whether to remove them.
- Put in a red-herring question to catch speedsters, flatliners, and other inattentive respondents. Include obviously fake brands or fake products in an answer list. (One note of caution: Make sure that these fake items don’t have similar spellings to existing brands.) Flag respondents who select two or more fake items.
Once you have clean data, you can start coding: manually for smaller datasets, programmatically for large ones.
Coding open-ended text data
We’ve established that you want to be asking open-ended questions. But asking open-ended questions means dealing with open-ended answers. The traditional method of dealing with open-ended feedback is to code it manually. This involves reading through some of the responses (e.g. 200 randomly selected responses) and using your own (subjective!) judgment to identify categories.
For example, if a question asks about attitudes toward a brand’s image, some of the categories may be:
1) Fun;
2) Value for money;
3) Innovative, etc.
This list of categories and the numerical codes assigned to them is known as a code frame. After you’ve created your code frame, you’ll need to manually read each response and match them to an assigned value.
For example, if someone said, “I think the brand is really fun,” that response would be assigned a code of 1 (“Fun”). The responses can be assigned one value (single response) or multiple values (multiple responses).
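Simple keyword matching can automate part of this. The code frame below reuses the categories above, but the keyword lists are hypothetical, illustrative stand-ins; real coding needs richer rules or human review:

```python
# Code frame: category label -> numeric code (from the example above)
CODE_FRAME = {"Fun": 1, "Value for money": 2, "Innovative": 3}

# Hypothetical keyword lists used to match responses to categories
KEYWORDS = {
    "Fun": ["fun", "enjoyable"],
    "Value for money": ["value", "cheap", "worth"],
    "Innovative": ["innovative", "cutting-edge"],
}

def code_response(text):
    """Return the numeric codes whose keywords appear in the response
    (multiple-response coding: one answer can get several codes)."""
    text = text.lower()
    return [CODE_FRAME[cat] for cat, words in KEYWORDS.items()
            if any(w in text for w in words)]
```

A response that mentions several themes simply receives several codes.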
You’ll end up with a dataset that looks like the following:
If your dataset is too big to code manually, there’s another option: sentiment analysis.
Sentiment analysis to code big datasets
Sentiment analysis is a method of text analytics for coding open-ended feedback that can be done either manually or computationally. When done computationally, an algorithm counts the number of positive and negative words that appear in a response and then subtracts the negative count from the positive to generate a sentiment score:
In the above examples of sentiment analysis, the top response would generate +2, while the bottom would generate -2.
While sentiment analysis seems simple and has some advantages, it also has limitations. There will always be a degree of noise and error. Sentiment analysis algorithms struggle with sarcasm and often are poor interpreters of meaning (for now, at least).
For example, it’s difficult to train a computer to correctly interpret a response like “I love Apple. NOT!” as negative. But if the alternative is trawling through thousands of responses, the trade-off is obvious.
Not only is sentiment analysis much faster than manual coding; it’s cheaper, too. It also means that you can quickly identify and filter for responses with extreme sentiment scores (e.g. -5) to better understand why someone had a strong reaction.
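The word-counting approach described above can be sketched in a few lines of Python. The positive and negative word sets here are tiny, illustrative stand-ins for a real sentiment lexicon:

```python
# Hypothetical mini-lexicon; real sentiment lexicons run to thousands of words
POSITIVE = {"love", "great", "fast", "helpful", "good"}
NEGATIVE = {"hate", "slow", "bad", "expensive", "poor"}

def sentiment_score(text):
    """Positive word count minus negative word count for one response."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg
```

Note how easily this scorer is fooled by sarcasm or negation (“I love Apple. NOT!”), which is exactly the limitation discussed above.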
There are a few keys to make sentiment analysis coding work:
- Ask shorter, direct questions to solicit an emotion or opinion without leading respondents. For example, replacing “What do you think about Apple?” with “How do you feel about Apple?” or “When you think about Apple, what words come up?” The latter questions are less likely to generate ambivalent, tough-to-interpret responses.
- Avoid using sentiment analysis on follow-up questions. For example, responses to “Why did you give us a low score?” are not suitable for sentiment analysis because asking people who you know already have a negative attitude towards your brand will naturally skew your results.
- Be wary of multiple opinions in a single response. A question that touches on several aspects at once will pull sentiment scores toward the middle ground, which is why shorter, more direct questions about a single aspect are better.
Before proceeding to statistical analysis, you need to summarize your cleaned and coded data.
Summarize your survey data for analysis
NPS. Calculating your NPS is simple. Divide your respondents into three groups based on their score: 9–10 are promoters, 7–8 are passives, 0–6 are detractors. Then, use the following formula:
% Promoters – % Detractors = NPS
One additional note: NPS reduces an 11-point scale to a 3-point scale: Detractors, Passives, and Promoters. This can limit your ability to run stats testing in most software.
The solution is to recode your raw data. Turn your detractor values (0–6) into -100, your passives (7–8) into 0, and promoters (9–10) into 100.
You can manually recode and calculate your NPS in Excel using an IF formula:
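The same recode and calculation is straightforward outside Excel, too. A minimal Python sketch:

```python
def recode_nps(score):
    """Recode a 0-10 rating: detractors (0-6) -> -100, passives (7-8) -> 0,
    promoters (9-10) -> 100."""
    if score >= 9:
        return 100
    if score >= 7:
        return 0
    return -100

def nps(scores):
    """The mean of the recoded values equals % promoters - % detractors."""
    return sum(recode_nps(s) for s in scores) / len(scores)
```

A handy property of the recoded values is that their mean is the NPS itself, which makes the recoded column easy to feed into standard stats tests.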
Customer satisfaction score. Generally, for a customer satisfaction score, you ask customers a question like: “How would you rate your overall satisfaction with the service you received?” Respondents rate their satisfaction on a scale from 1 to 5, with 5 being “very satisfied.”
To calculate the percentage of satisfied customers, the formula is:
(customers who rated 4–5 / responses) x 100 = % satisfied customers
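In code, this formula is a one-liner. A sketch assuming ratings on the 1–5 scale above:

```python
def csat(ratings):
    """% of respondents rating 4 or 5 on a 1-5 satisfaction scale."""
    satisfied = sum(1 for r in ratings if r >= 4)
    return satisfied / len(ratings) * 100
```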
Conduct a driver analysis
So, you’ve asked the right questions, gathered your data, and cleaned it up. You’ve even calculated your NPS, customer satisfaction score, or similar. Is that enough? Of course not!
You need to know the why behind the score and what to do about it. Enter driver analysis. Also known as key driver analysis, importance analysis, or relative importance analysis, driver analysis quantifies the importance of predictor variables in predicting an outcome variable. More plainly, it tells you which factors were most important in determining your score.
Driver analysis is helpful for answering questions like:
- “Do our customers care more about affordability or convenience?”
- “Should we focus on reducing prices or improving the quality of our products?”
- “Should we focus our positioning on being innovative or reliable?”
This component of survey analysis consists of five steps.
Step 1: Stack your data
To conduct a driver analysis, stack your data. A stacked data format looks like the table below:
In the above example, the first column is your quantitative metric (e.g. NPS), while the second, third, and fourth columns are the coded responses to your open-ended follow-up questions.
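One way to build that layout is to start from the NPS rating and the coded follow-up answers, and turn each code-frame category into a 0/1 indicator column. The field names in this sketch are hypothetical:

```python
def stack_for_driver_analysis(records, categories):
    """One row per respondent: the NPS rating plus a 0/1 indicator for each
    coded category mentioned in their open-ended follow-up answer."""
    rows = []
    for rec in records:
        row = {"nps": rec["nps"]}
        for cat in categories:
            row[cat] = int(cat in rec["codes"])
        rows.append(row)
    return rows
```

The resulting rows drop straight into a regression: the score is the outcome and the indicator columns are the candidate drivers.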
Step 2: Choose your regression model
There are several types of regression models, the most common being linear regression and logistic regression. Most studies use a linear regression model, but the decision really depends on your data.
Linear regression is best used when your outcome variable is continuous or numeric (e.g. NPS). Logistic regression should be used when your outcome variable is binary (e.g. has two categories like “Do you prefer coffee or tea?”). You can find a more detailed explanation in this eBook.
I’m going to show an example of regression analysis below. If you prefer to learn from an interactive tutorial, you can do so here.
Step 3: Run your data through a statistics tool
We completed a survey that asked respondents the following questions:
1) How likely are you to recommend Apple? (a standard NPS question)
2) What is the most important reason for your score? (open-ended response)
We wanted to see the key brand elements that caused respondents to recommend Apple.
We coded the free-text answer to question two, which gave us the dataset below:
We loaded this into our analysis software (Displayr) and ran a logistic regression to see which of the brand attributes were most important in determining the NPS score.
(Most standard software packages like Stata, SAS, or SPSS offer logistic regression. There are also some handy YouTube videos, like this one by Data Analysis Videos.)
The further the t value is from 0, the stronger the predictor variable (e.g. “Fun”) is for the outcome variable (NPS). In this example, “Fun” and “Worth what you pay for” have the greatest influence on NPS.
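If you don’t have a package like Displayr, Stata, or SPSS to hand, the underlying model can be fitted with any logistic regression routine. Below is a bare-bones, from-scratch sketch (plain gradient descent, with a binary outcome of promoter vs. not and hypothetical 0/1 attribute columns); in practice you’d use a statistics library, which also gives you the standard errors and t values discussed above:

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by stochastic gradient descent.
    X: rows of 0/1 brand-attribute indicators; y: 1 = promoter, 0 = not.
    Returns (weights, intercept); larger weights = stronger drivers."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1 / (1 + math.exp(-z))   # predicted probability of promoter
            err = p - yi                 # gradient of the log-loss w.r.t. z
            b -= lr * err
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w, b
```

With coded survey data, the size of each fitted weight indicates how strongly that attribute drives the outcome.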
Since our estimates for “Fun” and “Worth what you pay for” are fairly close together, we ran a relative importance analysis to be sure of our results:
This analysis hedges against a possible shortcoming of logistic regression: collinearity. If your estimates are fairly close together, it may be because collinearity has inflated the variance of one of them, making them appear that way.
Step 4: Test for significance
How do you know when (quantitative) changes in your customer satisfaction feedback are significant? You need to separate the signal from the noise.
Statistical significance testing is automated and built into most statistical packages. If you need to work out statistical significance manually, read this guide.
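As an example of what those packages do under the hood, here is a two-proportion z-test (e.g. comparing the share of promoters this quarter against last quarter), using only the standard library:

```python
import math

def two_proportion_z(hits1, n1, hits2, n2):
    """Two-sided z-test for a difference between two proportions,
    e.g. % promoters in wave 1 (hits1 of n1) vs. wave 2 (hits2 of n2).
    Returns (z statistic, p-value)."""
    p1, p2 = hits1 / n1, hits2 / n2
    pooled = (hits1 + hits2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # p-value from the standard normal CDF, via math.erf
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

A p-value below 0.05 is the conventional (if arbitrary) threshold for treating the change as signal rather than noise.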
Step 5: Visualize your data
Finally, we can visualize the results to share with colleagues and clients. The goal is to enable them to understand the results without them having to manually sort through thousands of responses looking for insights.
One way to visualize customer feedback data, especially NPS, is to plot the frequency or percentage of the ratings. For example, if you want to show the distribution of ratings within each promoter group, you can use a bar pictograph visualization.
We’ve colour-coded the pictograph bar chart below to make it easier to distinguish between groups. It points to an important observation: among detractors, there is a far bigger concentration of scores of 5–6 than of 0–4.
These 5–6 detractors are much more likely to be swayed than those who gave a score closer to 0. With that in mind, you could focus your analysis of qualitative responses on that group.
Comparison visualizations. You can also compare the NPS of different brands to benchmark against competitors. The stacked bar chart below shows responses by promoter group for different tech companies.
While we lose some detail by aggregating ratings by promoter group, we gain the ability to compare different brands efficiently:
At a glance, it’s easy to spot that Google has the most promoters and fewest detractors. IBM, Intel, HP, Dell, and Yahoo all struggle with a high number of detractors and few promoters.
This visualization also shows the size of the passive group relative to the other two. (A common mistake is to concentrate solely on promoters or detractors.)
If the size of your passive group is large relative to your promoters and detractors, you can dramatically improve your NPS by nudging them over the line. (There’s also the risk, of course, that they could swing the other way.)
Visualizations over time. You can track your NPS over time with a column or line chart (if you have enough responses over a long period of time). These charts can help show the impact of marketing campaigns or product changes.
Be careful, though. If your NPS fluctuates (perhaps due to seasonal sales or campaigns), it can be difficult to spot a trend. A trendline can indicate the overall trajectory.
For example, in the chart below, it would be difficult to spot which way the NPS is trending without the help of a trend line:
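The trend line itself is just an ordinary least-squares fit over the period index. A minimal sketch that returns the slope (positive means NPS is trending up despite period-to-period noise), assuming one NPS value per period:

```python
def trend_slope(values):
    """Least-squares slope of a series indexed 0, 1, 2, ... per period."""
    n = len(values)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```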
Conclusion
When it comes to improving survey analysis, a lot comes down to how well you design your survey and prepare your data.
No single metric, NPS included, is perfect. Open-ended follow-up questions after closed questions can help provide context.
There are a few other keys to successful survey analysis:
- Ladder your questions to get context for ratings and scores.
- Design your survey well to avoid headaches when preparing and analysing data.
- Clean your data before analysis.
- Use driver analysis to get that “Aha!” moment.
- Manually code open-ended questions for small surveys; use sentiment analysis if you have thousands of responses.
- Test for significance.
- Visualize your data to help it resonate within your organization.