Looking at the data from various angles

Very often our clients want to know what drives a certain attitude or behavior. They will request some type of driver analysis, such as correlations, regressions, or Partial Least Squares. In most cases, the results from one of these techniques are sufficient. But at MSI we prefer to pause and examine the data from different angles. You never know what you may find.

We recently received a request to run a key driver analysis to identify the attributes that would have the most impact on an overall satisfaction measure. There was missing data for three of these attributes, so we imputed the answers using means.
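For readers who want to see what mean imputation amounts to, here is a minimal sketch. The ratings and the function name `impute_with_mean` are made up for illustration and are not the client's data or our production code; it assumes missing answers are stored as `None`.

```python
from statistics import mean


def impute_with_mean(values):
    """Replace missing ratings (None) with the mean of the observed ratings."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical ratings on a 1-4 scale; None marks a missing answer.
ratings = [4, 3, None, 2, 4, None, 3]
imputed = impute_with_mean(ratings)
print(imputed)
```

Mean imputation leaves the attribute's average unchanged but shrinks its spread, which is one reason to check the results against the non-imputed data.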

We ran both a regression and a correlation analysis on the mean-imputed data and on the data with missing values, and found no difference between the two sets of results. We also dropped three questions that had a very low effect on Satisfaction. The final analysis used nine attributes and the non-imputed data.
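The comparison described above can be sketched for a single attribute as follows. This is an illustrative simplification under our own assumptions: it compares only the Pearson correlation with the overall measure, it treats the non-imputed run as dropping respondents with a missing answer, and the names `pearson` and `compare_imputation` and the sample numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def compare_imputation(attribute, overall):
    """Return (r on complete cases, r after mean imputation) for one attribute."""
    # Complete cases: keep only respondents who answered the attribute.
    pairs = [(a, o) for a, o in zip(attribute, overall) if a is not None]
    r_complete = pearson([a for a, _ in pairs], [o for _, o in pairs])
    # Mean imputation: fill missing answers with the observed mean.
    fill = sum(a for a, _ in pairs) / len(pairs)
    filled = [fill if a is None else a for a in attribute]
    r_imputed = pearson(filled, overall)
    return r_complete, r_imputed


# Hypothetical data: one attribute with a missing answer, plus overall scores.
attribute = [1, None, 3, 4]
overall = [1, 2, 3, 4]
r_complete, r_imputed = compare_imputation(attribute, overall)
print(r_complete, r_imputed)
```

If the two correlations are close for every attribute, as they were in our case, the choice between imputed and non-imputed data is unlikely to change the story.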

Results from the regression analysis

Support is the strongest driver of Overall Satisfaction, accounting for one-fifth of the effect. Looking at the beta coefficient, a one-unit change in the Support at Work attribute is expected to change Overall Satisfaction by 0.20 units, holding the other attributes constant.

Regression analysis summarizes the information in very clear terms. It makes it easy for clients to focus on the areas of maximum impact, such as Support and Flexibility, and to pay less attention to areas that are not as influential, such as Time at work. However, if we had stopped here we might have missed the “other” side of the story.

The regression analysis gives us the linear relationship between the predictors (attributes) and the dependent measure, and it helps us identify the attributes that have the maximum effect on the dependent measure. A linear regression can be misleading, however, when the relationship between the two is non-linear, because it summarizes the whole rating scale with a single slope. In these situations a Multiple Classification Analysis (MCA) shows whether there is a non-linear relationship between the predictors and the dependent measure.
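To make the contrast concrete, here is a rough single-predictor sketch. It is not a full MCA, which also adjusts each attribute's effect for the other predictors; it only compares the mean overall score at each rating level with one least-squares slope. The data and the function names `level_means`, `linear_fit`, and `step_changes` are hypothetical.

```python
from collections import defaultdict


def level_means(ratings, overall):
    """Mean overall score at each rating level (the unadjusted view)."""
    groups = defaultdict(list)
    for rating, score in zip(ratings, overall):
        groups[rating].append(score)
    return {level: sum(v) / len(v) for level, v in sorted(groups.items())}


def linear_fit(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def step_changes(means):
    """Change in the mean overall score between adjacent rating levels."""
    levels = sorted(means)
    return {(lo, hi): means[hi] - means[lo] for lo, hi in zip(levels, levels[1:])}


# Hypothetical data: the overall score is flat between ratings 1 and 2.
ratings = [1, 1, 2, 2, 3, 3, 4, 4]
overall = [3.0, 3.0, 3.0, 3.0, 3.5, 3.5, 4.0, 4.0]

slope, intercept = linear_fit(ratings, overall)
means = level_means(ratings, overall)
steps = step_changes(means)
print(slope, means, steps)
```

When the step changes differ noticeably from one pair of levels to the next, a single slope is hiding part of the picture.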

Multiple Classification Analysis (MCA)

Here too we see Support and Flexibility having the maximum effect on the dependent measure, as in the regression. But the MCA also uncovered non-linear relationships (a varying slope) from one rating point to the next. Let’s look at Flexibility: in the regression it has a coefficient of 0.15. In the MCA we see that at a rating of 4 (Very Satisfied) the overall score is 3.61, and when the attribute rating drops to 3 (Somewhat Satisfied) the overall score drops to 3.46, exactly as predicted by the regression. But when the rating drops to 2 (Somewhat Dissatisfied), the overall score does not drop to 3.31 (a change of 0.15); it drops to 3.26 (a change of 0.20), and it stays at that level when the rating drops to 1 (Very Dissatisfied). The implication is that once you drop below Somewhat Satisfied on Flexibility, you take a big hit on overall satisfaction.
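The arithmetic for Flexibility can be checked with a few lines. This sketch uses only the figures quoted above; anchoring the straight-line expectation at the rating-4 score is our simplifying assumption, and the names `SLOPE`, `mca_means`, and `linear_expectation` are illustrative.

```python
# Flexibility figures quoted in the text.
SLOPE = 0.15  # regression coefficient for Flexibility
mca_means = {4: 3.61, 3: 3.46, 2: 3.26, 1: 3.26}  # overall score by rating


def linear_expectation(rating, anchor_rating=4):
    """Overall score implied by a constant slope, anchored at one rating level."""
    return mca_means[anchor_rating] - SLOPE * (anchor_rating - rating)


# For each rating: (straight-line expectation, MCA score, MCA minus expectation).
comparison = {}
for rating in (3, 2, 1):
    expected = linear_expectation(rating)
    observed = mca_means[rating]
    comparison[rating] = (round(expected, 2), observed, round(observed - expected, 2))

for rating, (expected, observed, gap) in comparison.items():
    print(f"rating {rating}: linear {expected:.2f}, MCA {observed:.2f}, gap {gap:+.2f}")
```

At a rating of 3 the two agree, at 2 the MCA score sits below the straight line, and at 1 it sits above it because the score stops falling.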

The regression describes the overall relationship between the predictor and the dependent measure, while the MCA shows the relationship at each scale point, giving a more granular view. Any single methodology will tell only one side of the story, and the challenge is to look at the data from various angles to help complete the story. When developing the story we leave no stone unturned.
