Presented by TRC Insights
TRC's Satiscan™ model is tested for validity using call center data and a split sample approach. This shows that Satiscan™ produces similar models when run on random halves of an energy industry dataset.
A common question asked of any analytical method is whether its results can be validated, that is, whether the process that produces them is consistent enough to yield the same results in similar circumstances. The usual test is to split a sample into two random parts and run the same analysis on each part. If the results are similar in both parts of the data, the method is considered valid.
This “split sample” approach is taken here to demonstrate the validity of Satiscan™. Call center data from the energy industry was used for this purpose.
The dataset used here had a total of 1162 observations on 18 variables. Two random halves were created with 576 and 586 observations. Frequencies and correlation coefficients were compared to ensure that the two halves were indeed split randomly. Next, two stages of analyses were run. First, Satiscan™ analyses were run on both halves of the data. Second, stepwise regression analyses were run on both halves of the data. The regression analyses provide a benchmark: the degree of similarity between the two regression models can be used to judge the degree of similarity between the two Satiscan™ models.
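The split and the sanity check on the two halves can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic stand-ins (three made-up variables rather than the 18 in the study), the helper names `random_halves` and `max_correlation_gap` are hypothetical, and each row is assigned to a half by a coin flip, which is one way unequal halves such as 576 and 586 can arise. The article does not say how the actual split was made.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def random_halves(rows, seed=0):
    """Assign each row to one of two halves with probability 0.5 (assumed scheme)."""
    rng = random.Random(seed)
    half_a, half_b = [], []
    for row in rows:
        (half_a if rng.random() < 0.5 else half_b).append(row)
    return half_a, half_b


def max_correlation_gap(half_a, half_b):
    """Largest absolute difference in any pairwise correlation between the halves."""
    n_vars = len(half_a[0])
    gap = 0.0
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            r_a = pearson([r[i] for r in half_a], [r[j] for r in half_a])
            r_b = pearson([r[i] for r in half_b], [r[j] for r in half_b])
            gap = max(gap, abs(r_a - r_b))
    return gap


# Synthetic stand-in data: 1162 rows, 3 variables (not the study's data).
data_rng = random.Random(42)
rows = []
for _ in range(1162):
    x = data_rng.gauss(0, 1)
    rows.append((x, 0.6 * x + data_rng.gauss(0, 1), data_rng.gauss(0, 1)))

half_a, half_b = random_halves(rows, seed=7)
gap = max_correlation_gap(half_a, half_b)
print(len(half_a), len(half_b), round(gap, 3))
```

A small gap suggests the halves look alike at the bivariate level. It does not rule out differences that only show up in multivariate analyses.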
The Satiscan™ models are given next in Figures 1 and 2, followed by the total effects tables in Table 1. Total effects tables are necessary when using Satiscan™ models because some variables can have both a direct and an indirect impact on the dependent variable. Total effects are calculated as the sum of direct and indirect effects.
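To make the idea of a total effect concrete, here is a minimal sketch. It assumes the usual path-analysis convention in which an indirect effect is the product of the coefficients along a path; the article only states that total effects are the sum of direct and indirect effects, so this convention, the variable names, and the coefficients are all hypothetical and are not taken from the Satiscan™ output.

```python
def total_effect(paths, source, target):
    """Sum, over every directed path from source to target, of the product
    of the coefficients along that path. Assumes the model has no cycles."""
    if source == target:
        return 1.0
    return sum(
        coef * total_effect(paths, nxt, target)
        for nxt, coef in paths.get(source, {}).items()
    )


# Hypothetical model: wait_time drives overall_sat directly (0.30) and
# indirectly through agent_rating (0.50 into agent_rating, 0.40 onward).
paths = {
    "wait_time": {"overall_sat": 0.30, "agent_rating": 0.50},
    "agent_rating": {"overall_sat": 0.40},
}

direct = paths["wait_time"]["overall_sat"]
total = total_effect(paths, "wait_time", "overall_sat")
indirect = total - direct
print(direct, indirect, total)
```

In this made-up example the total effect of `wait_time` is 0.30 + 0.50 × 0.40 = 0.50, so a variable with a modest direct weight can still rank as an important driver once its indirect path is counted.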
As can be seen from Figures 1 and 2, the two Satiscan™ models are quite similar but are not exact copies of each other. The first model has more direct drivers of the dependent variable (six) than the second model (three). The first model also has two more key drivers than the second (15 versus 13).
When we examine the total effects table, the most striking result is that the first two key drivers have almost exactly the same weights in both models. The top six variables in both models are the same, although there are some differences in their ordering.
Thus, in comparing the two models, we could say that even though there are some differences, the basic results from both models are quite similar. Most of the dissimilar results appear in the secondary key drivers.
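One simple way to put a number on this kind of comparison is to check how much the top of the two driver rankings overlaps and whether the order matches. The sketch below uses hypothetical driver labels (`d1`, `d2`, ...) and hypothetical helper names; it does not reproduce Table 1, and it is not a measure the article itself reports.

```python
def top_k_overlap(ranking_a, ranking_b, k):
    """Share of the top-k drivers in one ranking that also appear in the other's top k."""
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k


def same_order(ranking_a, ranking_b, k):
    """True when the first k drivers appear in exactly the same order."""
    return ranking_a[:k] == ranking_b[:k]


# Hypothetical rankings, ordered from largest to smallest total effect.
half_1 = ["d1", "d2", "d3", "d4", "d5", "d6", "d7"]
half_2 = ["d1", "d2", "d4", "d3", "d6", "d5", "d8"]

print(top_k_overlap(half_1, half_2, 6))  # same six drivers at the top
print(same_order(half_1, half_2, 2))     # first two agree in order
print(same_order(half_1, half_2, 6))     # ordering differs further down
```

The made-up lists mirror the pattern described above: identical leading drivers, the same top six, and disagreement only in the ordering and in the secondary drivers.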
Next, let's take a look at the regression models presented in Table 2. In comparing the regression results with those from Satiscan™, the most obvious difference is in the richness of the models. Because Satiscan™ is able to identify relationships between independent variables, its models have much more detail than the regression models.
Further, as was the case with the Satiscan™ models, the first regression model has more key drivers (8) than the second model (5). Again, the variable with the most impact is the same in both cases and has almost identical impact. There is, however, less commonality between the two sets of regression key drivers than between the two sets of Satiscan™ key drivers. This could possibly be because of some amount of collinearity in the data, since collinearity tends to affect regression and not Satiscan™.
By analyzing the regression results we are able to determine that there are some inherent differences between the two samples that occurred as a result of the random split. In practice, when datasets are split into random halves, the frequencies and bivariate correlations may appear quite similar and yet some differences still appear in multivariate analyses. Regression analyses on the two samples help to understand these differences.
Taking these inherent differences into account, we could say that the Satiscan™ models are more similar than may have appeared at first sight.
Figure 1
Figure 2
This content was provided by TRC. Visit their website at www.trchome.com.