General Linear Model (GLM)

Model Summary - S, R-Sq, R-Sq (adj) and R-Sq (pred) Values

  

S, R², adjusted R², and predicted R² are measures of how well the model fits the data. These values can help you select the model with the best fit.

·    S is measured in the units of the response variable and represents the standard distance that data values fall from the regression line. For a given study, the better the equation predicts the response, the lower S is.

·    R² (R-Sq) describes the proportion of the variation in the observed response values that is explained by the predictor(s). R² never decreases when you add predictors. For example, the best five-predictor model will always have an R² at least as high as that of the best four-predictor model. Therefore, R² is most useful when comparing models of the same size.

·    Adjusted R² is a modified R² that has been adjusted for the number of terms in the model. If you include unnecessary terms, R² can be artificially high. Unlike R², adjusted R² may get smaller when you add terms to the model. Use adjusted R² to compare models with different numbers of predictors.

·    Predicted R² (R-Sq (pred)) is a measure of how well the model predicts the response for new observations. Large differences between predicted R² and the other two R² statistics can indicate that the model is overfit. An overfit model does not predict new observations nearly as well as it fits the existing data. Predicted R² is more useful than adjusted R² for comparing models because it is calculated with observations that are not included in the model calculation.
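The four statistics described above can be sketched in a few lines of Python. This is a minimal illustration, not the software's own implementation: it assumes an ordinary least squares fit with an intercept, and it assumes that predicted R² is based on the PRESS statistic (the sum of squared leave-one-out prediction errors). The function name `model_summary` and the synthetic data are hypothetical.

```python
import numpy as np

def model_summary(X, y):
    """Return S, R-sq, R-sq(adj) and R-sq(pred) for a least squares fit.

    X: (n, k) array of predictors, without the constant column.
    y: (n,) array of responses.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])    # design matrix with intercept
    p = A.shape[1]                          # number of estimated coefficients

    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = float(resid @ resid)                   # error sum of squares
    sst = float(((y - y.mean()) ** 2).sum())     # total sum of squares

    # Leverages: diagonal of the hat matrix A (A'A)^-1 A'.
    h = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    # PRESS: each residual rescaled to its leave-one-out prediction error.
    press = float(((resid / (1.0 - h)) ** 2).sum())

    return {
        "S": np.sqrt(sse / (n - p)),
        "R-sq": 1 - sse / sst,
        "R-sq(adj)": 1 - (sse / (n - p)) / (sst / (n - 1)),
        "R-sq(pred)": 1 - press / sst,
    }

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=30)

stats = model_summary(X, y)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Because PRESS is never smaller than the error sum of squares, predicted R² computed this way cannot exceed R²; how far it falls below R² is what signals possible overfitting.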

Example Output

       S    R-sq  R-sq(adj)  R-sq(pred)
0.147504  94.61%     92.81%      88.01%

Interpretation

For the salary data, S is 0.147504, R² is 94.61%, and adjusted R² is 92.81%. Predicted R² is 88.01%, which indicates that the model explains 88.01% of the variation in Salary when you use it for prediction. If you are comparing different salary models, then you generally look for models that minimize S and maximize the R² values.
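As a rough illustration of comparing candidate models on these statistics, the sketch below fits three nested models to synthetic data (not the salary data) in which only one predictor truly affects the response. The helper `fit_stats`, the data, and the model labels are all hypothetical, and predicted R² is assumed to be computed from leave-one-out prediction errors. R² is guaranteed not to drop as the irrelevant predictors are added; whether S, adjusted R², and predicted R² get worse depends on the random draw.

```python
import numpy as np

def fit_stats(X, y):
    """S, R-sq, R-sq(adj), R-sq(pred) for least squares with an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    p = A.shape[1]
    H = A @ np.linalg.pinv(A)                  # hat matrix
    resid = y - H @ y
    sse = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    press = ((resid / (1 - np.diag(H))) ** 2).sum()
    return (np.sqrt(sse / (n - p)),
            1 - sse / sst,
            1 - (sse / (n - p)) / (sst / (n - 1)),
            1 - press / sst)

rng = np.random.default_rng(1)
n = 25
signal = rng.normal(size=(n, 1))               # one predictor that matters
noise = rng.normal(size=(n, 8))                # eight irrelevant predictors
y = 3.0 + 1.5 * signal[:, 0] + rng.normal(scale=0.5, size=n)

candidates = {
    "1 predictor": signal,
    "1 + 4 noise": np.column_stack([signal, noise[:, :4]]),
    "1 + 8 noise": np.column_stack([signal, noise]),
}
results = {name: fit_stats(X, y) for name, X in candidates.items()}

print(f"{'model':<12} {'S':>8} {'R-sq':>8} {'R-sq(adj)':>10} {'R-sq(pred)':>11}")
for name, (s, r2, r2_adj, r2_pred) in results.items():
    print(f"{name:<12} {s:8.4f} {r2:8.2%} {r2_adj:10.2%} {r2_pred:11.2%}")
```

Reading the printed table row by row mirrors the guidance above: prefer the candidate with the smallest S and the largest adjusted and predicted R², and treat a wide gap between R² and predicted R² as a warning sign.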