
Discriminant Function Analysis (Two Groups): SPSS Output

Notes: This example is from the SPSS 7.5 "Applications Guide" example for the file "gss 93 subset.sav". The dependent is "vote92." The independents are age, educ, income91, sex, and polviews (a 7-point Likert scale from "Extremely liberal" to "Extremely conservative").

To obtain this output:

  1. File, Open, point to gss 93 subset.sav.
  2. Restrict vote92 to 1's and 2's by choosing Data, Select Cases, "If condition is satisfied". Click the If button and enter vote92 <3. Click Continue, OK.
  3. Statistics, Classify, Discriminant
  4. Select vote92 as the "grouping variable" (the dependent). As independents, select age, educ, income91, sex, and polviews. Check "Enter independents together" (i.e., not stepwise).
  5. Click on Statistics and check all Descriptives and all Function Coefficients.
  6. Click on Classify and check Results (limit to first 10), Summary Table, and all plots.
  7. To run, click OK.
Comments in blue are by the instructor and are not part of SPSS output.


Discriminant

First come several blocks of general processing and descriptive statistics information.
Notes

Output Created                      02 Mar 98 14:11:35
Comments
Input
  Data                              Y:\PC\spss95\GSS93 subset.sav
  Filter                            vote92 < 3 (FILTER)
  Weight                            <none>
  Split File                        <none>
  N of Rows in Working Data File    1452
Missing Value Handling
  Definition of Missing             User-defined missing values are treated as missing in the analysis phase.
  Cases Used                        In the analysis phase, cases with no user- or system-missing values for any predictor variable are used. Cases with user-, system-missing, or out-of-range values for the grouping variable are always excluded.
Syntax
  DISCRIMINANT
    /GROUPS=vote92(1 2)
    /VARIABLES=sex age educ income91 polviews
    /ANALYSIS ALL
    /PRIORS EQUAL
    /STATISTICS=MEAN STDDEV UNIVF BOXM COEFF RAW TABLE
    /PLOT=COMBINED SEPARATE MAP
    /PLOT=CASES(10)
    /CLASSIFY=NONMISSING POOLED .
Resources
  Elapsed Time                      0:00:01.21




Analysis Case Processing Summary

Unweighted Cases                                                      N    Percent
Valid                                                              1345       92.6
Excluded   Missing or out-of-range group codes                        0         .0
           At least one missing discriminating variable             107        7.4
           Both missing or out-of-range group codes and
             at least one missing discriminating variable             0         .0
           Total                                                    107        7.4
Total                                                              1452      100.0




Group Statistics

                                                                                  Valid N (listwise)
Voting in 1992 Election                                   Mean  Std. Deviation  Unweighted  Weighted
voted         Respondent's Sex                            1.55             .50         971   971.000
              Age of Respondent                          47.56           16.73         971   971.000
              Highest Year of School Completed           13.64            2.97         971   971.000
              Total family Income                        15.51            5.00         971   971.000
              Think of Self as Liberal or Conservative    4.19            1.41         971   971.000
did not vote  Respondent's Sex                            1.57             .50         374   374.000
              Age of Respondent                          41.64           17.34         374   374.000
              Highest Year of School Completed           11.84            2.84         374   374.000
              Total family Income                        12.60            5.77         374   374.000
              Think of Self as Liberal or Conservative    4.10            1.21         374   374.000
Total         Respondent's Sex                            1.56             .50        1345  1345.000
              Age of Respondent                          45.91           17.10        1345  1345.000
              Highest Year of School Completed           13.14            3.04        1345  1345.000
              Total family Income                        14.70            5.39        1345  1345.000
              Think of Self as Liberal or Conservative    4.17            1.36        1345  1345.000


In the ANOVA table below, the smaller the Wilks's lambda, the more important the independent variable is to the discriminant function. Wilks's lambda is significant by the F test for age, educ, and income91. Because sex and polviews are not significant, we might consider dropping them from the model.
Tests of Equality of Group Means

                                          Wilks' Lambda        F   df1    df2   Sig.
Respondent's Sex                                  1.000     .418     1   1343   .518
Age of Respondent                                  .976   33.197     1   1343   .000
Highest Year of School Completed                   .930  101.620     1   1343   .000
Total family Income                                .941   83.840     1   1343   .000
Think of Self as Liberal or Conservative           .999    1.423     1   1343   .233
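As a rough check on how the two columns relate: with a single predictor, Wilks's lambda can be recovered from the F statistic as 1 / (1 + F * df1 / df2). The sketch below applies that identity to the F values copied from the table above. The variable names and the helper wilks_from_f are illustrative only and are not part of SPSS.

```python
# F statistics and degrees of freedom copied from the table above.
f_stats = {
    "sex": 0.418,
    "age": 33.197,
    "educ": 101.620,
    "income91": 83.840,
    "polviews": 1.423,
}
df1, df2 = 1, 1343


def wilks_from_f(f, df1, df2):
    """Univariate Wilks's lambda implied by a one-way ANOVA F statistic."""
    return 1.0 / (1.0 + f * df1 / df2)


lambdas = {name: wilks_from_f(f, df1, df2) for name, f in f_stats.items()}
for name, lam in lambdas.items():
    print(f"{name:10s} {lam:.3f}")
```

The printed values should agree with the Wilks' Lambda column to three decimals, which shows that the F test and lambda carry the same information here.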



Analysis 1

Box's Test of Equality of Covariance Matrices

The larger the log determinant in the table below, the more that group's covariance matrix differs. The "Rank" column indicates the number of independent variables -- 5 in this case. Since discriminant analysis assumes homogeneity of covariance matrices between groups, we would like to see the determinants be relatively equal. Box's M, next, tests the homogeneity of covariances assumption.


Log Determinants

Voting in 1992 Election    Rank    Log Determinant
voted                         5             10.006
did not vote                  5              9.945
Pooled within-groups          5             10.019

The ranks and natural logarithms of determinants printed are those of the group covariance matrices.


Test Results

Box's M tests the assumption of homogeneity of covariance matrices. The test is also very sensitive to departures from multivariate normality. Discriminant function analysis is robust even when the homogeneity of covariances assumption is not met, provided the data do not contain important outliers. For the data below, the test is significant, so we conclude that the groups do differ in their covariance matrices, violating an assumption of DA. Note that when n is large, as it is here, small deviations from homogeneity will be found significant, which is why Box's M must be interpreted in conjunction with inspection of the log determinants, above.
Box's M                40.399
F        Approx.        2.679
         df1               15
         df2      2102169.732
         Sig.            .000

Tests null hypothesis of equal population covariance matrices.
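To see how Box's M follows from the log determinants, the sketch below uses the usual textbook form M = (N - g) ln|S_pooled| - sum of (n_i - 1) ln|S_i|, with df1 = p(p + 1)(g - 1)/2. The inputs are the rounded values printed above, so the result only approximates the 40.399 that SPSS reports. The variable names are illustrative.

```python
# Group sizes and log determinants copied from the tables above.
group_n = {"voted": 971, "did not vote": 374}
group_logdet = {"voted": 10.006, "did not vote": 9.945}
pooled_logdet = 10.019
p = 5  # number of predictors (the "Rank" column)

n_total = sum(group_n.values())
g = len(group_n)

# Box's M = (N - g) * ln|S_pooled| - sum_i (n_i - 1) * ln|S_i|
box_m = (n_total - g) * pooled_logdet - sum(
    (n - 1) * group_logdet[name] for name, n in group_n.items()
)

# Numerator degrees of freedom of the F approximation.
df1 = p * (p + 1) * (g - 1) // 2

print(f"Box's M from rounded log determinants: {box_m:.2f}")
print(f"df1 = {df1}")
```

Because the log determinants are printed to only three decimals and are multiplied by more than a thousand, a discrepancy of a few tenths from the reported value is expected.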


Summary of Canonical Discriminant Functions

The table below shows the eigenvalues. The larger the eigenvalue, the more of the variance in the dependent variable is explained by that function. Since the dependent in this example has only two categories, there is only one discriminant function. However, if there were more categories, we would have multiple discriminant functions and this table would list them in descending order of importance. The "% of Variance" column lists the percent of variance explained by each function. The "Cumulative %" column is the cumulative percent of variance explained. The last column is the canonical correlation, where the squared canonical correlation is the percent of variation in the dependent discriminated by the independents in DA. Sometimes this table is used to decide how many functions are important (e.g., eigenvalues over 1, percent of variance more than 5%, cumulative percentage of 75%, canonical correlation of .6). This issue does not arise here since there is only one discriminant function, though we may note that its canonical correlation is not high.


Eigenvalues

Function    Eigenvalue    % of Variance    Cumulative %    Canonical Correlation
1              .164(a)            100.0           100.0                     .376

a First 1 canonical discriminant functions were used in the analysis.
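The canonical correlation in the table above is tied to the eigenvalue by r = sqrt(eigenvalue / (1 + eigenvalue)). A minimal sketch, using the rounded eigenvalue printed above (so the last digit may differ slightly from SPSS):

```python
import math

eigenvalue = 0.164  # copied from the Eigenvalues table above

# Canonical correlation implied by the eigenvalue.
canonical_r = math.sqrt(eigenvalue / (1.0 + eigenvalue))

# Squared canonical correlation: share of variation in the dependent
# that is discriminated by the independents.
r_squared = canonical_r ** 2

print(f"canonical correlation ~ {canonical_r:.3f}")
print(f"squared canonical correlation ~ {r_squared:.3f}")
```

The squared value comes to roughly .14, which is the sense in which the canonical correlation here is "not high."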




This second appearance of Wilks's lambda serves a different purpose than its use in the ANOVA table above. In the table below it tests the significance of the eigenvalue for each discriminant function. In this example there is only one, and it is significant.
Wilks' Lambda

Test of Function(s)    Wilks' Lambda    Chi-square    df    Sig.
1                               .859       203.909     5    .000
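With a single function, this Wilks's lambda is 1 / (1 + eigenvalue), and the chi-square is commonly obtained from Bartlett's approximation, -(N - 1 - (p + g)/2) ln(lambda), with p(g - 1) degrees of freedom. The sketch below assumes that approximation is the one behind the table above. It starts from the rounded eigenvalue, so it lands near, not exactly on, the reported 203.909.

```python
import math

eigenvalue = 0.164  # from the Eigenvalues table
n_cases = 1345      # valid cases
p = 5               # predictors
g = 2               # groups

# Wilks's lambda for the single discriminant function.
wilks = 1.0 / (1.0 + eigenvalue)

# Bartlett's chi-square approximation and its degrees of freedom.
chi_square = -(n_cases - 1 - (p + g) / 2.0) * math.log(wilks)
df = p * (g - 1)

print(f"Wilks's lambda ~ {wilks:.3f}")
print(f"chi-square ~ {chi_square:.1f} on {df} df")
```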



The standardized discriminant function coefficients in the table below serve the same purpose as beta weights in multiple regression: they indicate the relative importance of the independent variables in predicting the dependent.

Standardized Canonical Discriminant Function Coefficients

                                            Function
                                                   1
Respondent's Sex                                .011
Age of Respondent                               .657
Highest Year of School Completed                .712
Total family Income                             .423
Think of Self as Liberal or Conservative        .018
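The standardized coefficients are the unstandardized coefficients (reported later in this output) multiplied by each variable's pooled within-groups standard deviation. The sketch below approximates the pooled standard deviations from the group standard deviations in the Group Statistics table. All inputs are rounded printed values and the names are illustrative, so agreement is only to about two decimals.

```python
import math

n1, n2 = 971, 374  # voted, did not vote

# (SD among voters, SD among non-voters) from Group Statistics.
group_sd = {
    "sex": (0.50, 0.50),
    "age": (16.73, 17.34),
    "educ": (2.97, 2.84),
    "income91": (5.00, 5.77),
    "polviews": (1.41, 1.21),
}

# Unstandardized canonical discriminant function coefficients.
unstandardized = {
    "sex": 0.021,
    "age": 0.039,
    "educ": 0.243,
    "income91": 0.081,
    "polviews": 0.013,
}


def pooled_sd(s1, s2):
    """Pooled within-groups standard deviation for two groups."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


standardized = {
    name: b * pooled_sd(*group_sd[name]) for name, b in unstandardized.items()
}
for name, value in standardized.items():
    print(f"{name:10s} {value:.3f}")
```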


The structure matrix table below shows the correlations of each variable with each discriminant function. In this case, there is only one discriminant function. However, when the dependent has more categories there will be more discriminant functions. In that case, there will be additional columns in the table, one for each function. The correlations then serve like factor loadings in factor analysis -- that is, by identifying the largest absolute correlations associated with each discriminant function the researcher gains insight into how to name each function.


Structure Matrix

                                            Function
                                                   1
Highest Year of School Completed                .679
Total family Income                             .616
Age of Respondent                               .388
Think of Self as Liberal or Conservative        .080
Respondent's Sex                               -.044

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions
Variables ordered by absolute size of correlation within function.




The table below contains the unstandardized discriminant function coefficients. These would be used like unstandardized b (regression) coefficients in multiple regression -- that is, they are used to construct the actual prediction equation, which can be used to classify new cases.

Canonical Discriminant Function Coefficients

                                            Function
                                                   1
Respondent's Sex                                .021
Age of Respondent                               .039
Highest Year of School Completed                .243
Total family Income                             .081
Think of Self as Liberal or Conservative        .013
(Constant)                                    -6.253

Unstandardized coefficients
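A minimal sketch of the prediction equation built from these coefficients. As a check, it is evaluated at each group's means from the Group Statistics table. The results should come close to the group centroids SPSS reports (.251 and -.653), differing only because the printed coefficients and means are rounded. The function name discriminant_score is illustrative.

```python
# Unstandardized coefficients and constant copied from the table above.
constant = -6.253
coef = {
    "sex": 0.021,
    "age": 0.039,
    "educ": 0.243,
    "income91": 0.081,
    "polviews": 0.013,
}


def discriminant_score(case):
    """Discriminant score D = constant + sum of b * x over the predictors."""
    return constant + sum(coef[name] * case[name] for name in coef)


# Group means from the Group Statistics table.
voted_means = {"sex": 1.55, "age": 47.56, "educ": 13.64, "income91": 15.51, "polviews": 4.19}
nonvoter_means = {"sex": 1.57, "age": 41.64, "educ": 11.84, "income91": 12.60, "polviews": 4.10}

score_voted = discriminant_score(voted_means)
score_nonvoter = discriminant_score(nonvoter_means)

print(f"score at 'voted' means:        {score_voted:.3f}")
print(f"score at 'did not vote' means: {score_nonvoter:.3f}")
```

The same function can be applied to a new case by passing its five predictor values.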


The table below is used to establish the cutting point for classifying cases. If the two groups are of equal size, the best cutting point is halfway between the values of the function at the group centroids (that is, their average). If the groups are unequal, the optimal cutting point is the weighted average of the two values. Because the "voted" centroid (.251) lies above the "did not vote" centroid (-.653), cases that evaluate on the function above the cutting point are classified as "voted," while those evaluating below the cutting point are classified as "did not vote." With the equal priors used in this run, the cutting point is the midpoint, (.251 + (-.653))/2 = -.201. Of course, the computer does the classification automatically, so these values are for informational purposes.

Functions at Group Centroids

                           Function
Voting in 1992 Election           1
voted                          .251
did not vote                  -.653

Unstandardized canonical discriminant functions evaluated at group means
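The cutting-point rule can be sketched as follows, assuming equal priors (as in this run) so that the cutoff is the midpoint of the two centroids. The ten discriminant scores are copied from the Casewise Statistics table later in this output. The group codes 1 = voted and 2 = did not vote follow that table, and the function name classify is illustrative.

```python
# Group centroids copied from the table above (1 = voted, 2 = did not vote).
centroids = {1: 0.251, 2: -0.653}

# With equal priors, the cutting point is the midpoint of the centroids.
cutoff = (centroids[1] + centroids[2]) / 2.0


def classify(score):
    """Assign the group whose centroid lies on the same side of the cutoff."""
    return 1 if score > cutoff else 2


# Discriminant scores for the first 10 cases (from Casewise Statistics).
scores = [-0.366, 0.833, 0.883, 1.015, 2.689, 0.744, 0.405, 0.460, 1.110, 1.041]
predicted = [classify(s) for s in scores]

print(f"cutoff = {cutoff:.3f}")
print(predicted)
```

The predicted groups match the "Predicted Group" column of the casewise table: only case 1 falls below the cutoff and is assigned to "did not vote."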

Classification Statistics

The table below just tells the researcher about the status of cases in terms of processing.
Classification Processing Summary

Processed                                                      1452
Excluded   Missing or out-of-range group codes                    0
           At least one missing discriminating variable         107
Used in Output                                                 1345

Prior probabilities, shown below, are used in classification. This run uses equal priors (/PRIORS EQUAL in the syntax above, the SPSS default), so each group receives a prior of .500 even though the groups differ in size (971 vs. 374). As an alternative, you could have the priors computed from the observed group sizes (marginals) in your sample, which is advisable when the group sizes differ substantially.
Prior Probabilities for Groups

                                    Cases Used in Analysis
Voting in 1992 Election    Prior    Unweighted    Weighted
voted                       .500           971     971.000
did not vote                .500           374     374.000
Total                      1.000          1345    1345.000



The table below is the result of checking "Fisher's" under "Function Coefficients" in the "Statistics" option of discriminant analysis. Two sets (one for each dependent group) of unstandardized linear discriminant coefficients are calculated, which can be used to classify cases. This is the classical method of classification, though now little used.
Classification Function Coefficients

                                            Voting in 1992 Election
                                               voted    did not vote
Respondent's Sex                               7.048           7.029
Age of Respondent                               .250            .215
Highest Year of School Completed               1.884           1.664
Total family Income                             .319            .246
Think of Self as Liberal or Conservative       2.187           2.175
(Constant)                                   -32.013         -26.541

Fisher's linear discriminant functions
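To classify a case with these coefficients, compute one score per group and assign the case to the group with the larger score. The sketch below does this for two illustrative profiles, the group means from the Group Statistics table. The function name classify and the choice of test profiles are assumptions made for the example.

```python
variables = ["sex", "age", "educ", "income91", "polviews"]

# Fisher's classification function coefficients copied from the table above.
fisher = {
    "voted": {"const": -32.013, "sex": 7.048, "age": 0.250,
              "educ": 1.884, "income91": 0.319, "polviews": 2.187},
    "did not vote": {"const": -26.541, "sex": 7.029, "age": 0.215,
                     "educ": 1.664, "income91": 0.246, "polviews": 2.175},
}


def classify(case):
    """Return the group with the largest classification score, plus all scores."""
    scores = {
        group: c["const"] + sum(c[v] * case[v] for v in variables)
        for group, c in fisher.items()
    }
    return max(scores, key=scores.get), scores


# Illustrative profiles: the two groups' means from Group Statistics.
voted_means = {"sex": 1.55, "age": 47.56, "educ": 13.64, "income91": 15.51, "polviews": 4.19}
nonvoter_means = {"sex": 1.57, "age": 41.64, "educ": 11.84, "income91": 12.60, "polviews": 4.10}

group_a, scores_a = classify(voted_means)
group_b, scores_b = classify(nonvoter_means)
print(group_a, {k: round(v, 3) for k, v in scores_a.items()})
print(group_b, {k: round(v, 3) for k, v in scores_b.items()})
```

Each mean profile is assigned to its own group, as expected. With two groups, this rule gives the same classification as comparing the single discriminant score to the cutting point.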


The table below results from checking "Casewise results" in the "Classify" options of discriminant function analysis. The table lists the actual group, the predicted group based on the largest posterior probability, the conditional probability P(D>d | G=g) (the probability of a discriminant score at least as extreme as the observed one, given membership in the predicted group), the posterior probability P(G=g | D=d) (the chance the case belongs to the predicted group, based on the independents), the squared Mahalanobis distance of the case to the group centroid (large scores indicate outliers), and the discriminant score for the case. The case is classified based on the discriminant score in relation to the cutoff (not shown). Misclassified cases are marked with asterisks. The "Second Highest Group" columns show the posterior probabilities and Mahalanobis distances for the case had the case been classed based on the second highest posterior probability. Since there are only two groups in this example, the "second highest" is equivalent to the "other" group.
Casewise Statistics (Original)

                  Highest Group                                                       Second Highest Group                         Discriminant Scores
Case    Actual    Predicted  P(D>d | G=g)  P(G=g | D=d)  Squared Mahalanobis          Group  P(G=g | D=d)  Squared Mahalanobis     Function 1
Number  Group     Group      p       df                  Distance to Centroid                              Distance to Centroid
1       1         2(**)      .774    1     .537           .082                        1      .463            .381                   -.366
2       1         1          .561    1     .718           .339                        2      .282           2.208                    .833
3       1         1          .528    1     .727           .399                        2      .273           2.358                    .883
4       1         1          .445    1     .750           .583                        2      .250           2.780                   1.015
5       1         1          .015    1     .932          5.941                        2      .068          11.165                   2.689
6       1         1          .622    1     .701           .243                        2      .299           1.951                    .744
7       1         1          .878    1     .634           .024                        2      .366           1.119                    .405
8       2         1(**)      .835    1     .645           .044                        2      .355           1.238                    .460
9       1         1          .390    1     .766           .738                        2      .234           3.107                   1.110
10      1         1          .430    1     .755           .624                        2      .245           2.870                   1.041

** Misclassified case
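The columns of the casewise table can be approximately reproduced from a case's discriminant score and the two group centroids. The sketch below assumes one discriminant function, equal priors, and a pooled covariance matrix (as in this run). Under those assumptions the squared Mahalanobis distance is the squared difference between score and centroid, the posterior is proportional to exp(-distance/2), and P(D>d | G=g) is the upper tail of a chi-square with 1 df. The function name casewise is illustrative.

```python
import math

# Group centroids (1 = voted, 2 = did not vote) and equal priors.
centroids = {1: 0.251, 2: -0.653}
priors = {1: 0.5, 2: 0.5}


def casewise(score):
    """Return predicted group, P(D>d | G=g), P(G=g | D=d), and squared distance."""
    # Squared Mahalanobis distance to each centroid on the single function.
    d2 = {g: (score - c) ** 2 for g, c in centroids.items()}
    # Posterior probabilities via Bayes' rule with normal densities.
    weight = {g: priors[g] * math.exp(-d2[g] / 2.0) for g in centroids}
    total = sum(weight.values())
    posterior = {g: w / total for g, w in weight.items()}
    best = max(posterior, key=posterior.get)
    # Upper-tail probability of a chi-square(1) at the squared distance.
    tail = math.erfc(math.sqrt(d2[best] / 2.0))
    return best, tail, posterior[best], d2[best]


group, p_tail, p_post, dist2 = casewise(-0.366)  # case 1 in the table above
print(group, round(p_tail, 3), round(p_post, 3), round(dist2, 3))
```

For case 1 this gives values close to the row in the table above (group 2, .774, .537, .082). Small differences in the third decimal are possible because the centroids and score are rounded.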


Separate-Groups Graphs

The charts below result from checking "Combined-groups" and "Separate-groups" under "Plots" in the "Classify" options of discriminant analysis. If there were two or more discriminant functions, the charts below would be scatterplots showing the relation of the first two discriminant functions. As this example has only one discriminant function, bar charts are displayed instead. In a good discriminant function, the bar chart will have most cases near the mean, with small tails.


[Bar chart: Voting in 1992 election = voted]

[Bar chart: Voting in 1992 election = did not vote]

The table below is used to assess how well the discriminant function works, and if it works equally well for each group of the dependent variable. Here it correctly classifies about two-thirds of the cases, making about the same proportion of mistakes for both categories. This would normally not be considered a satisfactory level of discrimination and the researcher would seek to test other models.
Classification Results(a)

                                                Predicted Group Membership
                    Voting in 1992 Election     voted     did not vote       Total
Original   Count    voted                         649              322         971
                    did not vote                  121              253         374
           %        voted                        66.8             33.2       100.0
                    did not vote                 32.4             67.6       100.0

a 67.1% of original grouped cases correctly classified.
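The percentages in this table follow directly from the counts. A short sketch of the arithmetic, with the counts copied from the table above and illustrative variable names:

```python
# Rows are actual groups, columns are predicted groups.
counts = {
    "voted": {"voted": 649, "did not vote": 322},
    "did not vote": {"voted": 121, "did not vote": 253},
}

total = sum(sum(row.values()) for row in counts.values())
correct = sum(counts[group][group] for group in counts)

# Overall percent correctly classified (the table footnote).
overall_pct = 100.0 * correct / total

# Percent correctly classified within each actual group.
per_group_pct = {
    group: 100.0 * counts[group][group] / sum(counts[group].values())
    for group in counts
}

print(f"overall: {overall_pct:.1f}%")
for group, pct in per_group_pct.items():
    print(f"{group}: {pct:.1f}%")
```

The per-group rates are nearly equal, which is what is meant by the function making about the same proportion of mistakes for both categories.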