WLS Output

Fictitious data were created in which age predicts preference. The scatterplot below, produced under Graphs, Scatterplot, shows a typical heteroscedastic (funnel-shaped) pattern: variance in the dependent variable, preference, is much greater for older subjects than for younger ones. The general upward slope suggests regression may be appropriate, but the heteroscedasticity must be corrected for using weighted least squares (WLS).

Scatter of Preference Age


Next, an OLS regression is run on the unweighted data. This is done purely for pedagogical purposes, since the plot above already shows that WLS is needed.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT Preference
/METHOD=ENTER Age
/SCATTERPLOT=(*ZRESID ,*ZPRED ) .

Regression

Notes
Output Created 25-NOV-2006 11:18:24
Comments
Input Data V:\Download\wls_2.sav
Active Dataset DataSet3
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data File 100
Missing Value Handling Definition of Missing User-defined missing values are treated as missing.
Cases Used Statistics are based on cases with no missing values for any variable used.
Syntax REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT Preference
/METHOD=ENTER Age
/SCATTERPLOT=(*ZRESID ,*ZPRED ) .
Resources Elapsed Time 0:00:00.42
Memory Required 1372 bytes
Additional Memory Required for Residual Plots 240 bytes
Processor Time 0:00:00.41

[DataSet3] V:\Download\wls_2.sav

Variables Entered/Removed(b)
Model Variables Entered Variables Removed Method
1 Age(a) . Enter
a All requested variables entered.
b Dependent Variable: Preference

Model Summary(b)
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .608(a) .370 .364 2.738
a Predictors: (Constant), Age
b Dependent Variable: Preference

ANOVA(b)
Model           Sum of Squares   df   Mean Square   F        Sig.
1  Regression   431.809          1    431.809       57.592   .000(a)
   Residual     734.781          98   7.498
   Total        1166.590         99
a Predictors: (Constant), Age
b Dependent Variable: Preference

Coefficients(a)
Model           Unstandardized B   Std. Error   Standardized Beta   t       Sig.
1  (Constant)   -.694              .825                             -.841   .402
   Age          .130               .017         .608                7.589   .000
a Dependent Variable: Preference

Above, SPSS outputs the regression coefficient estimates and their standard errors.


Residuals Statistics(a)

Minimum Maximum Mean Std. Deviation N
Predicted Value 1.92 8.44 5.21 2.088 100
Residual -7.439 6.170 .000 2.724 100
Std. Predicted Value -1.577 1.546 .000 1.000 100
Std. Residual -2.717 2.253 .000 .995 100
a Dependent Variable: Preference

Charts

*zresid by *zpred Scatterplot

In this unweighted OLS run, higher predicted values are associated with larger errors, as shown by the wider spread along the y axis (standardized residuals).
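The funnel pattern can also be checked numerically. The following is a minimal sketch in Python, not an SPSS feature: it fits an OLS line, splits the residuals at the median of the predictor, and compares the residual variance in the upper half to that in the lower half. The data are simulated stand-ins (the values in wls_2.sav are not reproduced here), and the function names `ols_fit` and `spread_ratio` are hypothetical.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def spread_ratio(x, y):
    """Residual variance for the upper half of x divided by that for the lower half."""
    a, b = ols_fit(x, y)
    # Sort cases by x; with a positive slope this is also the order of predicted values.
    pairs = sorted((xi, yi - (a + b * xi)) for xi, yi in zip(x, y))
    half = len(pairs) // 2
    low = [resid for _, resid in pairs[:half]]
    high = [resid for _, resid in pairs[half:]]
    return statistics.pvariance(high) / statistics.pvariance(low)


# Simulated data (an assumption for illustration): error SD grows with age.
rng = random.Random(0)
age = [rng.uniform(20, 70) for _ in range(100)]
preference = [-0.7 + 0.13 * a + rng.gauss(0, 0.004 * a ** 1.6) for a in age]

ratio = spread_ratio(age, preference)
print(f"upper/lower residual variance ratio: {ratio:.2f}")
```

A ratio well above 1 is the numeric counterpart of the funnel shape; a ratio near 1 would suggest roughly constant error variance.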


Below, we run the Weight Estimation procedure to get the case weights to enter in the WLS weight box of the Linear Regression dialog. After this run, the needed weights are stored as WGT_1 in an additional column of the data sheet, and are saved when we do File, Save.
* Weight Estimation.
WLS Preference WITH Age
/SOURCE Age
/POWER -2 TO 4 BY 0.2
/CONSTANT
/SAVE WEIGHT
/PRINT BEST.

Weighted Least Squares Analysis

Notes
Output Created 25-NOV-2006 11:53:54
Comments
Input Data V:\Download\wls_2.sav
Active Dataset DataSet3
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data File 100
Missing Value Handling Definition of Missing User-defined missing values are treated as missing.
Cases Used Statistics are based on all cases with valid data for all variables in the analysis.
Syntax WLS Preference WITH Age
/SOURCE Age
/POWER -2 TO 4 BY 0.2
/CONSTANT
/SAVE WEIGHT
/PRINT BEST.
Resources Elapsed Time 0:00:00.02
Processor Time 0:00:00.02
Time Series Settings (TSET) Amount of Output PRINT = DEFAULT
Saving New Variables NEWVAR = CURRENT
Treatment of User-Missing Values MISSING = EXCLUDE
Equations Include CONSTANT
Variables Created or Modified WGT_1 Weight for Preference from WLS, MOD_1 AGE** -3.200

[DataSet3] V:\Download\wls_2.sav

Power Summary

Log-Likelihood Values(b)
Power -2.000 -275.704
-1.800 -272.068
-1.600 -268.473
-1.400 -264.924
-1.200 -261.421
-1.000 -257.969
-.800 -254.572
-.600 -251.233
-.400 -247.959
-.200 -244.753
.000 -241.624
.200 -238.579
.400 -235.626
.600 -232.775
.800 -230.037
1.000 -227.424
1.200 -224.950
1.400 -222.631
1.600 -220.482
1.800 -218.522
2.000 -216.769
2.200 -215.245
2.400 -213.968
2.600 -212.962
2.800 -212.246
3.000 -211.841
3.200 -211.765(a)
3.400 -212.035
3.600 -212.663
3.800 -213.661
4.000 -215.035
a The corresponding power is selected for further analysis because it maximizes the log-likelihood function.
b Dependent variable: Preference, source variable: Age

Above, SPSS Weight Estimation prints the log-likelihood value for every power from -2.0 to +4.0 in increments of .2. As the footnote indicates, a power of 3.2 maximizes the log-likelihood function, so the saved weight WGT_1 is age raised to the power -3.2 (labeled AGE** -3.200 in the Notes table).
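The grid search described above can be sketched in Python. This is an illustration under stated assumptions, not the SPSS implementation: it assumes the error variance is proportional to the source variable raised to the candidate power, uses case weights of 1 / x**power, and scores each power with the Gaussian log-likelihood. The function names and the simulated data are hypothetical; the output does not show the exact formula SPSS uses.

```python
import math
import random


def wls_fit(x, y, w):
    """Closed-form weighted least squares for y = a + b*x with case weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def log_likelihood(x, y, power):
    """Gaussian log-likelihood assuming Var(error_i) = sigma^2 * x_i**power."""
    n = len(x)
    w = [xi ** -power for xi in x]
    a, b = wls_fit(x, y, w)
    wss = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    sigma2 = wss / n  # maximum-likelihood estimate of sigma^2
    return (-0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
            + 0.5 * sum(math.log(wi) for wi in w))


def best_power(x, y, start=-2.0, stop=4.0, step=0.2):
    """Return the power on the grid that maximizes the log-likelihood."""
    steps = round((stop - start) / step)
    grid = [round(start + i * step, 10) for i in range(steps + 1)]
    return max(grid, key=lambda p: log_likelihood(x, y, p))


# Simulated data (an assumption): error variance proportional to age**3.2.
rng = random.Random(0)
age = [rng.uniform(20, 70) for _ in range(100)]
preference = [-0.7 + 0.13 * a + rng.gauss(0, 0.004 * a ** 1.6) for a in age]
print("selected power:", best_power(age, preference))
```

With only 100 cases the selected power can land a few grid steps away from the value used to simulate the data. As a rough plausibility check, plugging the OLS residual sum of squares (734.781, n = 100) into this formula at power 0 gives about -241.6, close to the table's -241.624, but that agreement is suggestive rather than a confirmation of the SPSS formula.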


Best Model Statistics

Model Description
Dependent Variable Preference
Independent Variables 1 Age
Weight Source Age
Power Value 3.200
Model: MOD_1.

Model Summary
Multiple R .719
R Square .517
Adjusted R Square .513
Std. Error of the Estimate .005
Log-likelihood Function Value -211.765

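Above, the summary for the best model reports the log-likelihood function evaluated with age raised to the power 3.2. Its value, -211.765, matches the maximum in the Power Summary table.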


ANOVA

             Sum of Squares   df   Mean Square   F         Sig.
Regression   .003             1    .003          105.089   .000
Residual     .003             98   .000
Total        .005             99

Coefficients

             Unstandardized          Standardized
             B       Std. Error      Beta    Std. Error   t        Sig.
(Constant)   -.671   .376                                 -1.787   .077
Age          .129    .013            .719    .070         10.251   .000

Above, the Weight Estimation procedure prints the parameter estimates and their standard errors after weighting. While the parameter estimates have not changed much, the standard errors are reduced, especially for the constant. The corresponding unweighted estimates and standard errors are repeated below for comparison:

(Constant)       -.694            .825
Age                 .130            .017
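To put numbers on the reduction, the short Python sketch below computes the percentage drop in each standard error, using only the values printed in the unweighted and weighted coefficient tables. The helper name `se_reduction` is hypothetical.

```python
# (B, Std. Error) pairs copied from the OLS and Weight Estimation coefficient tables.
ols = {"(Constant)": (-0.694, 0.825), "Age": (0.130, 0.017)}
wls = {"(Constant)": (-0.671, 0.376), "Age": (0.129, 0.013)}


def se_reduction(before, after):
    """Percentage by which a standard error shrinks going from `before` to `after`."""
    return 100.0 * (1.0 - after / before)


for term in ols:
    se_ols = ols[term][1]
    se_wls = wls[term][1]
    print(f"{term}: SE {se_ols} -> {se_wls} "
          f"({se_reduction(se_ols, se_wls):.1f}% smaller)")
```

On these printed values the standard error of the constant falls by about 54% and that of Age by about 24%, which is why the gain is described as largest for the constant. Because the table values are rounded to three decimals, the percentages are approximate.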


While the Weight Estimation procedure above assures us that WLS is working and outputs basic regression information, we can get much more complete output below by running Analyze, Regression, Linear and entering WGT_1 in the WLS weight box:
REGRESSION
/MISSING LISTWISE
/REGWGT=WGT_1
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT Preference
/METHOD=ENTER Age
/SAVE PRED RESID .

Regression

Notes
Output Created 25-NOV-2006 11:56:03
Comments
Input Data V:\Download\wls_2.sav
Active Dataset DataSet3
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data File 100
Missing Value Handling Definition of Missing User-defined missing values are treated as missing.
Cases Used Statistics are based on cases with no missing values for any variable used.
Syntax REGRESSION
/MISSING LISTWISE
/REGWGT=WGT_1
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT Preference
/METHOD=ENTER Age
/SAVE PRED RESID .
Resources Elapsed Time 0:00:00.03
Memory Required 1468 bytes
Additional Memory Required for Residual Plots 0 bytes
Processor Time 0:00:00.03
Variables Created or Modified PRE_1 Unstandardized Predicted Value
RES_1 Unstandardized Residual

Note the /SAVE PRED RESID subcommand. In the Linear Regression dialog, we clicked the Save button and asked to save unstandardized predicted values and residuals (PRE_1 and RES_1 in the Notes table above) for later use.
[DataSet3] V:\Download\wls_2.sav

Variables Entered/Removed(b,c)
Model Variables Entered Variables Removed Method
1 Age(a) . Enter
a All requested variables entered.
b Dependent Variable: Preference
c Weighted Least Squares Regression - Weighted by Weight for Preference from WLS, MOD_1 AGE** -3.200

Model Summary(b,c)
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .719(a) .517 .513 .00510194
a Predictors: (Constant), Age
b Dependent Variable: Preference
c Weighted Least Squares Regression - Weighted by Weight for Preference from WLS, MOD_1 AGE** -3.200

ANOVA(b,c)
Model           Sum of Squares   df   Mean Square   F         Sig.
1  Regression   .003             1    .003          105.089   .000(a)
   Residual     .003             98   .000
   Total        .005             99
a Predictors: (Constant), Age
b Dependent Variable: Preference
c Weighted Least Squares Regression - Weighted by Weight for Preference from WLS, MOD_1 AGE** -3.200
Above, the regression model as a whole is significant, F(1, 98) = 105.089, p < .001.
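As a consistency check on the printed output, the Python sketch below recomputes F from R Square and compares it with the square of the t statistic for Age, since with a single predictor F equals t squared. It uses only figures reported in the tables; the helper name `f_from_r2` is hypothetical, and small discrepancies are expected because the printed values are rounded.

```python
def f_from_r2(r2, df_regression, df_residual):
    """F statistic implied by R squared and the two degrees of freedom."""
    return (r2 / df_regression) / ((1.0 - r2) / df_residual)


# Weighted run: R Square = .517, t for Age = 10.251, reported F = 105.089.
f_wls = f_from_r2(0.517, 1, 98)
t_sq_wls = 10.251 ** 2

# Unweighted run: R Square = .370, t for Age = 7.589, reported F = 57.592.
f_ols = f_from_r2(0.370, 1, 98)
t_sq_ols = 7.589 ** 2

print(f"WLS: F from R2 = {f_wls:.1f}, t squared = {t_sq_wls:.1f}, reported 105.089")
print(f"OLS: F from R2 = {f_ols:.1f}, t squared = {t_sq_ols:.1f}, reported 57.592")
```

Both routes agree with the reported F values to within rounding, which confirms that the weighted and unweighted tables are internally consistent.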

Coefficients(a,b)
Model           Unstandardized B   Std. Error   Standardized Beta   t        Sig.
1  (Constant)   -.671              .376                             -1.787   .077
   Age          .129               .013         .719                10.251   .000
a Dependent Variable: Preference
b Weighted Least Squares Regression - Weighted by Weight for Preference from WLS, MOD_1 AGE** -3.200
Above, the same (improved) parameter estimates and standard errors are printed as in the Weight Estimation run earlier.

Residuals Statistics(b,c)

Minimum Maximum Mean Std. Deviation N
Predicted Value 1.9047227 8.3447857 5.1569543 2.06182603 100
Residual -7.34478521 6.23123980 .05304569 2.72447131 100
Std. Predicted Value(a) . . . . 0
Std. Residual(a) . . . . 0
a Not computed for Weighted Least Squares regression.
b Dependent Variable: Preference
c Weighted Least Squares Regression - Weighted by Weight for Preference from WLS, MOD_1 AGE** -3.200

GRAPH
/SCATTERPLOT(BIVAR)=PRE_2 WITH RES_2
/MISSING=LISTWISE .

Graph

Notes
Output Created 25-NOV-2006 11:58:20
Comments
Input Data V:\Download\wls_2.sav
Active Dataset DataSet3
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data File 100
Syntax GRAPH
/SCATTERPLOT(BIVAR)=PRE_2 WITH RES_2
/MISSING=LISTWISE .
Resources Elapsed Time 0:00:00.45
Processor Time 0:00:00.44

[DataSet3] V:\Download\wls_2.sav

Scatter of RES_1 PRE_1

Above, for instructional purposes, Graph, Scatterplot is run to plot the unweighted predicted values against the unweighted residuals, again showing the heteroscedastic, funnel-like pattern of errors prior to applying WLS.



GRAPH
/SCATTERPLOT(BIVAR)=PRED WITH RESID
/MISSING=LISTWISE .

Graph

Notes
Output Created 25-NOV-2006 12:02:31
Comments
Input Data V:\Download\wls_2.sav
Active Dataset DataSet3
Filter <none>
Weight <none>
Split File <none>
N of Rows in Working Data File 100
Syntax GRAPH
/SCATTERPLOT(BIVAR)=PRED WITH RESID
/MISSING=LISTWISE .
Resources Elapsed Time 0:00:00.44
Processor Time 0:00:00.42

[DataSet3] V:\Download\wls_2.sav

Scatter of RESID PRED
Finally, weighted predictions are plotted against weighted residuals, showing a much more random and homogeneous pattern of error. Put another way, OLS regression has in effect been run without violating the assumption of homoscedasticity, because we transformed the data using appropriate weights calculated by the Weight Estimation procedure.
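The idea that WLS amounts to OLS on transformed data can be illustrated with a short Python sketch. It assumes the usual transformation, in which the constant, the predictor, and the dependent variable are each multiplied by the square root of the case weight; the output above does not show how the plotted PRED and RESID variables were computed, so this scaling is an assumption. The data are simulated and the function names are hypothetical.

```python
import math
import random


def wls_fit(x, y, w):
    """Closed-form weighted least squares for y = a + b*x with case weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def ols_two_columns(z1, z2, y):
    """OLS of y on columns z1 and z2 with no extra intercept (2x2 normal equations)."""
    s11 = sum(a * a for a in z1)
    s12 = sum(a * b for a, b in zip(z1, z2))
    s22 = sum(b * b for b in z2)
    t1 = sum(a * c for a, c in zip(z1, y))
    t2 = sum(b * c for b, c in zip(z2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


# Simulated data (an assumption), weighted by age ** -3.2 as in the run above.
rng = random.Random(1)
age = [rng.uniform(20, 70) for _ in range(100)]
pref = [-0.7 + 0.13 * a + rng.gauss(0, 0.004 * a ** 1.6) for a in age]
weight = [a ** -3.2 for a in age]
root_w = [math.sqrt(w) for w in weight]

# Route 1: weighted least squares on the original data.
a_wls, b_wls = wls_fit(age, pref, weight)

# Route 2: ordinary least squares on data scaled by sqrt(weight).
a_ols, b_ols = ols_two_columns(
    root_w,
    [r * a for r, a in zip(root_w, age)],
    [r * p for r, p in zip(root_w, pref)],
)

# Weighted residuals: raw residuals scaled by sqrt(weight).
weighted_resid = [r * (p - a_wls - b_wls * a) for r, a, p in zip(root_w, age, pref)]
print(f"WLS: {a_wls:.4f}, {b_wls:.4f}   transformed OLS: {a_ols:.4f}, {b_ols:.4f}")
```

The two routes give the same intercept and slope up to floating-point error. Under this assumed scaling, it is the weighted residuals, not the raw ones, that should show the more even spread.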





© 2006, 2007 G. David Garson