Apart from its main function of correcting for heteroscedasticity, WLS regression is sometimes also used to adjust the fit so that distant points and outliers, or observations thought to be less reliable, receive less weight.
SPSS: WLS in SPSS is a two-step process. First, the "Weight Estimation" module (Analyze, Regression, Weight Estimation) is used to calculate the optimal weights, which are saved using the Save button. Then the OLS regression is run on the weighted cases (Analyze, Regression, Linear). Note that the weights are entered in the "WLS weights" box of the linear regression dialog, not under Data, Weight Cases (which is ignored). Note also that the "Weight Estimation" routine itself generates the estimated regression coefficients and their standard errors, but running the OLS regression on weighted cases gives fuller output, including, for instance, residuals useful for analyzing patterns of error.
That is, the y-dimension spread of points may increase (or decrease), often exponentially, as x increases linearly. It may simultaneously be true that the x-dimension spread of points increases (or decreases), often exponentially, as y increases linearly. In the figure above, the variance in Preference increases for higher values of Age.
In more unusual cases, violation of homoscedasticity may display a "u-shaped" distribution (narrow in the middle, expanding to a funnel at both ends), or other non-homogeneous distributions. A violation of homoscedasticity can be seen after running OLS regression: a plot of standardized predicted values on the x axis and studentized residuals on the y axis will also show a "funnel" or other heteroscedastic shape rather than the desired random cloud of residual (error) values all along the x axis.


In this case the power value which maximizes the likelihood function is 3.0. In the annotated output below, a finer set of steps reveals that 3.2 is the best value. In this run, however, 3.0 is used as the power value to generate the WGT_1 values, which are added to the dataset as illustrated below:
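For readers who want to see the idea behind the power search outside SPSS, the following is a minimal Python sketch, not SPSS's actual algorithm. It assumes the error variance is proportional to the weight variable raised to a power, so that the weights are 1 / x**power, and it picks the power on a grid that gives the largest Gaussian log-likelihood. The function names (wls_fit, log_likelihood, best_power), the simulated age and pref data, and the grid of candidate powers are all hypothetical and chosen only for illustration.

```python
import math
import random


def wls_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


def log_likelihood(x, y, power):
    """Gaussian log-likelihood, assuming Var(e_i) = sigma^2 * x_i**power."""
    n = len(x)
    w = [xi ** -power for xi in x]  # weights = 1 / x**power (x must be > 0)
    a, b = wls_fit(x, y, w)
    # Maximum-likelihood estimate of sigma^2 from the weighted residuals
    sigma2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / n
    return (-0.5 * n * (math.log(2 * math.pi) + math.log(sigma2) + 1)
            + 0.5 * sum(math.log(wi) for wi in w))


def best_power(x, y, powers):
    """Return the candidate power with the highest log-likelihood."""
    return max(powers, key=lambda p: log_likelihood(x, y, p))


# Simulated (hypothetical) data: the error standard deviation grows as
# age**1.5, so the error variance grows as age**3.
random.seed(0)
age = [random.uniform(20, 70) for _ in range(2000)]
pref = [5 + 0.3 * a + random.gauss(0, 0.01 * a ** 1.5) for a in age]

grid = [k * 0.5 for k in range(11)]      # candidate powers 0.0, 0.5, ..., 5.0
power = best_power(age, pref, grid)
weights = [a ** -power for a in age]     # plays the role of WGT_1
print("selected power:", power)
```

A finer grid around the selected value would refine the estimate, in the same way the finer set of steps does in the SPSS output.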


After saving, use Transform, Compute to create weighted predicted and weighted residual values: e.g., WTDRESID = SQRT(WGT_1) * RES_1; and WTDPRED = SQRT(WGT_1) * PRE_1. That is, the unstandardized residual and predicted variables are multiplied by the square root of the WLS weights to get their weighted equivalents. (Of course, if you have run the regression a second time, you will be dealing with RES_2 and PRE_2, etc.)

In the figure above, running linear regression with the Save option gives the PRE_1 and RES_1 columns for the unstandardized predicted and residual values, respectively, and the Transform, Compute operations create the WTDRESID and WTDPRED columns.
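The Transform, Compute step amounts to multiplying each saved value by the square root of its weight. As a rough illustration outside SPSS, here is a small Python sketch; the function name weighted_values and the sample numbers are hypothetical, and the lists stand in for the PRE_1, RES_1 and WGT_1 columns.

```python
import math


def weighted_values(pred, resid, weights):
    """Scale predicted values and residuals by the square root of the weights."""
    roots = [math.sqrt(w) for w in weights]
    wtd_pred = [r * p for r, p in zip(roots, pred)]    # like WTDPRED
    wtd_resid = [r * e for r, e in zip(roots, resid)]  # like WTDRESID
    return wtd_pred, wtd_resid


# Two made-up cases: one with a large weight, one with a small weight
wtd_pred, wtd_resid = weighted_values(
    pred=[10.0, 20.0], resid=[1.0, -2.0], weights=[4.0, 0.25]
)
print(wtd_pred)   # [20.0, 10.0]
print(wtd_resid)  # [2.0, -1.0]
```

Cases with small weights, which are those with large error variance, have their residuals shrunk, which is why the weighted residual plot no longer fans out.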
As a final step, one uses Graph, Scatterplot to plot WTDPRED on the x axis and WTDRESID on the y axis, illustrated below:

Comparing this chart with the initial one above, it can be seen that the funnel shape has disappeared. Because of the reduction in heteroscedasticity, the standard errors will be smaller, but the coefficient estimates will be very similar. The table below shows the estimated b coefficient for age and its standard error, with output above for OLS and below for WLS:
