Principles and Practice of Structural Equation Modeling

Rex B. Kline

 

NY: Guilford Press, 1998.

ISBN 1-57230-337-9 (pbk.). List $35.00.

 

 

Chapter 2: Basic Statistical Concepts

 

[pp. 15-30 assigned for Week 3, Correlation and Partial Correlation]

 

1. What are exogenous and endogenous variables?

Loosely, exogenous variables are independents and endogenous ones are the dependents in a model. The exogenous variables in a model have no incoming arrows except error. Endogenous variables do have arrows incoming from other variables in the model.

 

2. What is standardization and when, in general, does one prefer standardization?

Data are standardized by subtracting the mean and dividing by the standard deviation, making all variables comparable because all wind up with means of 0 and standard deviations of 1. Standardized statistical coefficients are ones computed on standardized data or by algorithms which have the same effect.

 

You want standardization when you want variables to be comparable. Let the relation of income and conservatism be equally strong as the relation of education and conservatism. A standardized measure such as correlation will yield the same magnitude coefficient for both relationships. An unstandardized measure, such as covariance, will not. Covariance will be larger for the relation involving a larger metric (income) than for one measured in small units (education). Covariance will also be larger for the variables with the larger standard deviations.
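The contrast between covariance and correlation can be illustrated with a small sketch. The data and the variable names (income_dollars, conservatism) are hypothetical, invented only for this illustration. The sketch rescales income from dollars to thousands of dollars and compares how the two coefficients respond.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson r: covariance divided by the product of standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data, not taken from Kline.
income_dollars = [20000, 35000, 50000, 65000, 80000]
conservatism = [2.0, 3.0, 3.5, 5.0, 4.5]
income_thousands = [x / 1000 for x in income_dollars]

cov_dollars = covariance(income_dollars, conservatism)
cov_thousands = covariance(income_thousands, conservatism)
r_dollars = correlation(income_dollars, conservatism)
r_thousands = correlation(income_thousands, conservatism)

print("covariance (dollars):  ", cov_dollars)
print("covariance (thousands):", cov_thousands)
print("correlation (dollars):  ", r_dollars)
print("correlation (thousands):", r_thousands)
```

Changing the metric changes the covariance by the same factor as the rescaling, while the correlation stays the same.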

 

You do not want standardization, however, when you need to compare means and variances since, by definition, standardization gets rid of differences on these bases. ANOVA and SEM are examples of procedures which want to look at patterns of covariances and thus need unstandardized data.                

 

3. What is the difference between Pearsonian and point-biserial correlation?

Pearson’s r is for the correlation of two interval variables. Point-biserial r is for the correlation of an interval with a dichotomy.

 

SPSS computes point-biserial automatically, using an exact method. Decades ago, in the age of manual computation, there were separate formulas for approximating point-biserial correlation and for Pearsonian correlation.
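As a hedged illustration, the sketch below compares a classic hand-computation formula for point-biserial r with Pearson's r applied to the same data, with the dichotomy coded 0/1. The data and function names are hypothetical. The hand formula used here is (M1 - M0)/s times the square root of p(1 - p), where s is the standard deviation computed with n in the denominator and p is the proportion of cases coded 1.

```python
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def point_biserial(group, scores):
    """Hand formula; group must be coded 0/1."""
    n = len(scores)
    ones = [s for g, s in zip(group, scores) if g == 1]
    zeros = [s for g, s in zip(group, scores) if g == 0]
    m1, m0 = sum(ones) / len(ones), sum(zeros) / len(zeros)
    m = sum(scores) / n
    sd_pop = sqrt(sum((s - m) ** 2 for s in scores) / n)  # n, not n - 1
    p = len(ones) / n
    return (m1 - m0) / sd_pop * sqrt(p * (1 - p))

# Hypothetical data: a 0/1 dichotomy and an interval-level score.
group = [0, 0, 0, 1, 1, 1, 1, 0]
scores = [10.0, 12.0, 11.0, 15.0, 17.0, 14.0, 16.0, 13.0]

r_pb = point_biserial(group, scores)
r = pearson_r(group, scores)
print(r_pb, r)
```

On 0/1-coded data the two computations agree, which is why a general Pearson routine suffices.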

 

4. Is a criterion variable an independent or dependent?

Dependent. The independent in this vocabulary is the predictor variable.

 

5. If y is the dependent and x is the independent, do we speak of “the regression of x on y” or “the regression of y on x”?

The latter. The “on” variable is the independent or predictor. It is short for “y predicted on the basis of knowing x”.

 

6. What coefficient is the slope of the straight line which best summarizes a pattern of dots on a scattergraph?

The unstandardized regression coefficient, which is the change in y for each unit change in x, the predictor.

 

7. Why is ordinary least-squares regression called that?

It is "ordinary" because it is the most common type of regression, and "least squares" because the regression line is chosen to minimize the sum of squared vertical distances from the points in a scatterplot to the line.
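A minimal sketch of the least-squares criterion for one predictor follows. The data are hypothetical and the function names (ols_fit, sum_squared_errors) are invented for this illustration. The slope is the sum of cross-products divided by the sum of squares of x, and the intercept makes the line pass through the two means.

```python
def ols_fit(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx          # unstandardized regression coefficient
    a = my - b * mx        # line passes through (mean x, mean y)
    return a, b

def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = ols_fit(xs, ys)
best = sum_squared_errors(a, b, xs, ys)
print("intercept:", a, "slope:", b, "SSE:", best)

# Any other line gives a larger sum of squared errors.
print(sum_squared_errors(a, b + 0.1, xs, ys) > best)
print(sum_squared_errors(a + 0.1, b, xs, ys) > best)
```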

 

8. What is the assumption of independent error in correlation and regression, and what difference does its violation make?

Residuals (the differences between observed and predicted values of y) should be random, that is, independent of one another. Particularly when one has time series data, there is a possibility that error at time 1 will be a large determinant of error at time 2. This is called autocorrelation.

 

To the extent this assumption is violated, the reliability of associated measures of significance of the r or b coefficients will be compromised. This is a non-trivial problem. If one has reason to suspect possible autocorrelation, one should test for it.

 

(Not discussed in Kline): The Durbin-Watson coefficient, d, is a test for autocorrelation. The value of d ranges from 0 to 4. A value of 2 indicates no autocorrelation; 0 indicates positive autocorrelation; and 4 indicates negative autocorrelation. For a given level of significance such as .05, there is an upper and a lower d value limit. If the computed Durbin-Watson d value for a given series is more than the upper limit for the case of positive serial correlation, the null hypothesis is not rejected. If the computed d value is less than the lower limit, the null hypothesis is rejected. If the computed value is in between the two limits, the result is inconclusive. For the case of negative first-order serial correlation d must be more than (4 - d-lower-limit) to reject the null hypothesis.
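The Durbin-Watson d described above can be sketched in a few lines. The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The two residual series below are hypothetical, chosen only to show the two extremes. The upper and lower limits come from published tables and are not computed here.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals: runs of like-signed errors (positive autocorrelation).
positive_series = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
# Hypothetical residuals: sign flips every period (negative autocorrelation).
negative_series = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

d_positive = durbin_watson(positive_series)
d_negative = durbin_watson(negative_series)
print("d, positively autocorrelated:", d_positive)
print("d, negatively autocorrelated:", d_negative)
```

The first series gives a d well below 2 and the second a d well above 2, matching the interpretation given above.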

 

 

9. How might the truncation of the range of a variable (ex., by reducing from 7 to 3 the number of scale points by which it is measured) affect correlation?

It typically leads to attenuation or lowering of the correlation coefficient.

 

10. How might correlation be affected by the fact two variables have differently-shaped underlying distributions (ex., right-skewed v. left-skewed, or normal v. bipolar)?

This also typically leads to attenuation. The maximum correlation will be less than 1.

 

11. How might correlation be affected when data reliability is low?

This also typically leads to attenuation.

 

12. How might nonlinearity in the relationship affect linear correlation?

This is yet another factor which makes the measured correlation lower than the actual strength of the relationship, because Pearson's r captures only the linear component of the association.

 

13. What is an interaction effect? How are they assessed in correlation and regression?

It is a third, moderator variable whose value affects the relationship of two original variables. As such it is a form of control variable. In correlation they are explored through partial correlation. In regression they are explored through adding cross-product interaction terms to the model.

 

14. (Omitted because covered in Tacq).  Explain spurious correlation.

A correlation between two variables is fully spurious if the entire correlation can be explained by a third variable, control of which will cause the correlation of the first two to be zero. This happens under two circumstances: (1) the third variable is the common cause of the first two, which do not actually affect each other but which covary due to their common parent; or (2) the third variable intervenes between the first two and all causation must flow through it, with neither original variable affecting the other directly.

 

Spurious correlation is checked by partial correlation, using the third variable as the control. The partial correlation goes to 0 if the third variable is a full control (a common cause or an intervening variable).  Obviously, there are many situations in the middle, where a relationship is partly spurious and partly real. In these circumstances, partial correlation will drop only part way to 0 compared to the correlation of the two original variables.
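The first-order partial correlation can be sketched directly from the three bivariate correlations, using the standard formula (r_xy - r_xz*r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The correlation values below are hypothetical, picked to show a fully spurious case and a partly spurious case.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y controlling for z (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical case 1: r_xy equals r_xz * r_yz, so z accounts for all of it.
fully_spurious = partial_correlation(r_xy=0.30, r_xz=0.60, r_yz=0.50)

# Hypothetical case 2: r_xy exceeds r_xz * r_yz, so only part is spurious.
partly_spurious = partial_correlation(r_xy=0.50, r_xz=0.60, r_yz=0.50)

print("fully spurious: ", fully_spurious)
print("partly spurious:", partly_spurious)
```

In the first case the partial drops to 0. In the second it drops only part of the way from the original .50.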

 

 [pp. 30-46 are assigned for week 4, Regression]

 

                                                           

1. What is “the problem of omitted variables” in regression?

The omission of causally important variables will affect all the regression coefficients with which the omitted variable(s) is/are related. Such omission is called specification error.

 

If relevant variables are omitted from the model, the common variance they share with included variables may be wrongly attributed to those variables, and the error term is inflated. If causally irrelevant variables are included in the model, the common variance they share with included variables may be wrongly attributed to the irrelevant variables. The more the correlation of the irrelevant variable(s) with other independents, the greater the standard errors of the regression coefficients for these independents. Omission and irrelevancy can both affect substantially the size of the b and beta coefficients. This is one reason why it is better to use regression to compare the relative fit of two models rather than to seek to establish the validity of a single model specification.

 

The specification problem in regression is analogous to the problem of spuriousness in correlation, where a given bivariate correlation may be inflated because one has not yet introduced control variables into the model by way of partial correlation.

 

Note that when the omitted variable has a suppressing effect, coefficients in the model may underestimate rather than overestimate the effect of those variables on the dependent.

 

2.  What is suppression in the context of regression?

Suppression is when the beta weight in regression or r in correlation is lower than what it would be when a control variable is introduced.  A control is introduced in regression simply by adding the omitted variable to the equation. It is introduced in correlation by partial correlation.

 

This is the opposite of the usual situation, where the omission of a variable in regression leads to a beta for an included variable which overestimates the effect of that variable. This happens because the causal importance of the omitted variable is taken on by the included variable.

 

Suppression occurs when the omitted variable is positively related to the included independent and negatively to the included dependent (or vice versa). In this situation, the covariance of the two included variables is less than it would be if somehow the third variable had no effect. When suppression is present, the betas for the included variable underestimate the power of its effect on the dependent. In fact, a beta may be 0, hiding an important independent variable’s effect!

 

When suppression is suspected, backward elimination is used as an option in stepwise regression.

 

3. What models can structural equation modeling handle that regression can’t?

Regression can’t handle models which have latent variables, nor can it handle models which posit that error (residuals) is correlated. Rather, in regression, all variables are indicator variables and correlated error violates one of its assumptions.


CHAPTER THREE: SEM FAMILY TREE

 

[pp. 47-55 are assigned for path analysis]

 

1. In section 3.4 Kline discusses a structural model of delinquency. He notes that “A standard statistical technique that could be used here is multiple regression. Two separate analyses could be conducted....” (p. 51). What are these two analyses and how do they relate to path analysis?

The model has three exogenous variables (social class, motivation, verbal ability) and two endogenous variables (achievement, delinquency). The two analyses are the two regressions, one for each endogenous variable. In these regressions, one endogenous variable is the dependent and the independents are all the other variables with arrows going to that endogenous variable. The beta weights from these regressions are the path coefficients.

           

2. What is a disturbance term?

This is a phrase for residual error, which is (1 - R²). It represents the sum effect of all unmeasured variables. Some authors, Kline included, reserve “disturbance term” for SEM and use “residual error” for regression models.

                                                                                                                       

 

[Chapter 3, pp. 55-66 are assigned for topic 10, factor analysis]

 

1. What is a “measurement model”?

A measurement model is the set of causal specifications posited by the researcher, often in the form of a circles-and-arrows diagram. Arrows indicate causal effects thought by the researcher to be possible and lack of arrows, of course, indicates the posited absence of a causal connection.

 

2. How does the measurement model differ in exploratory vs. confirmatory factor analysis? Explain in terms of Kline, Figures 3.2 and 3.3.

In EFA the researcher assumes all indicators may be related to all factors and is looking to see which indicators sort themselves out onto which factors. That is why Fig. 3.2, for EFA, has an arrow from each factor to every indicator.

 

 In CFA, the researcher posits in advance which variables are associated with which factors and is looking to see if the indicators sort as predicted. Thus in Fig. 3.3, each factor has arrows to a unique subset of indicators.

 

3. What does the double-headed arrow between the two factors mean in Kline’s measurement model figures? What type of factor rotation is implied?

It indicates correlation of the factors. The normal forms of rotation yield orthogonal factors. Because factors are here correlated, oblique factor rotation should be used to conform to the measurement model.

4. What factor analysis finding indicates convergent validity?

In CFA, a finding that indicators have high loadings on the predicted factors indicates convergent validity.

 

5. What factor analysis finding indicates discriminant validity?

In an oblique rotation, the correlation between factors is not so high (ex., > .85) as to lead one to think the two factors overlap conceptually.

 

6. Structural equation modeling combines aspects of path analysis on the one hand and (a) EFA or (b) CFA?

CFA. The researcher specifies beforehand the relation between the indicators and the latent variables (factors). Two or more such models may be evaluated.

 

 

 

 

 

 

 


CHAPTER 4: DATA PREPARATION AND SCREENING

 

1. Why is listwise deletion recommended over pairwise deletion for handling missing values in SEM?

Listwise deletion means a case with missing values is ignored in all calculations. Pairwise means it is ignored only for calculations involving that variable. However, the pairwise method can result in correlations or covariances which are outside the range of the possible (see Kline, p. 76). This in turn can lead to covariance matrices which are singular (aka, non-positive definite), preventing such math operations as inverting the matrix, because division by zero will occur. This problem does not occur with listwise deletion. Given that SEM uses covariance matrices as input, listwise deletion is recommended (or some form of estimation of missing values, such as substituting mean values).
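To illustrate the out-of-range problem, the sketch below checks whether a 3 x 3 correlation matrix is positive definite by testing that all leading principal minors are positive (Sylvester's criterion). Both matrices are hypothetical. The first stands in for a set of correlations each computed on a different subset of cases, as can happen under pairwise deletion. The function names are invented for this illustration.

```python
def leading_minors_3x3(m):
    """Return the three leading principal minors of a 3x3 matrix."""
    minor1 = m[0][0]
    minor2 = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    minor3 = (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )
    return minor1, minor2, minor3

def is_positive_definite_3x3(m):
    """Sylvester's criterion: all leading principal minors must be > 0."""
    return all(minor > 0 for minor in leading_minors_3x3(m))

# Hypothetical pairwise-deletion result: x correlates .9 with both y and z,
# yet y and z correlate -.9, which no single complete data set could produce.
inconsistent = [
    [1.0, 0.9, 0.9],
    [0.9, 1.0, -0.9],
    [0.9, -0.9, 1.0],
]

# Hypothetical matrix that a single complete data set could produce.
consistent = [
    [1.0, 0.9, 0.9],
    [0.9, 1.0, 0.9],
    [0.9, 0.9, 1.0],
]

print("inconsistent matrix usable:", is_positive_definite_3x3(inconsistent))
print("consistent matrix usable:  ", is_positive_definite_3x3(consistent))
```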

 

2. Why is complete or very high multicollinearity a problem in SEM?

For the same reason. Complete multicollinearity is assumed to be absent in SEM because it will result in singular covariance matrices, which are ones on which one cannot perform certain calculations (ex., matrix inversion) because division by zero will occur. Very high multicollinearity can result in matrix entries which approach 0 and while division can occur, results will be unstable. Hence complete or very high multicollinearity prevents a SEM solution.

 

3. How is high multicollinearity tested?

Inspection of the correlation matrix reveals only bivariate multicollinearity, for bivariate correlations > .90. To assess multivariate multicollinearity, one uses tolerance or VIF.

 

Tolerance is 1 - R² for the regression of that independent variable on all the other independents, ignoring the dependent. There will be as many tolerance coefficients as there are independents. The higher the intercorrelation of the independents, the more the tolerance will approach zero. Tolerance is part of the denominator in the formula for calculating the confidence limits on the b (partial regression) coefficient. When tolerance is close to 0 there is high multicollinearity of that variable with other independents and the b and beta coefficients will be unstable. The more the multicollinearity, the lower the tolerance, the more the standard error of the regression coefficients.

 

Variance-inflation factor, VIF. VIF is the variance inflation factor, which is simply the reciprocal of tolerance. Therefore, when VIF is high there is high multicollinearity and instability of the b and beta coefficients. VIF and tolerance are found in the SPSS output section on collinearity statistics. The inflationary impact on the standard error of the regression coefficient (b) of the jth independent variable can be tabulated for various levels of multiple correlation (Rj), tolerance, and VIF (adapted from Fox, 1991: 12). Note that an impact of 1.0 corresponds to no impact, 2.0 to doubling the standard error, etc. Standard error is doubled when VIF is 4.0 and tolerance is .25, corresponding to Rj = .87. This is an arbitrary but common cut-off criterion for deciding when a given independent variable displays "too much" multicollinearity.
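The relations among Rj, tolerance, VIF, and the impact on the standard error can be computed directly from the definitions above: tolerance = 1 - Rj², VIF = 1/tolerance, and the standard error is multiplied by the square root of VIF. The sketch below is computed from those definitions and is not a reproduction of Fox's table. The Rj values and the function name are chosen only for illustration.

```python
from math import sqrt

def collinearity_stats(r_j):
    """Return (tolerance, VIF, standard-error multiplier) for a given Rj."""
    tolerance = 1 - r_j ** 2
    vif = 1 / tolerance
    se_impact = sqrt(vif)  # factor by which the standard error of b is inflated
    return tolerance, vif, se_impact

print("Rj     tolerance   VIF     SE impact")
for r_j in (0.0, 0.5, 0.87, 0.95):
    tolerance, vif, se_impact = collinearity_stats(r_j)
    print(f"{r_j:<6.2f} {tolerance:<11.3f} {vif:<7.2f} {se_impact:.2f}")

# The cut-off case from the text: tolerance .25 and VIF 4.0 double the
# standard error. Rj is exactly sqrt(.75), which rounds to .87.
tol_cut, vif_cut, impact_cut = collinearity_stats(sqrt(0.75))
```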

 

4. What are outliers, and how are they detected?

Outliers are extreme, untypical cases. Often the researcher wants to explain them on some separate basis from the main model, and therefore wishes to eliminate them from the analysis.

 

Kline (p. 83) notes that alternatively, the researcher may use transforms which tend to “pull in” outliers. These include square root, logarithmic, and inverse (x = 1/x) transforms.

 

Univariate outliers can be spotted by some rule of thumb, such as cases more than 3 standard deviations from the mean. Multivariate outliers are detected by coefficients like the Mahalanobis distance or Cook’s distance.

 

The leverage statistic, h, also called the hat-value, is available to identify cases which influence the regression model more than others. The leverage statistic varies from 0 (no influence on the model) to 1 (completely determines the model). A rule of thumb is that cases with leverage under .2 are not a problem, but if a case has leverage over .5, the case has undue leverage and should be examined for the possibility of measurement error or the need to model such cases separately.
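For the special case of one predictor, leverage has a simple closed form: h = 1/n + (x - mean of x)² / (sum of squared deviations of x). The sketch below applies it to hypothetical data with one extreme x value and flags cases using the .5 rule of thumb given above. The function name and data are invented for this illustration. With several predictors, leverage comes from the diagonal of the hat matrix, which is not shown here.

```python
def leverages(xs):
    """Hat-values for a simple (one-predictor) regression with an intercept."""
    n = len(xs)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return [1 / n + (x - mx) ** 2 / sxx for x in xs]

# Hypothetical predictor values; the last case is far from the others.
xs = [1, 2, 3, 4, 20]

h = leverages(xs)
flagged = [i for i, value in enumerate(h) if value > 0.5]

for i, value in enumerate(h):
    print(f"case {i}: x = {xs[i]:>2}, leverage = {value:.3f}")
print("cases with leverage over .5:", flagged)
```

Only the extreme case is flagged. In this one-predictor setting the hat-values sum to 2, the number of estimated coefficients.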

 

Cook's distance, D, is another measure of the influence of a case (see the output example). Cases with larger D values than the rest of the data are those which have unusual leverage. Fox (1991: 34) suggests as a cut-off for detecting influential cases, values of D greater than 4/(n - k - 1), where n is the number of cases and k is the number of independents.

 

Studentized residuals are also used to detect outliers with high leverage. The studentized residual is also called the deleted studentized residual because its calculation involves leaving out one case in turn for each of the cases. Other terms include externally studentized residual or, misleadingly, standardized residual. In a plot of studentized residuals, one may draw lines at plus and minus two standard units to highlight cases outside the range where 95% of the cases normally lie.

 

Partial regression plots, also called partial regression leverage plots or added variable plots, are yet another way of detecting influential sets of cases. Partial regression plots are a series of bivariate regression plots of the dependent variable with each of the independent variables in turn. The plots show cases by number or label instead of dots. One looks for cases which are outliers on all or many of the plots.

 

5. What type of data transforms  tend to “pull in” outliers?

 These include square root, logarithmic, and inverse (x = 1/x) transforms.

 

6. What type of data transforms tend to normalize positively skewed data?

The same ones. For negative skew, use powers.

 


CHAPTER 5: STRUCTURAL MODELS WITH OBSERVED VARIABLES AND PATH ANALYSIS: I. FUNDAMENTALS, RECURSIVE MODELS

 

[pp. 95-125, 150-154 are assigned for  path analysis]

 

[pp. 95-125]

 

1. What is the specification issue in path analysis?

If relevant causal variables are omitted, then the direct and indirect effects will not be measured accurately.

 

2. Explain parameters and observations in path models, and why a solution is impossible when there are more parameters than observations. Relate this to the concept of model identification.

Observations are coefficients the researcher has for the model. The total of observations is the total number of coefficients that can be plugged into equations used to estimate the unknown parameters, such as the path coefficients. Let v be the number of observed variables in the model. Then the number of observations is the number of variances and covariances, equal to [v(v+1)/2]. For instance, 4 variables have 4 variances and 6 unique covariances = [4(4+1)/2] = 10 observations.

 

Parameters are what can vary in the model. What can vary are the path coefficients for any arrows, and the variances and covariances of the exogenous variables and the disturbance terms. Total parameters are equal to [p + e + d], where p is the number of straight arrows in the model, denoting the paths; e is the number of variances and covariances of the exogenous variables; and d is the number of variances and covariances of the disturbance terms.

 

If there are 4 variables in the model, p could be 6 (arrows from A to B, C, and D; from B to C and D; and from C to D). If there is one exogenous variable, then e = [1(1+1)/2] = 1. If there are three endogenous variables, then d =[3(3+1)/2] = 6.  Thus the number of parameters could be 6 + 1 + 6 = 13. Recall the number of observations for this example is 10. When there are more things that can vary in the model (parameters) than there are fixed facts (observations), the model is too complex to be solved. That is, in this case there are [13 - 10] = 3 more parameters than observations.

 

Underidentified models are ones which are not solvable because they have more parameters than observations. Recursive models are never underidentified. Recursive models are ones where the researcher assumes covariances of disturbance terms are all 0, and where all arrows are unidirectional (no feedback loops). In the example above, p might still be 6 (6 arrows) and e might still be 1 (variance of the one exogenous variable), but since it is assumed there are no covariances among the three disturbance terms, d would be 3 (the variances of the three disturbance terms). This is 6 + 1 + 3 = 10, which is the same as the number of observations, so the recursive model would be identified.
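The counting in the four-variable example can be sketched as follows. The function names are invented for this illustration, and the rule for d (all variances and covariances of the disturbances when they may covary, variances only in a recursive model) follows the description above.

```python
def n_observations(v):
    """Number of variances and unique covariances among v observed variables."""
    return v * (v + 1) // 2

def n_parameters(paths, n_exogenous, n_endogenous, recursive):
    """Total parameters p + e + d as described in the text."""
    e = n_exogenous * (n_exogenous + 1) // 2
    if recursive:
        d = n_endogenous  # disturbance variances only; covariances fixed at 0
    else:
        d = n_endogenous * (n_endogenous + 1) // 2
    return paths + e + d

observations = n_observations(4)
with_correlated_disturbances = n_parameters(6, 1, 3, recursive=False)
recursive_model = n_parameters(6, 1, 3, recursive=True)

print("observations:", observations)
print("parameters, disturbances free to covary:", with_correlated_disturbances)
print("parameters, recursive model:", recursive_model)
print("excess parameters:", with_correlated_disturbances - observations)
```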

 

If a model is underidentified, then one must do one or more of the following: (1) simplify the model by reducing the number of arrows, and/or (2) add exogenous variables (which, of course, is usually possible only if this need is considered prior to gathering data).

 

3. In relation to parameters, what sample size does Kline recommend?

He recommends 10 times as many cases as parameters (or ideally 20 times). He states that 5 times or less is insufficient for significance testing of model effects.

 

4. What is the effect size of a disturbance term? What is its variance?

When you are computing the betas (path coefficients) for a given endogenous variable, in a regression in which it is the dependent and those with arrows to it are independents, you will also get an R² value. The effect size of the disturbance term, which reflects unmeasured variables, is (1 - R²), and its variance is (1 - R²) times the variance of that endogenous variable.

 

5. What is the estimate of the correlation and covariance between two disturbance terms?

The correlation between two disturbance terms is the partial correlation of the two endogenous variables, using as controls all their common causes (all variables with arrows to both). The covariance estimate is the partial covariance: the partial correlation times the product of the standard deviations of the two endogenous variables.

 

6. How are indirect effect sizes calculated based on betas (path coefficients)?

Simply multiply out along the path, from the starting variable through the mediating variable(s) to the dependent variable.

 

7. How does one compute the total effect size, reflecting both direct and indirect effects of one variable on another?

Simply add the direct and indirect effect sizes.

 

8. What is “effects decomposition”?

Listing direct, indirect, and total effects for each causal variable with respect to each endogenous variable.
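A small sketch of effects decomposition for a hypothetical recursive path model with three variables (A to B, B to C, and A to C) follows. The path coefficients are invented for this illustration. The total effect is found by summing the products of coefficients along every directed route from source to target, and the indirect effect is the total minus the direct effect. The recursion assumes a recursive model, that is, one with no feedback loops.

```python
# Hypothetical standardized path coefficients, keyed by (cause, effect).
paths = {
    ("A", "B"): 0.5,
    ("B", "C"): 0.4,
    ("A", "C"): 0.3,
}

def total_effect(paths, source, target):
    """Sum over all directed routes of the product of coefficients en route."""
    if source == target:
        return 1.0
    return sum(
        coefficient * total_effect(paths, effect, target)
        for (cause, effect), coefficient in paths.items()
        if cause == source
    )

direct = paths.get(("A", "C"), 0.0)
total = total_effect(paths, "A", "C")
indirect = total - direct

print("direct effect of A on C:  ", direct)
print("indirect effect of A on C:", indirect)   # via B: 0.5 * 0.4
print("total effect of A on C:   ", total)
```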

 

9. Explain the tracing rule and model-estimated correlation.

The direct, indirect, and total effects are examples of model estimates. The tracing rule is a rule for identifying all the paths, the sum of effects of which is the estimated correlation between two variables in the model. This model-estimated correlation can be compared to the observed correlation to assess the fit of the model to the data.

 

The tracing rule is simply that the model-implied correlation between two variables in a model is the sum of all valid paths (tracings) between the two variables. These include the total effect (which is the sum of direct and indirect effects) plus any associational effects due to correlated exogenous variables. These associational effects are calculated by multiplying the correlation between the exogenous variable under consideration with a second exogenous variable, by this second exogenous variable's total effect on the target variable under consideration. In practice, mistakes are easy and one is wise to eschew hand computation and instead rely on a model-fitting program like LISREL or AMOS to compute the model-estimated correlations and model-estimated covariances.

 

 

 

[Chapter 5, pp. 125-150, assigned for SEM I]

 

1. How does MLE relate to SEM?

 

Structural coefficients in SEM may be computed any of several ways. Ordinarily, one will get similar estimates by any of the methods.

·            MLE. Maximum likelihood estimation (MLE) is by far the most common method. Unless the researcher has good reason, this default should be taken even if other methods are offered by the modeling software. MLE makes estimates based on maximizing the probability (likelihood) that the observed covariances are drawn from a population assumed to be the same as that reflected in the coefficient estimates. Unlike OLS regression estimates, MLE does not assume uncorrelated error terms and thus may be used for non-recursive as well as recursive models.

·            Starting values. Note MLE is an iterative procedure in which either the researcher or the computer must assign initial starting values for the estimates. Poor starting values (ex., opposite in sign to the proper estimates) may cause MLE to fail to converge on a solution. Sometimes the researcher is wise to override manually computer-generated starting values.

·            MLE estimates of variances, covariances, and paths to disturbance terms. Whereas MLE differs from OLS in estimating structural (path) coefficients relating variables, it uses the same method (i.e., the observed values) as estimates for the variances and covariances of the exogenous variables. Each path from a latent endogenous variable to its disturbance term is set to 1.0, thereby allowing SEM to estimate the variance of the disturbance term.

·    OLS. Ordinary least squares (OLS). This is the common form of multiple regression, used in early, stand-alone path analysis programs. It makes estimates based on minimizing the sum of squared deviations of the linear estimates from the observed scores. However, even for path modeling of one-indicator variables, MLE is still preferred in SEM because MLE estimates are computed simultaneously for the model as a whole, whereas OLS estimates are computed separately in relation to each endogenous variable.

·    2SLS. Two-stage least squares (2SLS) is an estimation method which adapts OLS to handle correlated error and thus to handle non-recursive path models. LISREL, one of the leading SEM packages, uses 2SLS to derive the starting coefficient estimates for MLE. MLE is preferred over 2SLS for the same reasons given for OLS.

·    GLS. Generalized least squares (GLS) is an adaptation of OLS to minimize the sum of the differences between observed and predicted covariances rather than between estimates and scores. GLS and ULS (see below) require much less computation than MLE and thus were common

·    ULS. Unweighted least squares (ULS) also focuses on the difference between observed and predicted covariances, but does not adjust for differences in the metric (scale) used to measure different variables, whereas GLS is scale-invariant, and is usually preferred for this reason.

·    ADF. Asymptotically distribution-free (ADF) estimation does not assume multivariate normality (whereas MLE, GLS, and ULS do). For this reason it may be preferred where the researcher has reason to believe that MLE's multivariate normality assumption has been violated. Note ADF estimation starts with raw data, not just the correlation and covariance matrices. ADF is even more computer-intensive than MLE and is accurate only with very large samples (200-500 even for simple models, more for complex ones).

 

2. In SEM, what is chi-square?

Chi-square. This is the most common fit test, printed by all computer programs. Chi-square tests the hypothesis that the given model fits the covariance/correlation matrix as well as a just-identified (perfect-fit) version of it. The chi-square value should not be significant if there is a good model fit. LISREL refers to this simply as chi-square, but synonyms include the chi-square fit index, chi-square goodness of fit, and chi-square badness-of-fit. Chi-square approximates for large samples what in small samples is called G², the generalized likelihood ratio, which is a function of FML and sample size: chi-square = FML*(N - 1), where FML is the minimized maximum likelihood fit function and N = sample size.
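The dependence of chi-square on sample size can be sketched from the formula chi-square = FML*(N - 1). The FML value, the sample sizes, and the model's 2 degrees of freedom are all hypothetical. The 2-df case is assumed only because the chi-square upper-tail probability then has the simple closed form exp(-x/2), which avoids any statistical library.

```python
from math import exp

def model_chi_square(f_ml, n):
    """Chi-square = FML * (N - 1)."""
    return f_ml * (n - 1)

def p_value_df2(chi_square):
    """Upper-tail probability of chi-square with exactly 2 df (closed form)."""
    return exp(-chi_square / 2)

F_ML = 0.05  # hypothetical minimized fit-function value, same in both samples

chi_small = model_chi_square(F_ML, 101)
chi_large = model_chi_square(F_ML, 1001)
p_small = p_value_df2(chi_small)
p_large = p_value_df2(chi_large)

print(f"N = 101:  chi-square = {chi_small:.2f}, p = {p_small:.4f}")
print(f"N = 1001: chi-square = {chi_large:.2f}, p = {p_large:.2e}")
```

With the same degree of misfit, the smaller sample leaves the model unrejected at .05 while the larger sample rejects it.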

 

3. How can chi-square be misleading?

 

Note three ways in which the chi-square test may be misleading:

 

The more complex the model, the more likely a good fit. In a just-identified model, which has as many parameters as possible while still achieving a solution, there will be a perfect fit. Put another way, chi-square tests the difference between the researcher's model and a just-identified version of it, so the closer the researcher's model is to being just-identified, the more likely good fit will be found.

 

The larger the sample size, the more likely the rejection of the model and the more likely a Type I error (rejecting something true). In very large samples, even tiny differences between the observed model and the perfect-fit model may be found significant.

 

The chi-square fit index is also very sensitive to violations of the assumption of multivariate normality.

 

4. What is GFI?

 

GFI is one of a couple dozen goodness-of-fit measures used to assess the merits of a SEM model.

 

Goodness-of-fit index, GFI (Jöreskog-Sörbom GFI): GFI = 1 - (FML/FO), where FO is the fit function when all model parameters are zero. GFI varies from 0 to 1, but theoretically can yield meaningless negative values. A large sample size pushes GFI up. Though analogies are made to R-square, GFI cannot be interpreted as percent of error explained by the model. Rather it is the percent of observed covariances explained by the covariances implied by the model. That is, R² in multiple regression deals with error variance whereas GFI deals with error in reproducing the variance-covariance matrix. By convention, GFI should be equal to or greater than .90 to accept the model. LISREL and AMOS both compute GFI.

 

Adjusted goodness-of-fit index, AGFI. AGFI is a variant of GFI which uses mean squares instead of total sums of squares in the numerator and denominator of 1 - GFI. It, too, varies from 0 to 1, but theoretically can yield meaningless negative values. AGFI > 1.0 is associated with just-identified models and models with almost perfect fit. AGFI < 0 is associated with models with extremely poor fit, or based on small sample size. AGFI should also be at least .90. LISREL and AMOS both compute AGFI. AGFI's use has been declining.

 

5. Which goodness of fit test(s) should be used out of the many available?

 

Goodness of fit tests determine if the model being tested should be accepted or rejected. These overall fit tests do not establish that particular paths within the model are significant. If the model is accepted, the researcher will then go on to interpret the path coefficients in the model ("significant" path coefficients in poor fit models are not meaningful).

 

LISREL prints 15 and AMOS prints 25 different goodness-of-fit measures, the choice of which is a matter of dispute among methodologists. Jaccard and Wan (1996: 87) recommend use of at least three fit tests, one from each of the first three categories in StatNotes, so as to reflect diverse criteria. Kline (1998: 130) recommends at least four tests, such as chi-square; GFI, NFI, or CFI; NNFI; and SRMR.

6.  Why might one not have a good model in spite of having a good fit on a fit index?

 

There are several reasons:

 

Each index has its own problems. That is why reporting several, not one, is the standard procedure.

 

A model can fit well overall, but particular parts may be very wrong.

 

Large samples may yield significance even for very small differences.

 

Equivalent models may fit as well or better.

 

7. How is model chi-square used to modify the researcher’s model?

 

Chi-square difference statistic. This measures the significance of the difference between two SEM models of the same data, in which one model is a subset of the other. It is simply the chi-square fit statistic for one model minus the corresponding value for the second model. The degrees of freedom (df) for this difference is simply the df for the first minus the df for the second. If the chi-square difference is not significant, then the two models have comparable fit to the data.
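
As a rough illustration of the chi-square difference statistic described above, the following Python sketch computes the difference, its df, and a p-value using only the standard library. The function names and the fit values are hypothetical, made up for the example; the p-value helper uses the closed-form recurrence for the chi-square upper tail with integer df and is not taken from Kline or from any SEM package.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # exact tail for df = 1
    else:
        q, k = math.exp(-half), 2              # exact tail for df = 2
    # Each pass of the recurrence raises the df by 2.
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def chi2_difference(chi2_nested, df_nested, chi2_full, df_full):
    """Compare a nested (more constrained) model with the fuller model."""
    diff = chi2_nested - chi2_full
    df_diff = df_nested - df_full
    if df_diff < 1 or diff < 0:
        raise ValueError("the nested model should have more df and a larger chi-square")
    return diff, df_diff, chi2_sf(diff, df_diff)

# Hypothetical fit results: trimmed model (chi-square 18.9, df 7)
# versus the fuller model it is nested in (chi-square 12.4, df 5).
diff, df_diff, p = chi2_difference(18.9, 7, 12.4, 5)
print(f"difference = {diff:.2f}, df = {df_diff}, p = {p:.3f}")
```

With these made-up values the difference is significant at the .05 level, which under the logic above would mean the trimmed model fits significantly worse than the fuller one.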

 

Model trimming. Most modification is by way of model trimming, which is deleting one path at a time until a significant chi-square difference indicates trimming has gone too far. As paths are trimmed, chi-square tends to increase, indicating a worse model fit and also increasing the chi-square difference. That is, a significant chi-square difference indicates dropping a path means the fit of the simpler model is significantly worse than for the more complex model. Naturally, dropping paths should be done only if consistent with theory and face validity.

 

Model building is the opposite strategy of starting with the null model or a simple model and adding paths one at a time, retaining those which yield a significant chi-square difference. As paths are added to the model, chi-square tends to decrease, indicating a better fit and also increasing the chi-square difference. That is, a significant chi-square difference indicates adding a path means the fit of the more complex model is significantly better than for the simpler one. Adding paths should be done only if consistent with theory and face validity.

 

Non-hierarchical model comparisons. Model building and model trimming involve comparing a model which is a subset of another. Chi-square difference cannot be used directly for non-hierarchical models. This is because model fit by chi-square is partly a function of model complexity, with more complex models fitting better. For non-hierarchical model comparisons, the researcher should use a fit index which penalizes for complexity (rewards parsimony), such as AIC.

 

8. How is the modification index used to revise models?

 

Modification indexes (MI), also called the Lagrange Multiplier. The improvement in fit is measured by a reduction in chi-square, which makes the chi-square fit index less likely to be found significant (recall a finding of significance corresponds to rejecting the model as one which fits the data). For each fixed and constrained parameter (coefficient), the modification index is a measure of the predicted decrease in chi-square if a single fixed parameter is freed, or a single equality constraint is removed, and the model is reestimated.

 

In the case of modification indexes for covariances, the MI has to do with the decrease in chi-square if the two error term variables are allowed to correlate. In the case of MI for regression weights, the MI has to do with the decrease in chi-square if the path between the two variables is added to the model, so that its weight is estimated rather than fixed at 0. One arbitrary rule of thumb is to consider adding paths associated with parameters whose modification index exceeds 100. However, another common strategy is simply to free the parameter with the largest MI, then see the effect as measured by the chi-square fit index. Naturally, adding paths or allowing correlated error terms should only be done when it makes substantive as well as statistical sense to do so. LISREL and AMOS both compute modification indexes.

 

Multivariate MI, also called the multivariate Lagrange Multiplier, is a variant in EQS software output. It provides a modification index for freeing, all at once, an entire set of structure coefficients which are constrained to 0 (no direct paths) in the researcher's model.

                                               

9. How are correlation residuals used to revise models?

 

Correlation residuals are the difference between model-estimated correlations and observed correlations. The variables most likely to be in need of being respecified in the model are apt to be those with the larger correlation residuals (the usual cutoff is > .10). Having all correlation residuals < .10 is sometimes used, along with fit indexes, to define "acceptable fit" for a model.
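
A minimal sketch of screening correlation residuals against the .10 cutoff follows. The variable names and the correlations are invented for illustration, and taking the residual as observed minus model-estimated is an assumption about sign; only the absolute size matters for the cutoff.

```python
def correlation_residuals(observed, implied, names, cutoff=0.10):
    """Return (row, column, residual) for each pair whose |residual| exceeds the cutoff."""
    flagged = []
    for i in range(len(names)):
        for j in range(i):  # lower triangle only; the matrices are symmetric
            resid = observed[i][j] - implied[i][j]
            if abs(resid) > cutoff:
                flagged.append((names[i], names[j], round(resid, 3)))
    return flagged

# Hypothetical observed and model-estimated correlation matrices.
names = ["A", "B", "C"]
observed = [[1.00, 0.50, 0.40],
            [0.50, 1.00, 0.35],
            [0.40, 0.35, 1.00]]
implied = [[1.00, 0.48, 0.25],
           [0.48, 1.00, 0.33],
           [0.25, 0.33, 1.00]]

flagged = correlation_residuals(observed, implied, names)
print(flagged)
```

Here only the A-C pair exceeds the cutoff, so under the rule of thumb above that relationship would be the first candidate for respecification.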

 

10. What are “equivalent models” and what does Kline want researchers to do about them?

 

Most models have alternative specifications which would result in the same estimated correlations and covariances among the variables. The Lee-Hershberger replacement rules detail how the researcher can respecify to construct mathematically equivalent models. Kline notes that very few researchers actually bother to compare their refined model with equivalents, a practice he condemns.

 

 

[Chapter 5, pp. 150-154. assigned for path analysis]

 

1. How is the significance of a path coefficient assessed (compare Kline with StatNotes)?

Kline gives some formulas, but it is simpler: the path coefficient is the beta weight, and the beta has the same significance as that given by SPSS for the unstandardized b coefficient.

 

2. How is the significance of a multiple-leg path assessed?

Each and all of the path coefficients must be significant. For three-leg paths or simpler, Kline gives alternative formulas (p. 150).

 

3. How do you assess the significance of the total (direct and indirect) effect of exogenous variable x on endogenous variable y?

Run a regression with y as dependent and all others as independents, leaving out any variable which mediates between x and y. The significance of the b or beta for x in this equation is a test of the significance of the total effect.              

 

 

 


Chapter 6, Structural Models with Observed Variables and Path Analysis, II: Nonrecursive Models, Multiple Group Analysis

                                                                       

1. What is a disturbance term?

It is an error term for an endogenous variable. It represents the effects on that variable of all unmeasured causes.

 

2. What is recursivity?

A model is recursive if all direct effects (straight arrows in the diagram) are one-way, without feedback loops, and the disturbance terms for the endogenous variables are uncorrelated with each other (if they are correlated, feedback loops form).

 

3. Isn’t recursivity an assumption of path analysis and SEM? Explain.

Recursivity is sometimes said to be an assumption, but it is not. All recursive models are identified and thus can be solved, whereas non-recursive models may not be identified and thus may not yield a unique solution. The point of Chapter 6 is to understand which non-recursive models may be identified.

 

4. What does it mean to say a model is “identified”? Do researchers want an underidentified or overidentified model?

A model is identified if it has mathematical properties which allow a unique solution. Researchers want more knowns than unknowns, which corresponds to an overidentified model. An underidentified model is not solvable. Note that determined, overdetermined, and underdetermined are synonyms for identified, overidentified, and underidentified.

 

5. What is the easiest way to determine if a model is underidentified, and why should the researcher determine this before collecting data?

If the model is recursive, one may assume it is identified. Otherwise, the easiest way is to run SEM on pretest or fictional data prior to data collection, since this will usually reveal underidentification. One good reason to do this is that one solution to underidentification is adding more exogenous variables, which must be done prior to collecting data. If the model is underidentified, the program may issue an error message (ex., failure to converge), generate nonsensical estimates (ex., negative error variances), display very large standard errors for one or more path coefficients, yield unusually high correlation estimates (ex., over .9) among the estimated path coefficients, and/or even stall or crash. The Amos package notifies the researcher of identification problems and suggests solutions, such as adding more constraints to the model.

 

 


6.  What options does the researcher have if it is found his/her model is underidentified?

If a model is underidentified, then one must do one or more of the following (not all model fitting computer packages support all strategies, and not all are discussed in Chapter 6):

Eliminate feedback loops and reciprocal effects.

Specify at fixed levels any coefficient estimates whose magnitude is reliably known.

Simplify the model by reducing the number of arrows, which is the same as constraining a path coefficient estimate to 0.

Simplify the model by constraining a path estimate (arrow) in other ways: equality (it must be the same as another estimate), proportionality (it must be proportional to another estimate), or inequality (it must be more than or less than another estimate).

Consider simplifying the model by eliminating variables.

Add exogenous variables (which, of course, is usually possible only if this need is considered prior to gathering data).

If MLE (maximum likelihood estimation) is being used to estimate path coefficients, two other remedies may help, if the particular computer program allows these adjustments:

·            Substitute researcher "guesstimates" as starting values in place of computer-generated starting values for the estimates.

·            Increase the maximum number of iterations the computer will attempt in seeking convergence.

 

7. What is “empirical underidentification”?

A model can be theoretically identified but still not solvable due to such empirical problems as high multicollinearity in any model, or path estimates close to 0 in non-recursive models.

 

8. What is multiple group analysis and how does it work?

Multiple group analysis is a method of determining if a grouping variable affects a model. It may be implemented only if the same measurement model is applicable to both groups.

 

Multiple group analysis is implemented by running two separate two-group analyses, first with no constraints and then again with the constraint that the loadings for the indicator variables on their respective latent variables be the same for both groups, and/or that the path estimates be the same for the two groups, and/or that the error term variances in the two groups be equal. There is disagreement among methodologists on just which and how many constraints define "same measurement model." Regardless, this approach is called multiple group path analysis and can be extended to more than two groups.

If the goodness of fit is similar for both the constrained and unconstrained analyses, then the path coefficients for the model as applied to the two groups separately may be compared. If the fit of the constrained model is worse than that for the corresponding unconstrained model, then the researcher concludes that model direct effects differ by group.

 

9. What is two-stage least squares and how does it relate to non-recursivity?

Two-stage least squares regression (2SLS) has three uses. First, it is a method of extending regression to cover models which violate ordinary least squares (OLS) regression's assumption of recursivity, specifically models where the researcher must assume that the disturbance term of the dependent variable is correlated with the cause(s) of the independent variable(s). Second, 2SLS is used for the same purpose to extend path analysis, except that in path models there may be multiple endogenous variables rather than a single dependent variable. Third, 2SLS is an older, less-used alternative to maximum likelihood estimation (MLE) in estimating path parameters of non-recursive models in structural equation modeling (SEM).
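
To make the two stages concrete, here is a small sketch with one predictor and one instrument, written in plain Python. The data are made up so that the disturbance u is built into both x and y, and the idea of an instrument z (a variable related to x but not to the disturbance) is standard 2SLS terminology that the text above does not spell out; all names are hypothetical.

```python
def ols(x, y):
    """Return (intercept, slope) for a simple OLS regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def two_stage_least_squares(z, x, y):
    """Stage 1: regress x on the instrument z. Stage 2: regress y on predicted x."""
    a1, b1 = ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    return ols(x_hat, y)

# Made-up data: the true effect of x on y is 2, but the disturbance u
# enters both x and y, so x is correlated with the disturbance of y.
z = [-3, -1, 1, 3, -3, -1, 1, 3]        # instrument, uncorrelated with u
u = [1, 1, 1, 1, -1, -1, -1, -1]        # shared disturbance
x = [zi + ui for zi, ui in zip(z, u)]
y = [2 * xi + ui for xi, ui in zip(x, u)]

_, slope_ols = ols(x, y)
_, slope_2sls = two_stage_least_squares(z, x, y)
print(f"OLS slope = {slope_ols:.3f}, 2SLS slope = {slope_2sls:.3f}")
```

In this constructed example OLS overstates the effect because x shares the disturbance with y, while the second-stage estimate recovers the value of 2 that was built into the data. Real analyses would also need corrected standard errors, which this sketch does not attempt.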

 

Maximum likelihood estimation (MLE) is generally preferred over 2SLS for estimating path parameters in non-recursive models because the MLE estimates take the entire model into account, whereas 2SLS estimates are computed based on one portion of the model at a time. That is, MLE is a "full informational" whereas 2SLS is a "partial informational" technique. The bottom line is that for overidentified models, MLE estimates are generally better than 2SLS estimates.

 

It is true that path estimation in structural equation modeling (SEM) typically uses maximum likelihood estimation (MLE) for non-recursive models. However, two-stage least squares (2SLS), not being an iterative strategy like MLE, is faster computationally and requires less computer memory. It also does not require the computer or researcher to posit starting points for the estimates, mistakes in which may (rarely) lead to lack of convergence in MLE.

 

However, use of 2SLS probably indicates one is reading an older article, or an article by a researcher who has access to 2SLS software but not to SEM software.

 

10. What are the observations/parameters test, order condition test, and rank condition test used for? (It is not necessary to explain the mechanics of these tests).

 

These tests are used to determine in advance if a nonrecursive model is identified. The first two are necessary conditions, while the third is sufficient. It may be easier, however, simply to run a SEM package on pretest or fictional data.

 

What follows is the mechanics of the tests (probably skip in class):

 

Non-recursive models involving all possible correlations among the disturbance terms of the endogenous variables. The correlation of disturbance terms, of course, means the researcher is assuming that the unmeasured variables which are also determinants of the endogenous variables are all correlated among themselves. This introduces non-recursivity in the form of feedback loops. Still, such a model may be identified if it meets the rank condition test, which implies it also meets the observations/parameters test and the order condition test. These last two are necessary but not sufficient to assure identification, whereas the rank condition test is a sufficient condition. These tests are discussed below.

Non-recursive models with variables grouped in blocks. The relation of the blocks is recursive. Variables within any block may not be recursively related, but within each block the researcher assumes the existence of all possible correlations among the disturbance terms of the endogenous variables for that block. Such a model may be identified if each block passes the tests for non-recursive models involving all possible correlations among the disturbance terms of its endogenous variables, as discussed above.

Non-recursive models assuming only some disturbance terms of the endogenous variables are correlated. Such models may be identified if they pass the observations/parameters test, but even then this needs to be confirmed by running a model-fitting program on test data to see if a solution is possible.

Tests related to non-recursive models:

Observations/parameters test:

Models cannot be identified, and hence are not solvable, unless they have as many or more observations than parameters. This is an important necessary but not sufficient condition.

Observations are coefficients the researcher has for the model. The total of observations is the total number of coefficients that can be plugged into equations used to estimate the unknown parameters, such as the path coefficients. Let v be the number of observed variables in the model. Then the number of observations is the number of variances and covariances, equal to [v(v+1)/2]. For instance, 4 variables have 4 variances and 6 unique covariances = [4(4+1)/2] = 10 observations.

Parameters are what can vary in the model. What can vary are the path coefficients for any arrows, and the variances and covariances of the exogenous variables and the disturbance terms. Total parameters are equal to [p + c + e + d], where p is the number of straight arrows in the model, denoting the paths; c is the number of curved arrows in the model, denoting the correlations of exogenous variables or of disturbance terms; e is the number of exogenous variables, each with a variance; and d is the number of disturbance terms, each with a variance.

If there are 4 variables in the model, p could be 6 (arrows from A to B, C, and D; from B to C and D; and from C to D); c could be 3 if the disturbance terms for B and C, B and D, and C and D were posited to be correlated; e would be 1 (A is the 1 exogenous variable); and d could be 3 (if each endogenous variable has a disturbance term). Thus the total parameters in this model could be 6 + 3 + 1 + 3 = 13. Recall the number of observations for this example is 10. When there are more things that can vary in the model (parameters) than there are fixed facts (observations), the model is too complex to be solved. That is, in this case there are [13 - 10] = 3 more parameters than observations, hence the model is underidentified.
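
The counting in the example above can be checked with a few lines of Python. The function names are hypothetical; the formulas are the ones given in the text, [v(v+1)/2] for observations and [p + c + e + d] for parameters.

```python
def n_observations(v):
    """Number of variances and unique covariances among v observed variables."""
    return v * (v + 1) // 2

def n_parameters(p, c, e, d):
    """Straight arrows + curved arrows + exogenous variances + disturbance variances."""
    return p + c + e + d

def observations_parameters_test(v, p, c, e, d):
    """Necessary (not sufficient) condition: observations >= parameters."""
    obs, params = n_observations(v), n_parameters(p, c, e, d)
    return obs, params, obs >= params

# The 4-variable example from the text: A exogenous; B, C, D endogenous.
obs, params, passes = observations_parameters_test(v=4, p=6, c=3, e=1, d=3)
print(obs, params, passes)
```

A result of False means the model is certainly underidentified; a result of True only means the model has passed one necessary condition.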

Order condition test:

For a given endogenous variable, excluded variables are the other endogenous or exogenous variables which have no direct effect on it (have no arrow going to it). The order condition test is met if, for each endogenous variable, the number of excluded variables equals or is greater than one less than the number of endogenous variables.

Rank condition test:

Rank refers to the rank of a matrix and is best dealt with in matrix algebra. In effect, the rank condition test is met if every endogenous variable which is located in a feedback loop can be distinguished because each has a unique pattern of direct effects on endogenous variables not in the loop. To test manually without matrix algebra, first construct a system matrix, in which the column headers are all variables and the row headers are the endogenous variables. The cell entries are either 0's (indicating that the column variable is excluded, having no direct effect on the row's endogenous variable) or 1's (indicating that the column variable does have a direct effect on the row's endogenous variable; each endogenous variable also receives a 1 in its own column).

Repeat these steps for each endogenous variable, each time starting with the original system matrix:

Cross out the row for the given endogenous variable.

Cross out any column which had a 1 in the row, now crossed-out, for the given endogenous variable.

Simplify the matrix by removing the crossed-out row and columns.

Cross out any row which is all 0's in the simplified matrix. Simplify the matrix further by removing the crossed-out row.

Cross out any row which is a duplicate of another row. Simplify the matrix further by removing the crossed-out row.

Cross out any row which is the sum of two or more other rows. Simplify the matrix further by removing the crossed-out row.

Note the rank of the remaining simplified matrix. The rank is the number of remaining rows. The rank condition for the given endogenous variable is met if this rank is equal to or greater than one less than the number of endogenous variables in the model.

The rank test is met for the model if the rank condition is met for all endogenous variables.
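
The procedure above can be sketched in Python. This is only an illustration under stated assumptions: instead of crossing out rows by hand, it computes the matrix rank of the reduced matrix by exact Gaussian elimination (a swapped-in technique that counts the rows which are not zero, duplicated, or combinations of other rows), and the two small models, with variables named X1, X2, Y1, Y2, are hypothetical. Each endogenous variable is given a 1 in its own column, which is an assumed convention.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def rank_condition(system, endogenous):
    """For each endogenous variable, check rank of reduced matrix >= (number endogenous - 1)."""
    needed = len(endogenous) - 1
    results = {}
    for i, name in enumerate(endogenous):
        # Keep only columns that had a 0 in this variable's row, and drop its row.
        keep = [j for j, entry in enumerate(system[i]) if entry == 0]
        reduced = [[row[j] for j in keep] for k, row in enumerate(system) if k != i]
        results[name] = matrix_rank(reduced) >= needed
    return results

# Columns: X1, X2, Y1, Y2. Rows: Y1, Y2 (in a feedback loop with each other).
# Model 1: X1 affects only Y1 and X2 affects only Y2.
model_1 = [[1, 0, 1, 1],
           [0, 1, 1, 1]]
# Model 2: X1 and X2 both affect both Y1 and Y2.
model_2 = [[1, 1, 1, 1],
           [1, 1, 1, 1]]

result_1 = rank_condition(model_1, ["Y1", "Y2"])
result_2 = rank_condition(model_2, ["Y1", "Y2"])
print(result_1, result_2)
```

In the first model each endogenous variable has an exogenous cause that the other lacks, so the rank condition is met for both; in the second model nothing distinguishes the two equations and the condition fails.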

 

 

 


Chapter 7: Measurement Models and Confirmatory Factor Analysis

 

1. What distinguishes SEM from path analysis?

In path analysis, each latent variable (construct) is measured by a single indicator. SEM can be thought of as a combination of path analysis and factor analysis, with the latter used to create the latent variables which are the exogenous and endogenous variables in the model. The model-fitting programs used for SEM also allow the researcher to set any number of model constraints (ex., correlated disturbances; equal disturbance terms; etc.).

 

2. In section 7.2, Kline discusses various aspects of validity. Why, in relation to CFA or SEM?

All techniques, including CFA and SEM, are subject to errors of validity. Poor convergent validity among the indicators for a factor, for instance, may mean the model needs to have more factors. Take the time to explore validity.htm on the class website.

 

3. What is attenuation and how is it related to reliability? to CFA and SEM?

Attenuation is the tendency for the estimate of r (correlation) to be artificially low due to measurement error or restriction of the data range. Both lower the reliability coefficient, which may be seen as the correlation of a variable with itself. The correction for attenuation of a correlation, rxy, is a function of the reliabilities of the two variables, rxx and ryy:

corrected rxy = rxy / SQRT(rxx * ryy)

 

Both CFA and SEM analyze correlation (and covariance) matrices. If the entries are attenuated, corresponding relationships may be underestimated.
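
The correction for attenuation given above is simple to compute. In this short sketch the function name and the input values are hypothetical, chosen only to show the size of the adjustment.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Divide the observed correlation by the square root of the product of the reliabilities."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must be greater than 0 and no more than 1")
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values: observed r = .30, reliabilities of .80 and .70.
corrected = correct_for_attenuation(0.30, 0.80, 0.70)
print(round(corrected, 3))
```

With perfect reliabilities (both 1.0) the correction leaves the correlation unchanged; the lower the reliabilities, the larger the upward adjustment.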

 

4. How is CFA related to SEM?

Confirmatory factor analysis (CFA) seeks to determine if the number of factors and the loadings of measured (indicator) variables on them conform to what is expected on the basis of pre-established theory. The researcher's a priori assumption is that each factor (the number and labels of which may be specified a priori) is associated with a specified subset of indicator variables. A minimum requirement of confirmatory factor analysis is that one hypothesize beforehand the number of factors in the model, but usually also expectations about which variables will load on which factors (Kim and Mueller, 1978b: 55). The researcher seeks to determine, for instance, if measures created to represent a latent variable really belong together.

 

Confirmatory factor analysis can also mean the analysis of alternative factor models using a structural equation modeling package. While SEM is typically used to model causal relationships among latent variables, it is equally possible to use SEM to explore CFA measurement models. SEM packages allow the researcher to specify any of a wide variety of model constraints, estimate path coefficients, then assess the goodness-of-fit between estimated and observed coefficients as a gauge of the merit of alternative models. If only the paths from the latent variables (factors) to their respective indicators are examined, with relationships between latent variables either specified as 0 (orthogonal) or allowed to correlate (oblique) but not represented as one-way causal paths, then SEM is being used to evaluate CFA models.

 

Using SEM, the researcher can explore CFA models with or without the assumption of certain correlations among the error terms of the indicator variables. Such measurement error terms represent causes of variance due to unmeasured variables as well as random measurement error. Depending on theory, it may well be that the researcher should assume unmeasured causal variables will be shared by indicators or will correlate, and thus SEM testing may well be merited. That is, including correlated measurement error in the model tests the possibility that indicator variables correlate not just because of being caused by a common factor, but also due to common or correlated unmeasured variables. This possibility would be ruled out if the fit of the model specifying uncorrelated error terms was as good as the model with correlated error specified. In this way, testing of the confirmatory factor model may well be a desirable validation stage preliminary to the main use of SEM to model the causal relations among latent variables.

 

Using SEM, the redundancy test is to use chi-square difference (discussed in the section on structural equation modeling) to compare an original multifactor model with one which is constrained by forcing all correlations among the factors to be 1.0. If the constrained model is not significantly worse than the unconstrained one, the researcher concludes that a one-factor model would fit the data as well as a multi-factor one and, on the principle of parsimony, the one-factor model is to be preferred.

 

Using SEM, the measurement invariance test is to use chi-square difference to assess whether a set of indicators reflects a latent variable equally well across groups in the sample. The constrained model is one in which factor loadings are specified to be equal for each class of the grouping variable. If the constrained model is not significantly worse, then the researcher concludes the indicators are valid across groups. This procedure is also called multiple group CFA. If the model fails this test, then it is necessary to examine each indicator for group invariance, since some indicators may still be invariant. This procedure, called the partial measurement invariance test, is discussed by Kline (1998: 225 ff.).

 

Using SEM, the orthogonality test is similar to the redundancy test, but factor correlations are set to 0. If the constrained model is not significantly worse than the unconstrained one, the factors in the model can be considered orthogonal (uncorrelated, independent). This test requires at least three indicators per factor.

 

 

5. When is a confirmatory factor analysis (CFA) model identified in SEM?

CFA models in SEM have no causal paths (straight arrows in the diagram) connecting the latent variables. The latent variables may be allowed to correlate (oblique factors) or be constrained to 0 covariance (orthogonal factors). CFA analysis in SEM usually focuses on analysis of the error terms of the indicator variables (see previous question and answer). Like other models, CFA models in SEM must be identified for there to be a unique solution.

 

In a standard CFA model each indicator is specified to load only on one factor, measurement error terms are specified to be uncorrelated with each other, and all factors are allowed to correlate with each other. One-factor standard models are identified if the factor has three or more indicators. Multi-factor standard models are identified if each factor has two or more indicators.

 

Non-standard CFA models, where indicators load on multiple factors and/or measurement errors are correlated, may nonetheless be identified. It is probably easiest to test identification for such models by running SEM on pretest or fictional data for the model, since SEM programs normally generate error messages signaling any underidentification problems. Non-standard models will not be identified if there are more parameters than observations. (Observations equal v(v+1)/2, where v is the number of observed indicator variables in the model. Parameters equal the number of unconstrained arrows from the latent variables to the indicator variables [the constrained arrows are the one per latent variable constrained to 1.0, used to set the metric for that latent variable], plus the number of two-headed arrows in the model [indicating correlation of factors and/or of measurement errors], plus the number of variances [which equals the number of indicator variables plus the number of latent variables].) Note that meeting the observations >= parameters test does not guarantee identification, however.
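
The counting rule in the parenthesis above can be sketched as follows. The function and argument names are hypothetical, and the sketch assumes exactly one loading per latent variable is fixed to 1.0 to set its metric; the example is a made-up two-factor model with three indicators per factor and one factor correlation.

```python
def cfa_counts(n_indicators, n_factors, n_loadings, n_two_headed):
    """Return (observations, parameters) for a CFA model.

    n_loadings counts every arrow from a latent variable to an indicator;
    one arrow per latent variable is assumed fixed to 1.0 and is not estimated.
    """
    observations = n_indicators * (n_indicators + 1) // 2
    free_loadings = n_loadings - n_factors
    variances = n_indicators + n_factors   # error variances plus factor variances
    parameters = free_loadings + n_two_headed + variances
    return observations, parameters

# Hypothetical model: 2 correlated factors, 3 indicators each, no cross-loadings.
observations, parameters = cfa_counts(n_indicators=6, n_factors=2,
                                      n_loadings=6, n_two_headed=1)
print(observations, parameters, observations >= parameters)
```

As the text notes, having at least as many observations as parameters is only a necessary condition, so a passing count still needs to be confirmed by other means.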

 

6. Do severe departures from normal distributions matter in SEM?

Multivariate normal distribution of the indicators is assumed: that is, each indicator should be normally distributed for each value of each other indicator. Even small departures from multivariate normality can lead to large differences in the chi-square test, undermining its utility. In general, simulation studies (Kline, 1998: 209) suggest that under conditions of severe non-normality of data, SEM parameter estimates (ex., path estimates) are still fairly accurate but the corresponding significance test statistics are too high. This means that for significance tests of parameters, there is a bias toward Type I errors (considering the parameter significant when it is not). Recall that for the chi-square test of goodness of fit of the model as a whole, the chi-square value should not be significant if there is a good model fit. Lack of multivariate normality inflates the chi-square statistic such that the overall chi-square fit statistic for the model as a whole is biased toward Type I error (rejecting a model which should not be rejected).