Structural Equation Analysis: Path Analysis

Author

StatMind

Published

January 7, 2024

1 Intro

Path models consist of independent and dependent variables, depicted graphically as rectangles.

Variables that act only as independent variables, and never as dependent variables, are called exogenous.

Graphically, exogenous variables only have arrows pointing away from them, towards (endogenous) variables. No arrows point at exogenous variables.

Endogenous variables are:

  1. Strictly dependent variables, or

  2. Both independent and dependent.

Graphically, endogenous variables have at least one arrow pointing at them.

We will use an example from the field of high-performance organizations (HPOs). De Waal has developed an HPO model in which organizational performance is indicated by five factors:

  1. Continuous Innovation (CI)

  2. Openness & Action Orientation (OAO)

  3. Management Quality (MQ)

  4. Employee Quality (QEMP)

  5. Long-Term Orientation (LTO).

For illustrating path analysis, we will use a sample data set with three of the five factors (MQ, QEMP, and OAO). For 250 organizations, we have data on the three indicators, and on two aspects of performance, namely financial performance (FPF) and non-financial performance (NFPF). All variables are measured on a five point scale.

Note

Actually, all measures - including the three HPO factors - are composite measures (rather than latent variables), computed from observed answers to a set of questions (items). In another StatSnip we will discuss the ways to include latent variables.

Our model, or path diagram, is:

With regression analysis, we can estimate models with one dependent variable. Of course, for the model above we could run two separate regression models, but that would not be the same (e.g., we wouldn’t know if and how the dependent variables are related). The challenge of using regression analysis would become even bigger if we included mediating variables (such as an effect of MQ on OAO).

In this StatSnip, we will:

  1. Show you how regression analysis is just a special case of a more general path analysis.

  2. Show you how to perform path analysis in STATA.

  3. Show you how to achieve the same results in the free R-software, using the lavaan package.

  4. Discuss model comparisons, model improvements, and the use of goodness-of-fit statistics.

2 Regression Analysis and Path Analysis

Regression analysis estimates the relationship between one dependent and one or more independent variables. Path analysis is an extension of regression models, and can include more than one endogenous variable and mediating variables that are both explained in the model (by other exogenous or endogenous variables) and explain other (endogenous) variables.

Even though a path analysis with one endogenous variable and, say, three exogenous variables is equivalent to a regression model with one dependent and three independent variables, the commands and functions use terminology and output layouts that may feel unfamiliar at first.

So, what we will do first is estimate the model below using both regression analysis and path analysis, in STATA and in R.

2.1 The Data

Let us first read the data.

Code
library(readstata13)
hpo250 <- read.dta13("hpo250.dta")
hpo250
Code
cormat <- cor(hpo250) # Correlation matrix of the 5 variables
round(cormat, 2)      # Print, in 2 decimals    
       mqr qempr oaor fpfr nfpfr
mqr   1.00  0.29 0.65 0.51  0.45
qempr 0.29  1.00 0.24 0.54  0.35
oaor  0.65  0.24 1.00 0.42  0.39
fpfr  0.51  0.54 0.42 1.00  0.35
nfpfr 0.45  0.35 0.39 0.35  1.00
Code
summary(hpo250)       # Key descriptives   
      mqr            qempr            oaor            fpfr          nfpfr      
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.00   Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.000   1st Qu.:2.000   1st Qu.:3.00   1st Qu.:3.000  
 Median :3.000   Median :3.000   Median :3.000   Median :3.00   Median :4.000  
 Mean   :3.064   Mean   :2.976   Mean   :3.132   Mean   :3.26   Mean   :3.404  
 3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.:4.00   3rd Qu.:4.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.00   Max.   :5.000  

The data set contains the responses from 250 respondents on the five variables. All variable names end with r (e.g., management quality mq becomes mqr), to indicate that the variables have been recoded into a 5-point scale (1=very poor, 5=very good) for ease of use. Above, we have added summary information on the five variables, and the correlations between them.

2.2 The Model

We want to estimate a regression model with one dependent variable (fpfr) and three independent variables (mqr, qempr, and oaor).

We use the sembuilder in STATA to draw the diagram.

The regression model is:

\[ fpfr = \beta_{0} + \beta_{1}*mqr + \beta_{2}*qempr + \beta_{3}*oaor + \epsilon \]

The nice thing about the sembuilder is that the model can be built graphically, and estimated directly. Behind the scenes, STATA builds the command which then can be adapted if needed.

Tip

It is often faster to use commands rather than the Graphical User Interface (sembuilder). With more complex models, it takes several steps to arrive at the final model, and making the changes in the diagram consumes more time. It is best to build the base model and the final model using the GUI, and everything in between with commands (preferably in a STATA DO-file).

If indeed the two approaches are identical, then (among others):

  • The paths in SEM should be the same as the regression coefficients in regression analysis

  • The explained variance (coefficient of determination \[R^{2}\]) should be the same.

Indeed, everything is the same. But since - in both STATA and R - the output from the two approaches looks quite different, it takes a bit of effort to reach that conclusion.

2.2.1 STATA

Regression in STATA produces the type of output that you may be familiar with:

The (standardized) regression coefficients are in the circle marked green. The \[R^{2}\] is in the bluish oval.

The (standardized) regression coefficients are the same as the paths in the diagram below. Again, we have emphasized them in green. The equivalent of the coefficient of determination in regression analysis is 1 minus the (unexplained) error variance in the diagram: \[1 - .57 = .43\].

The output of SEM entails more than what is depicted in the diagram. It is possible to display a lot of it in the diagram, but it easily gets crowded, and the main aim of the diagram is to focus the attention of your audience on the key aspects of the analysis.

Detailed output is shown below.

2.2.2 R (lavaan)

The advantages of STATA for SEM are (i) the excellent documentation, and (ii) the GUI (sembuilder). The disadvantage of commercial software like STATA and SPSS is that it doesn’t come cheap.

The lavaan package of R offers a great free-of-cost alternative. To the best of our knowledge there is no GUI available (yet), but this is a minor drawback. In most cases, we use sembuilder for the final model only - and R does offer additional packages to achieve more or less the same.
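If you want a quick visual check of a fitted model in R, the semPlot package can draw a path diagram directly from a fitted lavaan object. This is not part of our workflow below, so treat the snippet as an optional sketch; it assumes semPlot is installed and uses the fit2 object that we create later in this section.

Code
# Optional sketch: draw a path diagram from a fitted lavaan model.
# Assumes the semPlot package is installed and a lavaan fit object
# (e.g., fit2, created further down) is available.
# install.packages("semPlot")
library(semPlot)
semPaths(fit2,                  # fitted lavaan object
         whatLabels = "std",    # label the arrows with standardized estimates
         layout = "tree")       # simple tree layout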

Below, we have used the lavaan package, to estimate the regression model as a path analysis. A description of the package and a user-friendly tutorial with examples can be found here.

Let’s start by running a regression model, using the lm() function of R. In order to get the standardized regression coefficients, we install the lm.beta package. The function lm.beta(), applied to the model fitted with lm(), produces the standardized regression coefficients.

Code
# Regression (lm())
fitreg <- lm(fpfr ~ mqr + qempr + oaor,data=hpo250)
summary(fitreg)

Call:
lm(formula = fpfr ~ mqr + qempr + oaor, data = hpo250)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.52880 -0.53291  0.01955  0.56789  1.28019 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.50253    0.15471   9.712  < 2e-16 ***
mqr          0.18690    0.03876   4.822 2.49e-06 ***
qempr        0.28770    0.03432   8.382 4.04e-15 ***
oaor         0.10492    0.05538   1.895   0.0593 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6652 on 246 degrees of freedom
Multiple R-squared:  0.4334,    Adjusted R-squared:  0.4265 
F-statistic: 62.72 on 3 and 246 DF,  p-value: < 2.2e-16
Code
library(lm.beta) # To obtain the standardized regression coefficients
lm.beta(fitreg)

Call:
lm(formula = fpfr ~ mqr + qempr + oaor, data = hpo250)

Standardized Coefficients::
(Intercept)         mqr       qempr        oaor 
         NA   0.3083728   0.4212617   0.1193715 

Let’s now turn to lavaan. Don’t forget to install the lavaan package before using it!

Code
# Regression (lavaan)
# install.packages("lavaan", dependencies = TRUE)
library(lavaan)
This is lavaan 0.6-17
lavaan is FREE software! Please report any bugs.
Code
model_reg2 <- '# regression
              fpfr ~ mqr + qempr + oaor
             '
fit2 <- cfa(model_reg2, data=hpo250)
summary(fit2, standardized = TRUE)
lavaan 0.6.17 ended normally after 1 iteration

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         4

  Number of observations                           250

Model Test User Model:
                                                      
  Test statistic                                 0.000
  Degrees of freedom                                 0

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  fpfr ~                                                                
    mqr               0.187    0.038    4.861    0.000    0.187    0.308
    qempr             0.288    0.034    8.450    0.000    0.288    0.421
    oaor              0.105    0.055    1.910    0.056    0.105    0.119

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .fpfr              0.435    0.039   11.180    0.000    0.435    0.567

You will find that R gives the same results as STATA.

In lavaan, you work in two steps.

  1. In step 1, you formulate the complete model. For a simple regression model, we only have to specify the regression model which is identical to what you feed into the lm() function: fpfr ~ mqr + qempr + oaor.

  2. In the second step, you use lavaan’s cfa() function to estimate the model defined in step 1.

The standardized coefficients (paths) appear in the column Std.all.
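If you prefer the estimates as a data frame rather than printed summary text, lavaan offers the parameterEstimates() and standardizedSolution() functions. A minimal sketch, applied to the fit2 object from above:

Code
# All parameter estimates as a data frame; standardized = TRUE adds
# the std.all column shown in the summary output
parameterEstimates(fit2, standardized = TRUE)
# Only the standardized solution
standardizedSolution(fit2)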

2.3 A Path Model With Two Endogenous Variables

Hopefully, you now feel confident that, whether you are using STATA or R, regression analysis is just a very special case of path analysis.

Path analysis enables you to estimate models which are more complex, and also more realistic, than a regression model with a single dependent variable.

For example, let’s have a look at the model we showed at the beginning, in which we have added non-financial performance (nfpfr) as a second dependent variable. There’s nothing wrong with the term dependent variable, but for consistency, it is better to speak of endogenous variables.

Below, we show the diagram produced by STATA.

If you don’t have STATA at your disposal, then we can use lavaan to get the same results. Note that we have explicitly added covariances (“~~”) between the exogenous variables in the model. This does not affect the results, since these covariances are assumed by default; adding them to the model simply ensures that they are displayed in the output.

Code
model_reg3 <- '# regression two dependent variables
               fpfr  ~ mqr + qempr + oaor
               nfpfr ~ mqr + qempr + oaor
               # covariances
               mqr   ~~ qempr
               mqr   ~~ oaor
               qempr ~~ oaor
              '
fit3 <- cfa(model_reg3, data=hpo250)
summary(fit3, standardized = TRUE)
lavaan 0.6.17 ended normally after 28 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        15

  Number of observations                           250

Model Test User Model:
                                                      
  Test statistic                                 0.000
  Degrees of freedom                                 0

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  fpfr ~                                                                
    mqr               0.187    0.038    4.861    0.000    0.187    0.308
    qempr             0.288    0.034    8.450    0.000    0.288    0.421
    oaor              0.105    0.055    1.910    0.056    0.105    0.119
  nfpfr ~                                                               
    mqr               0.208    0.052    4.022    0.000    0.208    0.289
    qempr             0.188    0.046    4.117    0.000    0.188    0.233
    oaor              0.157    0.074    2.123    0.034    0.157    0.150

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  mqr ~~                                                                
    qempr             0.538    0.122    4.398    0.000    0.538    0.290
    oaor              0.932    0.109    8.578    0.000    0.932    0.646
  qempr ~~                                                              
    oaor              0.303    0.083    3.644    0.000    0.303    0.237
 .fpfr ~~                                                               
   .nfpfr             0.018    0.037    0.477    0.633    0.018    0.030

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .fpfr              0.435    0.039   11.180    0.000    0.435    0.567
   .nfpfr             0.787    0.070   11.180    0.000    0.787    0.728
    mqr               2.092    0.187   11.180    0.000    2.092    1.000
    qempr             1.647    0.147   11.180    0.000    1.647    1.000
    oaor              0.995    0.089   11.180    0.000    0.995    1.000

Scrolling through the output:

  1. The regressions part shows the paths (regression coefficients). The standardized coefficients are in the Std.all column. The first block contains the coefficients for the endogenous variable fpfr, and the second block those for nfpfr. They are identical (to two decimals) to the paths in the figure above.

  2. The standardized covariances (correlations, that is), between the exogenous variables mqr, qempr, and oaor are the same as the numbers at the two-headed arrows on the left of the figure.

  3. We have specified a covariance (or correlation, since we use standardized results) between the error terms of the two endogenous variables. This correlation is close to zero (.03), and not significantly different from zero. We might leave it out of the model, by constraining it to zero.

  4. The variances of interest are those of the error terms of the endogenous variables. The standardized variance of the error term of fpfr (.fpfr in the output) is .567, and hence the explained variance is 1 - .567 = .433. The variance of nfpfr explained by the model is 1 - .728 = .272. The sketch below shows how to extract these values directly.
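These explained variances do not have to be computed by hand. A short sketch, using lavaan’s lavInspect() on the fit3 object from above:

Code
# R-squared of the endogenous variables; these equal 1 minus the
# standardized error variances in the output above
lavInspect(fit3, "rsquare")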

A part of the output that we have not discussed yet is the test statistic and the degrees of freedom. Essentially, the model parameters are estimated from the matrix of variances and covariances in the data. With five observed variables, this matrix has 5*6/2 = 15 unique elements, and our model estimates exactly 15 parameters. Such a model is called saturated: it has zero degrees of freedom and fits the data perfectly.

2.4 Non-Saturated Models

We can simplify the model by leaving out some of the paths and/or covariances. Leaving out a path is equivalent to constraining it to zero. By doing so, we gain degrees of freedom in fitting the model. At the same time, our model may no longer fit the data, e.g., if a left-out path is in fact significantly different from zero.

The logic of your model should follow theory. If literature has found, or if you hypothesize, that mqr has no bearing on non-financial performance, and oaor has no effect on financial performance, and in addition financial and non-financial performance are uncorrelated, given the exogenous variables, you can estimate and test the following model:

Since we have left out two paths and constrained the covariance between the error terms of the endogenous variables to zero, we have gained three degrees of freedom (15 parameters in the saturated model versus 12 here). This allows us to test our model.

The test uses a chi-square statistic, based on the differences between the matrix of variances and covariances of the variables in the analysis, and the implied matrix based on the parameter estimates. Ideally, our model replicates the data perfectly, and the chi-square is close to zero. Here, our chi-square is significantly different from zero, and, hence, the model does not fit the data very well.

Tip

There has been a longstanding discussion on the best set of statistics for evaluating model fit. The outcome of that discussion is that it is wise to look at a combination of statistics. For an overview, see the article by Hooper et al. (2008). The bottom line is that the researcher should use and report:

  1. The chi-square statistic with its degrees of freedom, and the p-value.

  2. The Root Mean Square Error of Approximation (RMSEA) should be smaller than .07.

  3. The Standardized Root Mean Square Residual (SRMR) is ideally lower than .05, but values up to .08 are considered acceptable.

  4. The Comparative Fit Index (CFI) should be higher than .95 (although some use .90).

These measures can be obtained in lavaan.

Below, we estimate the model in R, with lavaan. Note that in the code we can simply leave out the paths that we assume to be zero. Constraining the covariance between the error terms of the endogenous variables to zero is specified by multiplying the right-hand side of the covariance (“~~”) by zero, as in fpfr ~~ 0*nfpfr.

Code
model_reg4 <- '# regression two dependent variables
               fpfr  ~ mqr + qempr
               nfpfr ~       qempr + oaor
               # variances and covariances
               mqr   ~~ qempr
               mqr   ~~ oaor
               qempr ~~ oaor
               # residual correlations
               fpfr ~~ 0*nfpfr
              '

fit4 <- cfa(model_reg4, data=hpo250)
summary(fit4, fit.measures=TRUE)
lavaan 0.6.17 ended normally after 26 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        12

  Number of observations                           250

Model Test User Model:
                                                      
  Test statistic                                19.521
  Degrees of freedom                                 3
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                               379.632
  Degrees of freedom                                10
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.955
  Tucker-Lewis Index (TLI)                       0.851

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -1724.379
  Loglikelihood unrestricted model (H1)      -1714.618
                                                      
  Akaike (AIC)                                3472.758
  Bayesian (BIC)                              3515.016
  Sample-size adjusted Bayesian (SABIC)       3476.975

Root Mean Square Error of Approximation:

  RMSEA                                          0.148
  90 Percent confidence interval - lower         0.090
  90 Percent confidence interval - upper         0.214
  P-value H_0: RMSEA <= 0.050                    0.004
  P-value H_0: RMSEA >= 0.080                    0.972

Standardized Root Mean Square Residual:

  SRMR                                           0.052

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  fpfr ~                                              
    mqr               0.232    0.030    7.657    0.000
    qempr             0.292    0.034    8.538    0.000
  nfpfr ~                                             
    qempr             0.222    0.046    4.790    0.000
    oaor              0.341    0.060    5.710    0.000

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  mqr ~~                                              
    qempr             0.538    0.122    4.398    0.000
    oaor              0.932    0.109    8.578    0.000
  qempr ~~                                            
    oaor              0.303    0.083    3.644    0.000
 .fpfr ~~                                             
   .nfpfr             0.000                           

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .fpfr              0.442    0.040   11.180    0.000
   .nfpfr             0.838    0.075   11.180    0.000
    mqr               2.092    0.187   11.180    0.000
    qempr             1.647    0.147   11.180    0.000
    oaor              0.995    0.089   11.180    0.000
Code
summary(fit4, standardized = TRUE)
lavaan 0.6.17 ended normally after 26 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        12

  Number of observations                           250

Model Test User Model:
                                                      
  Test statistic                                19.521
  Degrees of freedom                                 3
  P-value (Chi-square)                           0.000

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  fpfr ~                                                                
    mqr               0.232    0.030    7.657    0.000    0.232    0.384
    qempr             0.292    0.034    8.538    0.000    0.292    0.428
  nfpfr ~                                                               
    qempr             0.222    0.046    4.790    0.000    0.222    0.274
    oaor              0.341    0.060    5.710    0.000    0.341    0.327

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  mqr ~~                                                                
    qempr             0.538    0.122    4.398    0.000    0.538    0.290
    oaor              0.932    0.109    8.578    0.000    0.932    0.646
  qempr ~~                                                              
    oaor              0.303    0.083    3.644    0.000    0.303    0.237
 .fpfr ~~                                                               
   .nfpfr             0.000                               0.000    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .fpfr              0.442    0.040   11.180    0.000    0.442    0.575
   .nfpfr             0.838    0.075   11.180    0.000    0.838    0.775
    mqr               2.092    0.187   11.180    0.000    2.092    1.000
    qempr             1.647    0.147   11.180    0.000    1.647    1.000
    oaor              0.995    0.089   11.180    0.000    0.995    1.000

From the massive output, we conclude:

  1. The effects of the exogenous variables are significantly different from zero.
  2. All effects are moderate to strong, in the .3 to .4 order of magnitude.
  3. The model fit is poor. The chi-square statistic is significantly different from zero, and the RMSEA is .148, which is well above the upper limit of .07.
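Instead of scrolling through the full summary() output, you can request just the fit statistics mentioned in the tip above, and inspect the residual correlations to see where the misfit comes from. A sketch, using the fit4 object:

Code
# Selected goodness-of-fit statistics for the constrained model
fitMeasures(fit4, c("chisq", "df", "pvalue", "cfi", "rmsea", "srmr"))
# Residual correlations: large values point at relations between the
# observed variables that the model reproduces poorly
residuals(fit4, type = "cor")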

2.5 Model Improvement

If, like above, we have a model with a poor fit, we can try to improve it by loosening the constraints. In our model, we have constrained three parameters to zero, which explains the 3 degrees of freedom in the model.

One strategy to improve the model is to compute what would happen to our chi-square statistic if we dropped one of these constraints.

In SEM jargon, we use modification indices to determine the model improvement from freeing constrained parameters. A problem with listing modification indices is that the software doesn’t think for you. For example, linking the two endogenous variables can be done in various ways: by adding (back) the covariance that we constrained to zero, by adding a path from fpfr to nfpfr, or a path the other way round. All of these options have some impact; that is, they do improve the model fit. Whether the path or parameter makes sense is a different matter.

Below, we use the modindices() function to detect improvements that are substantial and make sense. We store the output in an object, which is a data frame (with labeled variables). In order to narrow down the output, we can apply a filter. For example, in the code we only consider improvements that have either fpfr or nfpfr on the left-hand side (variable lhs) in a regression (variable op == “~”).

Code
mi <- modindices(fit4, sort = TRUE, maximum.number = 20)
mi
Code
mi[(mi$lhs=="fpfr" | mi$lhs=="nfpfr") & mi$op=="~",]

Our selection contains 4 rows. The last two rows suggest paths from fpfr to nfpfr, or the other way round. A path from fpfr to nfpfr (we could change nfpfr ~ qempr + oaor into nfpfr ~ qempr + oaor + fpfr) would decrease the chi-square by 2.4 (out of 19.5). By doing so, we would add an indirect effect of the exogenous variables mqr and qempr, via fpfr, on nfpfr. A more substantial improvement can be obtained by adding one of the two omitted paths: the path from mqr to nfpfr. The chi-square would then go down by about 15.2, from 19.5 to roughly 4.3.

So, let’s do that!

Code
# Uncorrelated error terms
model_reg5 <- '# regression two dependent variables
               fpfr  ~ mqr + qempr
               nfpfr ~ mqr + qempr + oaor
               # variances and covariances
               mqr   ~~ qempr
               mqr   ~~ oaor
               qempr ~~ oaor
               # residual correlations
               fpfr ~~ 0*nfpfr
              '

fit5 <- cfa(model_reg5, data=hpo250)
summary(fit5, fit.measures=TRUE)
lavaan 0.6.17 ended normally after 28 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        13

  Number of observations                           250

Model Test User Model:
                                                      
  Test statistic                                 3.849
  Degrees of freedom                                 2
  P-value (Chi-square)                           0.146

Model Test Baseline Model:

  Test statistic                               379.632
  Degrees of freedom                                10
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.995
  Tucker-Lewis Index (TLI)                       0.975

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -1716.543
  Loglikelihood unrestricted model (H1)      -1714.618
                                                      
  Akaike (AIC)                                3459.086
  Bayesian (BIC)                              3504.865
  Sample-size adjusted Bayesian (SABIC)       3463.654

Root Mean Square Error of Approximation:

  RMSEA                                          0.061
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.153
  P-value H_0: RMSEA <= 0.050                    0.316
  P-value H_0: RMSEA >= 0.080                    0.457

Standardized Root Mean Square Residual:

  SRMR                                           0.019

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  fpfr ~                                              
    mqr               0.232    0.030    7.657    0.000
    qempr             0.292    0.034    8.538    0.000
  nfpfr ~                                             
    mqr               0.208    0.052    4.022    0.000
    qempr             0.188    0.046    4.117    0.000
    oaor              0.157    0.074    2.123    0.034

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  mqr ~~                                              
    qempr             0.538    0.122    4.398    0.000
    oaor              0.932    0.109    8.578    0.000
  qempr ~~                                            
    oaor              0.303    0.083    3.644    0.000
 .fpfr ~~                                             
   .nfpfr             0.000                           

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .fpfr              0.442    0.040   11.180    0.000
   .nfpfr             0.787    0.070   11.180    0.000
    mqr               2.092    0.187   11.180    0.000
    qempr             1.647    0.147   11.180    0.000
    oaor              0.995    0.089   11.180    0.000

By adding one parameter, the model’s chi-square goes down to 3.85 (not exactly 19.5 minus 15.2; the modification indices are approximations), with 2 degrees of freedom. The p-value is 0.146, which means that the chi-square is not significantly different from zero: the model-implied variance-covariance matrix matches the data quite well.
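Since fit4 (the model without the path from mqr to nfpfr) is nested in fit5, we can also formalize the comparison with a chi-square (likelihood-ratio) difference test. A brief sketch:

Code
# Chi-square difference test between the constrained model (fit4)
# and the model with the extra mqr -> nfpfr path (fit5)
lavTestLRT(fit4, fit5)
# anova(fit4, fit5) performs the same test

Based on the chi-square values reported above, the difference is roughly 19.52 - 3.85 = 15.67 on 1 degree of freedom, which is clearly significant: the added path improves the model.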

The other goodness-of-fit statistics are also good. In our report we can say that:

  • The χ2 (df=2) = 3.85 (p=.15), indicating a good fit.
  • The RMSEA = 0.06, below the upper limit of .07. The probability (Pclose) of RMSEA being smaller than 0.05 is .32.
  • The SRMR = 0.02, which suggests a good fit.
  • The CFI is very close to 1.