Errata for 2nd edition

General note. R packages often change their computation algorithms, function names, and requirements. If your code doesn't work or the output doesn't match what is shown in the book, see Section 1.6.4 in the book for notes and ideas.

For 1st edition errata, see Errata for 1st edition only.

Specific errata and package update notes

Page Change
(throughout, including 36, 49, 77, 80, 112, 133, 158, 188, 194, 220, 224, 230, 241, 248, 262, 263, 271, 282, 295, 296, 302, 338, 356, 361)

R 4.0 changed its handling of nominal/factor variables when reading data frames. In versions of R prior to 4.0 (released in April 2020), read.csv() converted text strings to categorical factor variables by default. Starting with R 4.0, strings are read as raw text and are not converted to factors.

To obtain results as shown in the book, add stringsAsFactors=TRUE to each call of read.csv() unless the argument is already specified (such as being set to FALSE).

For example, on page 49, you may obtain the results shown in the book by using this command:

store.df <- read.csv("http://goo.gl/QPDdMl", stringsAsFactors=TRUE)  # added stringsAsFactors=TRUE
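
The effect of the argument can be checked without downloading anything. The following is a minimal sketch using a small inline CSV string; the data and the names csv.text, as.text.df, and as.factor.df are made up for illustration and do not come from the book:

```r
# Hypothetical inline data standing in for a CSV file
csv.text <- "store,country\n101,US\n102,DE\n103,US"

# Strings kept as text (the default behavior in R 4.0 and later)
as.text.df <- read.csv(text=csv.text, stringsAsFactors=FALSE)
# Strings converted to factors (the default behavior before R 4.0)
as.factor.df <- read.csv(text=csv.text, stringsAsFactors=TRUE)

class(as.text.df$country)      # "character"
class(as.factor.df$country)    # "factor"
levels(as.factor.df$country)   # "DE" "US"
```

Functions that expect factors, such as summary() counts by level or by() groupings, are where the difference tends to show up.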

371, 379

Errors with the mlogit.data function. Version 1.1-0 of the mlogit package changed its data structure, so that code that calls mlogit.data() gives an error about the dfidx package (a new data indexing package used by mlogit). This is fixed in the latest version of our code files.

More specifically, the following line from the book produces an error:

cbc.mlogit <- mlogit.data(data=cbc.df, choice="choice", shape="long",
                          varying=3:6, alt.levels=paste("pos", 1:3),
                          id.var="resp.id")

The error it produces is this:

Error in guess(varying) : 
  failed to guess time-varying variables from their names

The error arises because the package now uses a new indexing function (from the dfidx package), and because the specification of varying columns is no longer needed for long data (it is now needed only for wide data). The solution is to remove the varying column specification and to add choice set indices that are unique across all respondents.

For example, instead of having choice sets numbered 1:12 within each respondent, they should be numbered 1:12 for respondent 1, then 13:24 for respondent 2, and so forth. In short, use these lines of code instead of the one above from the book:

library(dfidx)      # install if needed
# add a column with unique question numbers, as needed in mlogit 1.1+
cbc.df$chid <- rep(1:(nrow(cbc.df)/3), each=3)
# shape the data for mlogit
cbc.mlogit <- dfidx(cbc.df, choice="choice",
                    idx=list(c("chid", "resp.id"), "alt"))
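
To see what the chid line does, here is a small sketch on made-up data. The data frame toy.df is hypothetical; the sketch assumes, as the code above does, that every choice set has exactly 3 alternatives and that rows are sorted by respondent and then by question:

```r
# Hypothetical long-format data: 2 respondents x 2 questions x 3 alternatives
toy.df <- data.frame(resp.id = rep(1:2, each=6),
                     ques    = rep(rep(1:2, each=3), times=2),
                     alt     = rep(1:3, times=4))

# Number the choice sets uniquely across all respondents
toy.df$chid <- rep(1:(nrow(toy.df)/3), each=3)

# Questions restart at 1 for each respondent, but chid keeps counting
unique(toy.df[ , c("resp.id", "ques", "chid")])
```

If your data has a different number of alternatives per choice set, replace both occurrences of 3 accordingly.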


Alternatively, you may use any of the following approaches for CBC data:

  • Simply download and use the new code files as noted above.
  • Install an older version of mlogit (such as mlogit 1.0-1). Installing an older package can be complex and may require developer tools (such as a gcc compiler). We are not able to provide assistance with that process, but see here for more details. Assuming you have the required tooling, the following will install mlogit 1.0-1, which works with the code in our book:

    install.packages(c("devtools", "lmtest", "statmod"))
    library(devtools)
    install_version("mlogit", version="1.0-1", repos="http://cran.us.r-project.org")
                                    
  • Or, skip the sections about mlogit estimation (e.g., Sections 13.3.2, 13.4.1), and use hierarchical Bayes estimation instead (Section 13.5).
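
Which of these approaches you need depends on the version of mlogit that is installed. The following is a minimal sketch for checking; the variable names mlogit.ver and mlogit.note are ours for illustration, and the cutoff assumes that the data structure change arrived in mlogit 1.1-0:

```r
# Check whether the installed mlogit uses the newer dfidx-based data structure
if (requireNamespace("mlogit", quietly=TRUE)) {
  mlogit.ver <- packageVersion("mlogit")
  if (mlogit.ver >= "1.1-0") {
    # assumed cutoff: versions from 1.1-0 onward need the dfidx() approach
    mlogit.note <- "newer mlogit: shape the data with dfidx()"
  } else {
    mlogit.note <- "older mlogit: the book's mlogit.data() code should work"
  }
} else {
  mlogit.note <- "mlogit is not installed"
}
print(mlogit.note)
```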

(throughout)

Random data generation. R version 3.6.0, released just after publication of the book in April 2019, changed the way random numbers are generated. In many chapters we simulate data and use other functions that rely on random numbers, so those results will differ slightly from what the book shows. Options include:

  • To match the book, when using downloaded data: no action is needed. You will notice few differences, except for minor details such as results from the some() function, and slightly different results in some Bayesian statistics (which use randomization).

  • To match the book exactly (especially when simulating the data as we recommend): give the following command after starting R and before running code from the book:

    RNGversion("3.5.0")
                                    
  • To see how things change: just go ahead and use R's new default random number generator. Compare results to the book. They will be slightly different in the exact data points, yet quite similar for the overall statistical results.

  • If you would like to read more about the reason for the change, see "Bias in R's random integers?"
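
To see the difference for yourself, the following sketch draws the same sample under the older setting and under the defaults of your running version of R. The seed value and the names old.draw and new.draw are arbitrary choices for illustration; in R 3.6.0 and later, RNGversion("3.5.0") may issue a warning about the older sampler, which is suppressed here:

```r
# Draw under the older sampling behavior (may warn in R 3.6.0 and later)
suppressWarnings(RNGversion("3.5.0"))
set.seed(1234)
old.draw <- sample(1:100, 5)

# Return to the defaults of the running R version and draw again
RNGversion(as.character(getRversion()))
set.seed(1234)
new.draw <- sample(1:100, 5)

# Compare the two sets of draws
old.draw
new.draw
identical(old.draw, new.draw)
```

Note that the last RNGversion() call restores the current defaults, so rerun RNGversion("3.5.0") afterward if you want to continue matching the book.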

109

Question 11 in the Exercises will not compute in the latest version of psych::polychoric(). Answer the following question instead: what does the error message tell you?

Bug reports

What How

Report a suspected bug
Include the chapter and page, with a reproducible example. Email cnchapman+rbug@gmail.com or, better, report to the bugs mailing list.

Join the bugs mailing list
Sign up here

Check mailing list archives
Bug archives