Data Files

All data files in the book are simulated, except for the market basket data in Chapter 12 and a public website log in Chapter 14 (2nd edition). The book relies on simulated data because data simulation is a valuable skill that allows exploration, and because few marketing data sets are publicly available. In particular, note that no proprietary data from Drexel, Google, Wharton, or other affiliations of the authors was used in this book.

Data for each chapter can be downloaded on demand using the commands in the book. Alternatively, to work offline, you can download the files separately, either individually as .CSV files or all at once in a single .ZIP.

Recommended Data Access Methods

  • Create the data using R code as described in the text. This is the slowest option, but you will learn the most. (For exercises, download the data as noted in each exercise.)
  • Download with short URLs as noted in the book, such as read.csv("http://goo.gl/UDv12g"). See Appendix E in the book (Appendix D in the 1st edition) for complete details and cross-references of data files.

Note for 1st edition readers: Use the data sets as provided here for the 2nd edition; they are a superset of the 1st edition data. All references to data in the 1st edition are unchanged.
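As a minimal sketch of combining the short-URL download with an offline copy, the helper below tries the URL first and falls back to a local file if the download fails. The function name read_book_data and the file name myData.csv are hypothetical and do not come from the book; only read.csv() and the example short URL appear in the text.

```r
# Hypothetical helper (not from the book): read a CSV from a URL and
# fall back to a previously downloaded local copy if the URL fails.
read_book_data <- function(url, local_file = NULL) {
  tryCatch(
    read.csv(url),
    error = function(e) {
      if (!is.null(local_file) && file.exists(local_file)) {
        message("Download failed; reading local copy: ", local_file)
        read.csv(local_file)
      } else {
        stop("Could not read ", url, ": ", conditionMessage(e))
      }
    }
  )
}

# Example call (commented out because it needs network access);
# "myData.csv" is an assumed name for a local copy of the same file:
# my.data <- read_book_data("http://goo.gl/UDv12g", "myData.csv")
```

If the local copy shows an extra variable X, the row.names=1 note further down this page applies to it as well.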

Other Data Access Methods

  • Advised: Use one of the two recommended options above.
  • Alternative for offline access, individual chapters: choose from the folder of all data files. Right-click the CSV file you want and save it to a folder of your choice; then use setwd() to set your R working directory to that folder and read.csv() to access the data.
  • Alternative for offline access, all chapters: get the single ZIP file of data (warning: downloads immediately). Unzip it and follow the instructions above for single files.
  • Offline access, all code and data: get the single ZIP file of code and data (warning: downloads immediately). Unzip it and follow the instructions above for single files.
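For the offline options above, once the CSV files are in one folder they could be read in a single pass rather than one at a time. The following is a sketch only: the helper read_all_csv and the demo file names are hypothetical, and a temporary folder stands in for wherever you unzipped the data.

```r
# Hypothetical helper (not from the book): read every CSV file in a
# folder into a named list of data frames.
read_all_csv <- function(data_dir) {
  files <- list.files(data_dir, pattern = "\\.csv$",
                      full.names = TRUE, ignore.case = TRUE)
  data_list <- lapply(files, read.csv)
  # Name each element after its file, without the .csv extension
  names(data_list) <- tools::file_path_sans_ext(basename(files))
  data_list
}

# Demonstration: a temporary folder with two small made-up files stands
# in for the unzipped data folder. With the real ZIP you would first run
# unzip() on it with exdir set to your chosen folder.
demo_dir <- file.path(tempdir(), "book_data_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(demo_dir, "demoA.csv"),
          row.names = FALSE)
write.csv(data.frame(y = c("a", "b")), file.path(demo_dir, "demoB.csv"),
          row.names = FALSE)

all_data <- read_all_csv(demo_dir)
names(all_data)   # "demoA" "demoB"
```

Individual data sets are then available by name, for example all_data$demoA.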

When reading CSV files locally, you may need to pass the row.names=1 argument to read.csv(). Otherwise, the row numbers stored in the first column of the file are read in as an extra variable named X.
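To see where the X variable comes from, here is a small self-contained illustration with made-up data (the data frame and its column names are invented for this example). write.csv() saves row names by default as an unnamed first column, and read.csv() names that column X unless told to treat it as row names.

```r
# Made-up data for illustration only
demo <- data.frame(score = c(3, 5, 4), segment = c(1, 2, 2))

csv_file <- tempfile(fileext = ".csv")
write.csv(demo, csv_file)   # row names are written as an unnamed first column

with_x    <- read.csv(csv_file)                 # first column becomes variable X
without_x <- read.csv(csv_file, row.names = 1)  # first column becomes row names

names(with_x)      # "X" "score" "segment"
names(without_x)   # "score" "segment"
```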

Example Code Using a Downloaded CSV File

Assuming you have downloaded satData.csv to the /users/chris folder:

> setwd("/users/chris")
> sat.data <- read.csv("satData.csv", row.names=1)
> summary(sat.data)
    iProdSAT      iSalesSAT        Segment         iProdREC       iSalesREC    
 Min.   :1.00   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:3.00   1st Qu.:3.000   1st Qu.:2.000   1st Qu.:3.000   1st Qu.:3.000  
 Median :4.00   Median :4.000   Median :3.000   Median :4.000   Median :3.000  
 Mean   :4.13   Mean   :3.802   Mean   :2.844   Mean   :4.044   Mean   :3.444  
 3rd Qu.:5.00   3rd Qu.:5.000   3rd Qu.:4.000   3rd Qu.:5.000   3rd Qu.:4.000  
 Max.   :7.00   Max.   :7.000   Max.   :4.000   Max.   :7.000   Max.   :7.000