Data Files
All data files in the book are simulated data, except for market basket data in Chapter 12 and a public web site log in Chapter 14 (2nd edition). This is because data simulation is a valuable skill that allows exploration, and because few marketing data sets are publicly available. In particular, note that no proprietary data from Drexel, Google, Wharton, or other affiliations of the authors was used in this book.
Data for each chapter can be downloaded on demand using the commands in the book. Alternatively, to work offline you could download the files separately -- either individually as .CSV files or all at once in a single .ZIP.
Recommended Data Access Methods
- Create the data using R code as described in the text. This is slowest but you will learn the most. (For exercises, download the data as noted in each exercise.)
- Download with short URLs as noted in the book, such as read.csv("http://goo.gl/UDv12g"). See Appendix E in the book (Appendix D in the 1st edition) for complete details and cross-references of data files.
Note for 1st edition readers: Use the data sets as provided here for the 2nd edition; they are a superset of the 1st edition data. All references to data in the 1st edition are unchanged.
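For readers who want the on-demand download but also a local copy for later offline work, the two approaches can be combined. The following is a minimal sketch, not code from the book: the helper read_cached() and the local file name are hypothetical, and it relies only on the base R functions file.exists(), download.file(), and read.csv().

```r
# Hypothetical helper (not from the book): download a CSV once,
# then reuse the local copy on later calls.
read_cached <- function(url, dest, ...) {
  if (!file.exists(dest)) {
    # mode = "wb" avoids line-ending translation on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  # Extra arguments (for example row.names = 1) are passed on to read.csv()
  read.csv(dest, ...)
}

# Example call, commented out because it needs a network connection.
# "chapter-data.csv" is an arbitrary local file name chosen for illustration.
# my.data <- read_cached("http://goo.gl/UDv12g", "chapter-data.csv")
```

On the first call the file is fetched and saved; later calls read the saved copy and do not touch the network.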
Other data access methods
- Advised: Use one of the two recommended options above.
- Alternative for offline access, individual chapters: choose from the folder of all data files. Right click and save the CSV file to a desired folder; then use setwd() to set your R working directory and read.csv() to access the data.
- Alternative for offline access, all chapters: get the single ZIP file of data (warning: downloads immediately). Unzip it and follow the instructions above for single files.
- Offline access, all code and data: get the single ZIP file of code and data (warning: downloads immediately). Unzip it and follow the instructions above for single files.
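Once the ZIP file is unzipped, the CSV files can also be read in a single pass rather than one at a time. This is a rough sketch under stated assumptions: the folder below is a temporary stand-in filled with two made-up files (demoA.csv and demoB.csv), not the book's actual archive, and all variable names are illustrative.

```r
# Stand-in for the unzipped data folder; replace with your own path.
data.dir <- file.path(tempdir(), "book-data-demo")
dir.create(data.dir, showWarnings = FALSE)

# Two made-up files so that the sketch runs without any download.
write.csv(data.frame(x = 1:3), file.path(data.dir, "demoA.csv"), row.names = FALSE)
write.csv(data.frame(y = c(2.5, 4.0)), file.path(data.dir, "demoB.csv"), row.names = FALSE)

# Find every CSV in the folder and read each one into a named list.
csv.files <- list.files(data.dir, pattern = "\\.csv$", full.names = TRUE,
                        ignore.case = TRUE)
all.data <- lapply(csv.files, read.csv)
names(all.data) <- tools::file_path_sans_ext(basename(csv.files))

# Each data frame is now available by file name, e.g. all.data$demoA
names(all.data)
```

Because full paths are built with file.path(), this variant does not need setwd() and leaves the working directory unchanged.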
When reading CSV files locally, you may need to use the row.names=1 argument to avoid having a variable X that represents the row numbers.
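A likely reason for the extra X column, assuming a file was saved with the write.csv() defaults, is that row names are stored as an unnamed first column, which read.csv() then reads back as an ordinary variable named X. The illustration below uses a made-up data frame and a temporary file; neither comes from the book.

```r
# Made-up data, written the way write.csv() does by default (with row names).
demo <- data.frame(score = c(3, 5, 4), segment = c(1, 2, 2))
tmp <- tempfile(fileext = ".csv")
write.csv(demo, tmp)

# Default read: the row numbers come back as a variable named X.
with.x <- read.csv(tmp)
names(with.x)

# row.names = 1 treats the first column as row names instead.
without.x <- read.csv(tmp, row.names = 1)
names(without.x)
```

Comparing the two names() results shows whether a given file needs the argument: if X appears in the first result and carries only row numbers, read the file with row.names=1.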
Example Code Using a Downloaded CSV File
Assuming you have downloaded satData.csv to the /users/chris folder:
> setwd("/users/chris")
> sat.data <- read.csv("satData.csv", row.names=1)
> summary(sat.data)
    iProdSAT      iSalesSAT        Segment         iProdREC       iSalesREC
 Min.   :1.00   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
 1st Qu.:3.00   1st Qu.:3.000   1st Qu.:2.000   1st Qu.:3.000   1st Qu.:3.000
 Median :4.00   Median :4.000   Median :3.000   Median :4.000   Median :3.000
 Mean   :4.13   Mean   :3.802   Mean   :2.844   Mean   :4.044   Mean   :3.444
 3rd Qu.:5.00   3rd Qu.:5.000   3rd Qu.:4.000   3rd Qu.:5.000   3rd Qu.:4.000
 Max.   :7.00   Max.   :7.000   Max.   :4.000   Max.   :7.000   Max.   :7.000