12.2 Example Data
The following data, which are included with base R, are used to illustrate the concepts in this chapter. This chapter assumes that you know about data frames, introduced in Section 10.
Some of the data are combined into a data frame, called dat, for easier viewing.
rm(list=ls())
?state # documentation for the state data
data(state) # loads state data into the workspace
dat <- data.frame(state.abb, state.division, state.region, state.x77)
str(dat)
'data.frame': 50 obs. of 11 variables:
$ state.abb : chr "AL" "AK" "AZ" "AR" ...
$ state.division: Factor w/ 9 levels "New England",..: 4 9 8 5 9 8 1 3 3 3 ...
$ state.region : Factor w/ 4 levels "Northeast","South",..: 2 4 4 2 4 4 1 2 2 2 ...
$ Population : num 3615 365 2212 2110 21198 ...
$ Income : num 3624 6315 4530 3378 5114 ...
$ Illiteracy : num 2.1 1.5 1.8 1.9 1.1 0.7 1.1 0.9 1.3 2 ...
$ Life.Exp : num 69 69.3 70.5 70.7 71.7 ...
$ Murder : num 15.1 11.3 7.8 10.1 10.3 6.8 3.1 6.2 10.7 13.9 ...
$ HS.Grad : num 41.3 66.7 58.1 39.9 62.6 63.9 56 54.6 52.6 40.6 ...
$ Frost : num 20 152 15 65 20 166 139 103 11 60 ...
$ Area : num 50708 566432 113417 51945 156361 ...
The str() output for dat labels numerical quantities as num. The variables from Population to Area are all numerical. Population and Income, for example, take whole-number values even though they are stored as num rather than int, while Illiteracy and Life.Exp are continuous variables.
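The distinction between how a column is stored and what values it holds can be checked directly. The following is a minimal sketch, assuming dat is built as above; the helper names num_cols and whole are introduced here for illustration only.

```r
# Rebuild the data frame so that this block runs on its own
dat <- data.frame(state.abb, state.division, state.region, state.x77)

# Storage class of every column
sapply(dat, class)

# Names of the columns stored as numeric
num_cols <- names(dat)[sapply(dat, is.numeric)]
num_cols

# For each numeric column: does it contain only whole-number values?
whole <- sapply(dat[num_cols], function(x) all(x == round(x)))
whole
```

A column can be TRUE in whole and still be reported as num by str(), because R stores the values as double-precision numbers unless they are explicitly created as integers.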
The variables state.abb, state.division, and state.region in dat are categorical. In the str() output, state.division and state.region are shown as factors, while state.abb is stored as a character vector (chr).
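The categorical variables can be inspected in the same way. This is a short sketch, again assuming dat is built as above. Note that the class of state.abb depends on the R version: since R 4.0.0, data.frame() no longer converts character vectors to factors by default.

```r
# Rebuild the data frame so that this block runs on its own
dat <- data.frame(state.abb, state.division, state.region, state.x77)

# Confirm that the region variable is a factor and list its categories
is.factor(dat$state.region)
levels(dat$state.region)

# Number of categories in the division variable
nlevels(dat$state.division)

# Number of states in each region
table(dat$state.region)

# Storage class of the state abbreviations (depends on the R version)
class(dat$state.abb)
```

The level counts reported by nlevels() should agree with the "4 levels" and "9 levels" shown in the str() output.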