Summary: Logistic regression produces coefficients that are the log odds. Take e raised to the log odds to get the coefficients in odds. Odds have an exponential growth rather than a linear growth for every one unit increase. A two unit increase in x results in a squared increase from the odds coefficient. To get […]
Summary: The simplest way of of getting a data.frame to a transaction is by reading it from a csv into R. An alternative is to convert it to a logical matrix and coerce it into a transaction object.
Summary: The foreach package provides parallel operations for many packages (including randomForest). Packages like gbm and caret have parallelization built into their functions. Other tools like bigmemory and ff solve handling large datasets with memory management.
Summary: The US Census provides an API that lets you query any of their datasets. Includes population by race, gender, age, and more by zip code, state, congressional district, and a few other geographies.
Summary: The caret package was developed by Max Kuhn and contains a handful of great functions that help with parameter tuning. Purpose of the caret Package The caret package lets you quickly automate model tuning. Using a training and holdout sample, the caret package trains a model you provide and returns the optimal model based […]