Category Archives: Programming


Overview of Parallel Processing in R

Summary: The foreach package lets you run loops in parallel and works with many modeling packages (including randomForest). Packages like gbm and caret have parallelization built into their functions. Other tools like bigmemory and ff handle large datasets through memory management.
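As a minimal sketch of what a parallel foreach loop can look like, the snippet below squares a few numbers on a small local cluster. It assumes the foreach and doParallel packages are installed; the choice of doParallel as the backend and of two workers is an assumption for illustration, not something taken from the post.

```r
library(foreach)
library(doParallel)  # assumed backend; also attaches the parallel package

# Start a small local cluster. Two workers is an arbitrary choice.
cl <- makeCluster(2)
registerDoParallel(cl)

# Each iteration is sent to a worker; .combine = c gathers the
# per-iteration results into a single vector, in iteration order.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

# Always release the workers when finished.
stopCluster(cl)

print(squares)
```

Swapping `%dopar%` for `%do%` runs the same loop sequentially, which is a handy way to debug the loop body before parallelizing it.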

Test accuracy from using rpart in parallel foreach

Select Census Geographies include State, Zip Code, MSPA, Congressional District and More.

Get US Census Data with R

Summary: The US Census provides an API that lets you query any of its datasets. The data includes population by race, gender, age, and more, broken down by zip code, state, congressional district, and a few other geographies.


Using the caret package in R

Summary: The caret package was developed by Max Kuhn and contains a handful of great functions that help with parameter tuning. Purpose of the caret Package: The caret package lets you quickly automate model tuning. Using a training and holdout sample, the caret package trains a model you provide and returns the optimal model based […]
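To give a rough idea of the tuning workflow described in this summary, here is a small sketch using caret's `train()` on the built-in iris data. The dataset, the 80/20 split, the rpart model, and 5-fold cross-validation are all assumptions chosen for illustration; they are not taken from the post. It assumes the caret and rpart packages are installed.

```r
library(caret)

set.seed(42)

# Split iris into a training sample and a holdout sample (80/20 is an assumption).
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
training <- iris[idx, ]
holdout  <- iris[-idx, ]

# Resample with 5-fold cross-validation while tuning.
ctrl <- trainControl(method = "cv", number = 5)

# Fit a decision tree; tuneLength = 5 asks caret to try up to 5 candidate
# values of rpart's complexity parameter and keep the best one.
fit <- train(Species ~ ., data = training, method = "rpart",
             trControl = ctrl, tuneLength = 5)

# Tuning value caret selected, then accuracy on the untouched holdout sample.
print(fit$bestTune)
preds <- predict(fit, newdata = holdout)
print(mean(preds == holdout$Species))
```

Changing `method` to another model name supported by caret keeps the rest of the code the same, which is the main convenience of the package.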


The Passcode Riddle: A Parallel Example in R

Summary: The passcode riddle asks for three whole positive numbers with each one being equal to or larger than the next. Turns out there are only a handful of numbers this could possibly work for. Browsing YouTube one morning, I came across the video from TED-Ed and I was intrigued! I’ll be honest, I […]

Passcodes 1 thru 1,000 and Factors