Monthly Archives: March 2016


Build a Reporting Swipe File

Summary: Building a repository of good report components helps you quickly assemble reports that work. Typical things to watch for are opening statements, summary sections, key takeaways, useful dimensions and metrics, and recommendations.

Core Elements of Reports

Wordcloud generated in R for Brothers Grimm Stories

Text Mining Packages and Options in R

Summary: The tm and lsa packages give you a way to turn your text data into a term-document matrix and create new, numeric features. The ngram package lets you find frequent word patterns (e.g. “The cow” is a bi-gram or 2-gram; “The cow said” is a tri-gram or 3-gram). Lastly, for a quick visualization (though […]
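To make the bi-gram / tri-gram idea concrete, here is a minimal sketch in plain Python. It is not the R ngram package the post covers. The `ngrams` helper and the sample sentence are made up for illustration, and splitting on whitespace is a simplifying assumption (a real tokenizer would also handle punctuation).

```python
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams of text, in order of appearance.

    Hypothetical helper: lowercases the text and splits on whitespace only.
    """
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Illustrative sample sentence (not taken from the original post).
sample = "The cow said moo and the cow said hello"

# Count how often each 2-gram and 3-gram occurs.
bigram_counts = Counter(ngrams(sample, 2))
trigram_counts = Counter(ngrams(sample, 3))

print(bigram_counts.most_common(2))
print(trigram_counts.most_common(1))
```

Counting the n-grams with `Counter` is what surfaces the "frequent word patterns": any phrase that repeats in the text rises to the top of `most_common`.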


Free Data Mining and Data Science Books

I’m on a bit of a reading kick as of late, so I wanted to compile a short list of some useful and free data mining / data science books. Most are of a technical nature and come from academia. Free Academic Texts on Data Mining: An Introduction to Statistical Learning with Applications in R: Covers […]


Density and CDF Plots from Iris Data Set

Book Review: Data Analysis with Open Source Tools

I’ve had this book on my (digital) shelf for a long time.  It’s an intimidating tome.  It’s big and broad.  However, it’s not exactly what I was expecting.  As the book title explains, the main focus is data analysis.  Not necessarily statistical or data mining analysis.  Instead, chapters one through eleven are focused on plotting, mathematical […]