Book Review: Machine Learning with R


Summary: Mixes code and concepts very well. The format is consistent across the topic chapters. Great for R newbies and for R pros looking for insight into a specific model or package.

Get this book if… you want an intro to R and to data mining algorithms, or you need a simple reference for using packages like e1071 and kernlab.

Don’t get this book if… you’re expecting a deep technical dive into machine learning or into implementing the algorithms yourself.


This is probably my all-time favorite R book. I like it for quite a few reasons:

  • Very clear and thorough code samples.
  • Concepts are explained well, implemented, and then immediately examined and improved.
  • References on where to learn more.
  • Even after using R for years, I took away a few new tricks I had never seen.

This book is at the top of my Data Science Reading list for a reason! As with all books, there are some cons, but they depend on the audience.

This is not an academic text. Concepts are explained well but much of the math is skipped.

My Takeaways

You don’t have to work at a startup to be able to write a great book on data mining.  At the time of writing (3/24/2016), Brett Lantz works at the University of Michigan.  Despite not being in Silicon Valley, he has still written an amazing book.

The one caveat to this is that if he worked at a startup, the book probably would have gotten more attention.  Not that it’s right!  But it probably would have.

Learning data mining is more than code: Each chapter provides a good, but basic, introduction to the algorithmic topic.  Then it launches into an example problem, and it always follows the same pattern of exploring the data first, which is great!  It’s easy to forget that your job as an analyst is not to churn out models.  It’s to think up solutions to problems.

Parallel processing in R is hard or at least harder than I thought it would be.  In the very last chapter, there is a section on distributed computing.  It was my first exposure to how multiple processors could work in R.  It’s not as easy as I had hoped it would be.
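To give a feel for what "multiple processors in R" looks like in practice, here is a minimal sketch of my own. It is not taken from the book. It assumes the `parallel` package that ships with R, and `slow_square` is a hypothetical stand-in for any slow, independent task, such as fitting one model per parameter value.

```r
library(parallel)  # part of the standard R distribution

# Hypothetical slow task, used only for illustration.
slow_square <- function(x) {
  Sys.sleep(0.1)  # pretend this is expensive work
  x^2
}

inputs <- 1:8

# Serial baseline: one input at a time on a single core.
serial_result <- lapply(inputs, slow_square)

# Parallel version: start worker processes, split the inputs
# across them, and always shut the workers down afterwards.
n_workers <- 2  # assumption: at least two cores are available
cl <- makeCluster(n_workers)
parallel_result <- tryCatch(
  parLapply(cl, inputs, slow_square),
  finally = stopCluster(cl)
)

# Both approaches should produce the same values.
print(identical(serial_result, parallel_result))
```

Even in this toy case you have to create the cluster, hand it the work, and remember to stop it. Real tasks also need their data and packages to be made available on each worker, and that is where much of the extra effort goes.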

Bottom Line: Get this book if you want to learn more, you feel rusty, or you see an algorithm you’ve been struggling with.  I use it as a reference.  Even as the book ages, you are still pointed to the right package each time.