Book Review – Machine Learning for Hackers, Conway and White, O’Reilly

In Machine Learning for Hackers by Drew Conway and John Myles White, the reader is introduced to a number of techniques useful for creating systems that can understand and make use of data. While the book has solid topical material and is written in a fluid, easy-to-read manner, I don't feel that it is really for hackers, unless the definition of "hacker" is vastly different from "programmer".

Much of the text is taken up explaining how to parse strings, change dates, and otherwise munge data into shape to be operated on by the statistical functions provided by R. In fact, so much of the book is written in that fashion that I ended up skipping through large portions to get back to something worth spending time reading about. I can't understand why a programmer would need significant education in string parsing. I was also put off by the vast amount of text explaining basic statistics. Maybe a recent computer science graduate is simply the wrong reader for this book?

I think it is certainly possible to learn the basic principles of machine learning from this book, and even to put them to good use with R in the same manner displayed in the examples. Indeed, the code and data available for this book would be very useful as prep for an introductory course at an academic institution. To make the best use of the text, you really should be sitting at your computer, reading the text side by side with the code, and operating on the data with R as instructed.

Personally, I found that wading through this text wasn't enjoyable, due to the lack of density of material at the depth I was looking for. Other readers may find it is just right for them, but I suspect those readers would not be hackers, contrary to the implication of the title. As best I can figure, this book would best serve a student scientific researcher who wants to understand what machine learning is about and does not have significant prior experience in programming or statistics. Alternatively, if you are significantly distant in years from your time in statistics, or consider learning R one of your goals, this book could work well for you.

If this sounds like you, you can get it from O'Reilly. I wrote this post as part of their O'Reilly Blogger Review program, which is neat.

I should note that I read this book on an iPhone as an ePub. Some of the tables had distracting formatting problems, but otherwise it was readable.