
Doing Data Science: Straight Talk from the Frontline (2013) - ⭐⭐⭐

·345 words·2 mins
Books Hard-Sciences Dead-Tree-Book Tech

Metadata

  • Author(s): Cathy O’Neil and Rachel Schutt
  • Number of pages: 405
  • Year published: 2013
  • Year read: 2017

Review

Thus endeth my lunchtime reading book. I read this intermittently over the course of many months, usually over a sandwich at lunch. For this style of reading, it holds up well: the chapters are discrete packets of data science chat. That said, I agree with other critiques of this book: if you’re an aspiring data scientist, this book is NOT sufficient to get you off the ground. It’s not a good beginner’s book. It’s maybe a good “pop data science” book, a pre-beginner’s book. It’s very light on the technical stuff, and, if anything, it’s more like an anthropological survey of the state of the field.

Each chapter covers a technique or common challenge or strategy, describes the general gist of what’s going on, and then points you in the direction of papers, other books, or tutorials online. Early chapters have some “exercises”, though they’re more like general pointers of “oh, you could try this, I guess?” Later chapters don’t even bother.

Since this is an O’Reilly book, I was disappointed that the GitHub repo didn’t have, for example, the code examples mentioned in the book, or the exercises and toy datasets. (What? Are we supposed to manually copy down several pages of R code?!) Or even just a README.md with a bibliography (given how many shortened Google links are used as citations)? This makes it a starkly UNFRIENDLY book, which is weird since O’Reilly books (well, the good ones) can be very, very rich resources. This, instead, felt thin - and the repo is basically pointless.

I will say that I enjoyed the banter-y tone of the book, and some of the discussions of techniques (e.g. there was a great, intuitive explanation of Principal Component Analysis) and “real world” issues (e.g. how Kaggle competitions are basically data science in a vacuum; what it’s like to be a lady data scientist) were quite good. But, overall, yeah, this isn’t really a “good enough” data science book.

📚 Goodreads
