`data science portfolio`

*Most of my professional data science work is closed source. However, you can find my educational and personal projects below.*

`The shape of text`

python (nltk), nlp, javascript, d3.js, html/css

Final project for CS171 (Visualization) at Harvard. I scraped 800+ science fiction short stories from Strange Horizons, conducted some basic natural language processing, and visualized the results.

`How do we price risk?`

python (pandas, sklearn), supervised machine learning (random forests)

Final project for CS109A (Introduction to Data Science) at Harvard. Collaborated with my friend, Quinn. We took peer-to-peer lending data and tried to predict the interest rates on individual loans.

`Predicting movie genres`

image processing, deep learning (convolutional neural nets)

Final project for CS109B (Advanced Topics in Data Science) at Harvard. Collaborated with Quinn (again), Pranav, and Johanna. We scraped movie posters from The Movie Database's API, processed those images with `numpy`, and predicted their likely categories (as a multi-label problem) using a convolutional neural net (in `keras`).

`blog/tag/data-science`

statistical theory, bayesian stats, linear algebra doodles, and so much more!

Since November 2017, I've been blogging about what inspires and amuses me in data science, statistics, math, art, and programming. This is also where I'll post smaller personal projects.