Capstone project

I just finished my fourth week at Zipfian Academy (the Galvanize data science immersive), and I’ve been learning a lot about machine learning algorithms used for classification, such as random forests, boosting, and support vector machines. I recently revisited some Python machine learning code that I had been trying to figure out a few months ago, and this time I understood it better. Practice makes perfect, they say. And there’s no shortage of practice at Zipfian!

While the exercises can be instructive, I learn the most when working on projects, and the biggest project at Zipfian is our capstone project. We have a month to work on a dataset (or datasets) on a specific subject. Currently, I have several ideas for my project. I had been working on EEG data before Zipfian; that analysis uses machine learning, but it is not really in the spirit of a “web scraping social data to find interesting insights” demonstration of data science flexing its muscles.

I come from a solar energy background, so it would be interesting to do a project looking at solar panel failure rates and finding the contributing factors. Given that food shortage is a current and future worldwide issue, I would also like to improve the efficiency of a vertical farm (an indoor, completely controlled environment for farming) by analyzing sensor data such as light level, water level, pH, electricity usage, and air flow. This, I believe, would be very interesting and relevant as California continues to experience water shortages. However, there may not be enough data on vertical farms yet, so I may have to use data from regular old outdoor farms. I’ll keep you posted as I home in on a project idea. Leave a comment if you have an idea or a lead on an interesting dataset!