Capstone project update

Getting Satellite Data

In short, getting satellite data from the government is a big headache. The trouble includes links that loop you back to the page you started on, multiple sites for a single satellite or project, awkward registration steps for data access, and variable names that are both verbose and obscure. I guess this is part of being a data scientist.

Research from NREL on the topic

These slides include a list of satellite data sources that carry cloud information. When I’m done downloading the data, I will do cloud computing in the clouds on the clouds!

(Big) Data Science Happenings

Big Data! Cloud Computing!

So, I’ve learned quite a bit at Galvanize this past week about Spark and AWS. Today that work culminated in deploying Spark on AWS clusters to process large files. Spark ships with a growing number of machine learning models in its MLlib library, so you can do machine learning in the cloud!

Earlier this week I deployed a small AWS instance and installed Anaconda on it. When running IPython Notebook from AWS, I protected it with a password. It’s really freaking cool that you can remotely access IPython Notebook! The only problem I had was that matplotlib didn’t display plots. I solved this by installing the ubuntu-desktop package, which pulled in the Qt backend that matplotlib needed to make plots.
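For anyone hitting the same matplotlib problem on a server without a display, a lighter alternative to a full desktop install may be to switch to the non-interactive Agg backend and write plots to files. This is only a minimal sketch, not what I actually ran: the function name save_demo_plot, the sample data, and the output file name are made up for illustration.

```python
import os
import tempfile

import matplotlib

# Select the non-interactive Agg backend before importing pyplot,
# so that no display server or GUI toolkit is required.
matplotlib.use("Agg")
import matplotlib.pyplot as plt


def save_demo_plot(path):
    """Draw a simple line plot and write it to `path` as a PNG (illustrative helper)."""
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # made-up sample data
    ax.set_xlabel("x")
    ax.set_ylabel("x squared")
    fig.savefig(path)
    plt.close(fig)  # release the figure's memory on a small instance
    return path


# Hypothetical output location; any writable path works.
out_path = save_demo_plot(os.path.join(tempfile.gettempdir(), "demo_plot.png"))
print(out_path)
```

Inside the notebook itself, running the %matplotlib inline magic in a cell should likewise render plots in the browser without a GUI backend on the server.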

Capstone Project!

I’ve really got to start buckling down on this capstone project. Thanks to Galvanize instructors Isaac and Clayton for bouncing ideas around with me today!