The following analysis is part of Udacity’s Data Analyst Nanodegree program, which requires students to use Python’s pandas and NumPy libraries to perform basic data analysis on Kaggle’s Titanic passenger dataset.

My analysis focused first on the general statistics you would expect for a dataset like this (distribution of age, gender, class, etc.) and then compared survival rates across different groups. The project was illuminating: it gave me insight into the data analysis process, forced me to understand the limitations of my analysis, and taught me to be careful not to assume too much about the data or my conclusions.
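As a rough sketch of the kind of computation described above (not the code from the actual analysis), the snippet below summarizes age and class and then compares survival rates by group with pandas. The column names `Survived`, `Pclass`, `Sex`, and `Age` follow Kaggle's Titanic `train.csv`; the eight rows are made up for illustration and are not real passenger records, and the variable names are my own.

```python
import numpy as np
import pandas as pd

# Toy rows invented for illustration only; these are NOT real passenger records.
# Column names follow Kaggle's Titanic train.csv.
passengers = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1, 0, 0, 1],
    "Pclass":   [1, 3, 2, 3, 1, 2, 3, 3],
    "Sex":      ["female", "male", "female", "male",
                 "male", "male", "female", "female"],
    "Age":      [29.0, 35.0, np.nan, 22.0, 54.0, 40.0, 18.0, 4.0],
})

# General statistics: describe() skips missing ages, so "count" reports
# how many ages are actually known.
age_summary = passengers["Age"].describe()
class_counts = passengers["Pclass"].value_counts().sort_index()

# Survival rate per group: the mean of a 0/1 column is the fraction of 1s.
rate_by_sex = passengers.groupby("Sex")["Survived"].mean()
rate_by_class = passengers.groupby("Pclass")["Survived"].mean()

# Keep the group size next to each rate, since a rate computed from
# a handful of passengers should not be over-interpreted.
summary_by_sex = passengers.groupby("Sex")["Survived"].agg(["mean", "count"])

print(age_summary)
print(class_counts)
print(summary_by_sex)
print(rate_by_class)
```

Reporting the group size alongside each rate is one simple way to stay honest about the limitations mentioned above: differences between groups describe the sample, and on their own they do not establish why one group fared better.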

In the future I would like to apply some classification algorithms to help predict which groups were more likely to have survived based on a series of characteristics.

The full analysis can be found in the repository.

Many more ideas/projects are forthcoming. Let’s build some cool stuff together!

I’ve collaborated with some amazing teams to deliver:

High-quality mobile/web applications for a diverse range of clients

Robust firmware for telecommunications equipment

Implementation and support for ETL processes for an enterprise-level data pipeline

(Whatever the requirements call for)


Jan 12 Quick Analysis of the Titanic Passenger Dataset

