Jenny Ho






Personal projects


Predicting song popularity

Data analysis and visualization

This exploratory data analysis of Spotify data was my capstone project for a data science course I took through BrainStation in 2020. Working from a sample of 160,000 tracks, I applied techniques from the course to predict a track’s popularity.

My project is on Kaggle, where you can see my full process, code, visualizations, and commentary. Samples of my original Jupyter notebook are below.

Quick rundown of what I did:
  1. Import a few Python libraries made for data analysis, visualization, and modeling.
  2. Clean the dataset to make sure the data is in a usable format.
  3. Do a quick exploration to get a sense of what kind of tracks are included in the sample and what their audio attributes are.
  4. Create a few charts to develop a hypothesis for which audio attributes impact popularity.
  5. Test my hypothesis with a few basic data models.
  6. Present the project to my class.
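
To make the cleaning and exploration steps concrete, here is a minimal sketch of what they can look like. It is not the code from my notebook: the rows are made up, the column names (`danceability`, `energy`, `popularity`) are assumptions about the dataset's schema, and it sticks to the Python standard library so it runs anywhere, whereas the real project leaned on data analysis libraries.

```python
import math

# Hypothetical rows standing in for the track sample. The column names are
# assumptions for illustration, not the confirmed schema of the dataset.
raw_tracks = [
    {"name": "A", "danceability": "0.80", "energy": "0.70", "popularity": "72"},
    {"name": "B", "danceability": "0.55", "energy": "0.40", "popularity": "41"},
    {"name": "C", "danceability": "0.30", "energy": "0.90", "popularity": "35"},
    {"name": "D", "danceability": "", "energy": "0.60", "popularity": "50"},  # missing value
    {"name": "B", "danceability": "0.55", "energy": "0.40", "popularity": "41"},  # duplicate
    {"name": "E", "danceability": "0.65", "energy": "0.20", "popularity": "58"},
]

FEATURES = ["danceability", "energy"]
TARGET = "popularity"
NUMERIC = FEATURES + [TARGET]


def clean(rows):
    """Drop exact duplicates and rows with missing values, then cast to float."""
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen or any(row[col] == "" for col in NUMERIC):
            continue
        seen.add(key)
        cleaned.append({**row, **{col: float(row[col]) for col in NUMERIC}})
    return cleaned


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


tracks = clean(raw_tracks)
target_values = [t[TARGET] for t in tracks]
correlations = {
    feature: pearson([t[feature] for t in tracks], target_values)
    for feature in FEATURES
}

# Rank attributes by strength of correlation to suggest a hypothesis to test.
for feature, r in sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{feature}: r = {r:+.2f}")
```

Ranking attributes by the absolute value of their correlation with popularity is one simple way to decide which ones are worth charting and modeling first. Correlation alone does not show that an attribute causes popularity.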

Forming a hypothesis.
Graphing relationships among audio attributes.
Creating a data model with linear regression.
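
For readers who have not seen linear regression before, the sketch below fits a one-attribute model by ordinary least squares and scores it on held-out data. It is an illustration under stated assumptions, not my notebook code: the data is synthetic, `loudness` is an assumed attribute name, the relationship between loudness and popularity is invented for the example, and the helper functions `fit_line` and `r_squared` are hypothetical names written here with the standard library only.

```python
import random

random.seed(0)  # fixed seed so the synthetic data is reproducible

# Synthetic stand-in data: popularity rises with loudness, plus noise.
# Both the attribute name and this relationship are illustrative assumptions.
loudness = [random.uniform(-30.0, 0.0) for _ in range(200)]
popularity = [60.0 + 1.5 * x + random.gauss(0.0, 5.0) for x in loudness]

# Hold out the last 20% of tracks to check the model on data it has not seen.
split = int(0.8 * len(loudness))
x_train, x_test = loudness[:split], loudness[split:]
y_train, y_test = popularity[:split], popularity[split:]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


slope, intercept = fit_line(x_train, y_train)
test_r2 = r_squared(x_test, y_test, slope, intercept)

print(f"popularity ~ {intercept:.1f} + {slope:.2f} * loudness")
print(f"R^2 on held-out tracks: {test_r2:.2f}")
```

Scoring on held-out tracks rather than on the training data gives a more honest read on how well the model would predict popularity for songs it has never seen.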