Predicting track popularity with Spotify data and Python
Data analysis and visualization
This exploratory data analysis of a Spotify dataset is my capstone project for a data science course I took through BrainStation. Working from a sample of 160,000 tracks, I applied some of the data science techniques I learned to predict a track’s popularity from its audio attributes.
My project is on Kaggle, where you can see my full process, code, visualizations, and commentary. Samples of my original Jupyter notebook are below.
Quick rundown of what I did:
- Import a few Python libraries made for data analysis, visualizations, and modeling.
- Clean the dataset to make sure the data is in a usable format.
- Do a quick exploration to get a sense of what kind of tracks are included in the sample and what their audio attributes are.
- Create a few charts to develop a hypothesis for which audio attributes impact popularity.
- Test my hypothesis with a few basic data models.
- Present the project in class, listen to feedback, and make revisions.
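
To give a feel for the clean, explore, and model steps in the rundown, here is a minimal sketch in plain Python. It is not the code from my notebook. The rows are synthetic stand-ins generated on the spot, the column names (`energy`, `acousticness`, `popularity`) are assumed for illustration, and the toy relationship between them is made up. It also uses only the standard library so it runs anywhere, whereas a real notebook would more likely lean on dedicated data analysis and modeling libraries.

```python
import random
from statistics import mean

random.seed(0)

# Synthetic stand-in for the track sample. Column names and the
# relationship between them are assumptions for illustration only.
rows = []
for _ in range(200):
    energy = random.random()
    acousticness = random.random()
    popularity = 40 + 30 * energy - 20 * acousticness + random.gauss(0, 5)
    rows.append({"energy": energy, "acousticness": acousticness,
                 "popularity": popularity})
# One deliberately broken row so the cleaning step has something to do.
rows.append({"energy": None, "acousticness": 0.5, "popularity": 50.0})


def clean(records):
    """Keep only rows where every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Clean.
cleaned = clean(rows)

# Explore: how strongly does each audio attribute track popularity?
target = [r["popularity"] for r in cleaned]
correlations = {
    name: pearson([r[name] for r in cleaned], target)
    for name in ("energy", "acousticness")
}
best = max(correlations, key=lambda name: abs(correlations[name]))

# Model: fit on the first 80% of rows, score on the held-out 20%.
split = int(len(cleaned) * 0.8)
train, test = cleaned[:split], cleaned[split:]
slope, intercept = fit_line([r[best] for r in train],
                            [r["popularity"] for r in train])

predicted = [slope * r[best] + intercept for r in test]
actual = [r["popularity"] for r in test]
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean(actual)) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot

print(correlations)
print(f"best attribute: {best}, slope: {slope:.2f}, R^2: {r_squared:.2f}")
```

The correlation step stands in for the charts that suggest a hypothesis, and the held-out R² is one simple way to check whether that hypothesis holds up on tracks the model has not seen.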


