Predicting track popularity with Spotify data and Python
Data analysis and visualization
This exploratory data analysis of a Spotify dataset was my capstone project for a data science course I took through Brainstation. Given a sample of 160,000 tracks, I applied some of the data science techniques I learned to predict a track’s popularity.
My project is on Kaggle, where you can see my full process, code, visualizations, and commentary. Samples of my original Jupyter notebook are below.
Quick rundown of what I did:
- Imported a few Python libraries made for data analysis, visualization, and modeling.
- Cleaned the dataset to make sure the data was in a usable format.
- Did a quick exploration to get a sense of what kinds of tracks are included in the sample and what their audio attributes are.
- Created a few charts to develop a hypothesis about which audio attributes affect popularity.
- Tested my hypothesis with a few basic models.
- Presented the project in class, listened to feedback, and made revisions.
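
As a rough illustration of the steps above, here is a minimal sketch that runs on synthetic data rather than the actual Spotify sample. The column names (`energy`, `acousticness`, `popularity`), the generated values, and the choice of ordinary least squares as the "basic model" are all assumptions made for illustration; the notebook on Kaggle has the real code.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Spotify sample (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 500
energy = rng.uniform(0, 1, n)
acousticness = rng.uniform(0, 1, n)
# Popularity is built to rise with energy so the sketch has a signal to find.
popularity = 40 + 30 * energy - 10 * acousticness + rng.normal(0, 5, n)
tracks = pd.DataFrame(
    {"energy": energy, "acousticness": acousticness, "popularity": popularity}
)

# Make the data messy on purpose: one missing value and ten duplicated rows.
tracks.loc[20, "energy"] = np.nan
tracks = pd.concat([tracks, tracks.iloc[:10]], ignore_index=True)

# Clean: drop exact duplicates and rows with missing values.
clean_tracks = tracks.drop_duplicates().dropna().reset_index(drop=True)

# Explore: summary statistics and each attribute's correlation with popularity.
summary = clean_tracks.describe()
correlations = clean_tracks.corr()["popularity"].drop("popularity")

# Model: hold out 20% of tracks, then fit a linear regression by least squares.
features = ["energy", "acousticness"]
shuffled = clean_tracks.sample(frac=1, random_state=0)
split = int(len(shuffled) * 0.8)
train, test = shuffled.iloc[:split], shuffled.iloc[split:]


def design_matrix(df):
    """Return the feature matrix with a leading column of ones for the intercept."""
    return np.column_stack([np.ones(len(df)), df[features].to_numpy()])


coef, *_ = np.linalg.lstsq(
    design_matrix(train), train["popularity"].to_numpy(), rcond=None
)

# Evaluate: R^2 on the held-out tracks.
actual = test["popularity"].to_numpy()
predicted = design_matrix(test) @ coef
ss_res = np.sum((actual - predicted) ** 2)
ss_tot = np.sum((actual - actual.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(summary)
print(correlations)
print("held-out R^2:", round(r2, 3))
```

On real data the correlations would guide which attributes to chart and test, and the held-out score shows how far a simple model gets before trying anything more elaborate.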