Predicting track popularity with Spotify data and Python
Data analysis and visualization
This exploratory data analysis of a Spotify dataset was my capstone project for a data science course I took through Brainstation. Given a sample of 160,000 tracks, I applied some of the data science techniques I learned to predict a track’s popularity.
My project is on Kaggle, where you can see my full process, code, visualizations, and commentary. Samples of my original Jupyter notebook are below.
Quick rundown of what I did:
- Import a few Python libraries made for data analysis, visualizations, and modeling.
- Clean the dataset to make sure the data is in a usable format.
- Do a quick exploration to get a sense of what kind of tracks are included in the sample and what their audio attributes are.
- Create a few charts to develop a hypothesis for which audio attributes impact popularity.
- Test my hypothesis with a few basic data models.
- Present the project in class, listen to feedback, and make revisions.
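The clean, explore, and model steps above can be sketched roughly as follows. This is a minimal illustration, not the code from my notebook: the data is synthetic (generated on the spot rather than loaded from the Spotify sample), the column names `energy`, `acousticness`, `loudness`, and `popularity` are assumptions, and the relationship between them is made up so the regression has something to find. The model is an ordinary least-squares fit done with NumPy; a library such as scikit-learn would be a common alternative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the track sample. Column names are assumptions.
rng = np.random.default_rng(0)
n = 500
tracks = pd.DataFrame({
    "energy": rng.uniform(0, 1, n),
    "acousticness": rng.uniform(0, 1, n),
    "loudness": rng.uniform(-30, 0, n),
})
# Made-up relationship plus noise, only so the example has a signal to fit.
tracks["popularity"] = (
    30 + 25 * tracks["energy"] - 15 * tracks["acousticness"]
    + rng.normal(0, 5, n)
).clip(0, 100)

# Inject one duplicate row and one missing value so cleaning has work to do.
tracks = pd.concat([tracks, tracks.iloc[[0]]], ignore_index=True)
tracks.loc[1, "energy"] = np.nan

# Clean: drop exact duplicates and rows with missing values.
clean = tracks.drop_duplicates().dropna().reset_index(drop=True)

# Explore: summary statistics and each attribute's correlation with popularity.
summary = clean.describe()
correlations = clean.corr()["popularity"].drop("popularity").sort_values()
print(correlations)

# Model: shuffle, hold out 20% of rows, fit a linear regression on the rest.
features = ["energy", "acousticness", "loudness"]
shuffled = clean.sample(frac=1, random_state=0)
split = int(len(shuffled) * 0.8)
train, test = shuffled.iloc[:split], shuffled.iloc[split:]


def design_matrix(frame):
    """Return the feature matrix with a leading column of ones (intercept)."""
    return np.column_stack([np.ones(len(frame)), frame[features].to_numpy()])


# coef[0] is the intercept; coef[1:] follow the order of `features`.
coef, *_ = np.linalg.lstsq(
    design_matrix(train), train["popularity"].to_numpy(), rcond=None
)

# Score the held-out rows with R^2 (share of variance explained).
predicted = design_matrix(test) @ coef
actual = test["popularity"].to_numpy()
r2 = 1 - np.sum((actual - predicted) ** 2) / np.sum((actual - actual.mean()) ** 2)
print(dict(zip(["intercept"] + features, coef.round(2))), round(r2, 3))
```

Holding out part of the data before fitting keeps the score honest: the model is judged on tracks it has not seen, which is closer to what predicting popularity for a new track would look like.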
[Figure: Forming a hypothesis.]
[Figure: Graphing relationships among audio attributes.]
[Figure: Creating a data model with linear regression.]