How modern is the MoMA?
Data analysis and visualization
As my capstone project for Parsons’ data visualization certificate, I investigated how the Museum of Modern Art’s collection grew over time.
I used Tableau for cleaning, analyzing, and visualizing data, and Sketch for the final layout.
View the full project here.
Going in questions first
At the time of this project, MoMA’s public dataset on GitHub contained ~139k works, out of ~200k in the full collection. It has basic metadata for each work, including title, artist, date made, medium, dimensions, and date acquired.
While digging through the data, I wanted to ask: how modern is the MoMA?
The fact that there were two time series caught my eye.
- When art was created wasn’t straightforward. Instead of dates, this column was free text where anything goes. It could be date ranges, estimates, or a paragraph of context.
- When MoMA acquired art was comprehensive and cleanly formatted. Not only were there dates, but I could see sources/methods: purchased, donated, gifted by the artist, etc.
data:image/s3,"s3://crabby-images/be582/be582ea723621224fea61d24fbaf1d2cb65c86af" alt="Acquisitions from 1929 to 2020. What are those spikes?"
data:image/s3,"s3://crabby-images/4283c/4283c24ac8c4c70b161db12c2d83c340f2a49290" alt="How the museum acquires art."
data:image/s3,"s3://crabby-images/66654/6665403a309b25de64a217a0b7ed65b4a8166fad" alt="Breakdown by department."
Pulling quantitative data from free text
It’d be interesting to compare the two time series, but first I needed to get the dates created formatted as years (numerical data).
The data cleaning process was trial and error, and I spotted a few patterns in how dates were written. Spacing, punctuation, and formatting were surprisingly consistent, so these “rules” helped me create a script that splices and concatenates text.
Eventually, I was able to:
- Get the year created for 97.5% of the records. For the remaining 2.5%, there wasn’t a known year or date range to begin with.
- Calculate the gap between dates acquired and created for 92.7% of records. The remaining 4.8% had only one of the two.
Fortunately, this is the Museum of Modern Art — there’s a limited date range, so checking for mistakes was easy.
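The original cleaning was done in Tableau, so the following is only a rough Python sketch of the same idea, not the script used in the project. It pulls the first plausible four-digit year out of a free-text date and computes the gap between creation and acquisition. The function names, the sample strings, and the `MIN_YEAR`/`MAX_YEAR` bounds are all assumptions made for illustration.

```python
import re

# Assumed plausible range for the collection; used as a sanity check.
MIN_YEAR, MAX_YEAR = 1700, 2025

# A run of exactly four digits, not embedded in a longer number.
YEAR_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_year(date_text):
    """Return the first plausible four-digit year in a free-text date, or None."""
    if not date_text:
        return None
    for match in YEAR_PATTERN.finditer(date_text):
        year = int(match.group(1))
        if MIN_YEAR <= year <= MAX_YEAR:
            return year
    return None


def acquisition_gap(date_created_text, date_acquired_text):
    """Years between creation and acquisition, or None if either year is missing."""
    created = extract_year(date_created_text)
    acquired = extract_year(date_acquired_text)
    if created is None or acquired is None:
        return None
    return acquired - created


# Hypothetical examples of the kinds of strings a free-text date column might hold.
for text in ["1896", "c. 1917", "1976-77", "(1950-53)", "Unknown"]:
    print(repr(text), "->", extract_year(text))
```

Taking the first year of a range is a simplification; a midpoint or the last year would be equally defensible depending on the question being asked.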
Finding a story within the data
I then looked into the relationship between these variables:
- The year something was created
- The year MoMA acquired it
- What kind of work it is, based on department. Medium and materials were so varied and specific that classification didn’t say much.
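One way to relate these three variables is to summarize the gap between year created and year acquired for each department. The sketch below is a minimal Python illustration under assumed inputs: the record layout `(department, year_created, year_acquired)`, the function name, and the sample values are all made up for the example and are not taken from the MoMA dataset or the Tableau workbook.

```python
from collections import defaultdict
from statistics import median


def median_gap_by_department(records):
    """Median years between creation and acquisition, grouped by department.

    Each record is a (department, year_created, year_acquired) tuple.
    Records missing either year are skipped.
    """
    gaps = defaultdict(list)
    for department, year_created, year_acquired in records:
        if year_created is None or year_acquired is None:
            continue
        gaps[department].append(year_acquired - year_created)
    return {dept: median(values) for dept, values in gaps.items()}


# Made-up sample rows, for illustration only.
sample = [
    ("Photography", 1900, 1970),
    ("Photography", 1950, 1960),
    ("Photography", 1980, 1985),
    ("Film", 1995, 2000),
    ("Film", None, 2001),  # skipped: no known year created
]

print(median_gap_by_department(sample))
```

The median is used here rather than the mean so that a handful of very old works does not dominate a department’s summary.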
I then created a dashboard to present my findings to the class. Their comments helped me pick out interesting bits:
- Year created was more varied than expected. If MoMA acquired something 100 years old, is it still modern?
- There were a few acquisition spikes. What did the MoMA acquire in (very) large bulk? Where did it come from?
- When data was broken down by department, there were some drastic differences I wanted to call out.
View the interactive dashboard here.
data:image/s3,"s3://crabby-images/52246/522469f00985a8377c7c131e6b3350db9801e4f1" alt="Page 1 is an overview of the collection."
data:image/s3,"s3://crabby-images/f123f/f123ffea93cd53e0f1fbbf3c89a148b5722814a6" alt="Page 2 breaks down time trends by department."
Putting the story together
Starting with sketching helped me figure out what kind of story to tell.
- Write a general outline with a few questions to answer
- Draw some primary visualizations
- Pick a few ideas to develop into wireframes
- Add supporting visualizations and annotations
- Alternate between writing and visual design until it’s done.
data:image/s3,"s3://crabby-images/698fc/698fc16cb27cdff303f9673399f9e421dae3288a" alt="Starting with loose sketches."
data:image/s3,"s3://crabby-images/6254a/6254a7ad3c6e2a3106c29fce3a75fe0ea63ea3ec" alt="Adding structure and details."
The final design
I worked some of the potential ideas above into the final narrative, either as their own sections or as interesting annotations:
- Introduce MoMA, the data set, and the collection.
- What kind of art and design are in the collection? Which departments are represented?
- How did the collection change over time? Which departments came and went?
- How modern is the collection? How modern is each department?
View the project here.
data:image/s3,"s3://crabby-images/6b52c/6b52c36c2fc0d352cf39c94fee4cca173c47bfc7" alt="The final layout."