How modern is the MoMA?
Data analysis and visualization
As my capstone project for Parsons’ data visualization certificate, I investigated how the Museum of Modern Art’s collection grew over time.
Data cleansing, analysis, and visualization done with Tableau. Visual design in Sketch.
View the full project here.
![](https://freight.cargo.site/t/original/i/d4ab97cc16b5c07415561787064d59a23b4be836e00ea5d269360fb93523c1aa/MoMA.png)
Going in questions first
At the time of this project, MoMA’s public dataset on GitHub contained ~139k works, out of ~200k in the full collection. It has basic metadata for each work, including title, artist, date made, medium, dimensions, and date acquired.
While digging through the data, I wanted to ask: how modern is the MoMA?
The fact that there were two time series caught my eye.
- When art was created wasn’t straightforward. Instead of dates, this column was free text where anything goes. It could be date ranges, estimates, or a paragraph of context.
- When MoMA acquired art was comprehensive and cleanly formatted. Not only were there dates, but I could see sources/methods: purchased, donated, gifted by the artist, etc.
![Acquisitions from 1929 to 2020. What are those spikes?](https://freight.cargo.site/t/original/i/06e72f262c1e03ae6ab157f2ac44ad86482fb391abb25edc23510893d8a85a1e/MoMA_Graph-1.png)
![How the museum acquires art.](https://freight.cargo.site/t/original/i/ffee055417a303f0506e11f78273d63ba71f8cb2c474b07659acb330f490bf57/MoMA_Graph-2.png)
![Breakdown by department.](https://freight.cargo.site/t/original/i/642ec2609640fb853bfa1784007f494302a34222c5a6b49562de56f4a36b02e9/MoMA_Graph-3.png)
Pulling quantitative data from free text
It’d be interesting to compare the two time series, but I needed to get dates created formatted as years (numerical data) first.
The data cleaning process was trial and error, and I spotted a few patterns in how dates were written. Spacing, punctuation, and formatting were surprisingly consistent, so these “rules” helped me create a script that splices and concatenates text.
Eventually, I was able to:
- Get the year created for 97.5% of the records. For the remaining 2.5%, there wasn’t a known year or date range to begin with.
- Calculate the gap between dates acquired and created for 92.7% of records. The remaining 4.8% had only one of the two.
Fortunately, this is the Museum of Modern Art — there’s a limited date range, so checking for mistakes was easy.
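The cleaning itself was done in Tableau and isn't reproduced here. As a rough illustration of the same idea in Python, the sketch below pulls the first four-digit year out of a free-text date and flags values outside an expected range. The names `YEAR_RE`, `extract_year`, and `out_of_range` are hypothetical, taking the first year of a range is an assumption, and the default bounds are placeholders rather than values from the project.

```python
import re

# Matches a standalone four-digit year from 1000 to 2099.
# The lookarounds keep it from matching inside longer digit runs.
YEAR_RE = re.compile(r"(?<!\d)(1\d{3}|20\d{2})(?!\d)")


def extract_year(text):
    """Return the first four-digit year found in a free-text date, or None.

    Assumption: for ranges such as "1976-77", the first year is used.
    """
    if not text:
        return None
    match = YEAR_RE.search(text)
    return int(match.group(1)) if match else None


def out_of_range(year, earliest=1800, latest=2020):
    """Flag a year for manual review if it falls outside the expected range.

    The default bounds are placeholders, not values from the project.
    """
    return year is not None and not (earliest <= year <= latest)
```

Records where `extract_year` returns `None` correspond to entries with no known year, and anything `out_of_range` flags is a candidate for a manual check.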
Finding a story within the data
I then looked into the relationship between these variables:
- The year something was created
- The year MoMA acquired it
- What kind of work it is, based on department. Medium and materials were so varied and specific that classification didn’t say much.
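One way to relate these three variables is to compute the gap between acquisition and creation, then summarize it per department. This is a minimal sketch in plain Python, not the Tableau calculation used in the project. The record keys `year_created`, `year_acquired`, and `department` are hypothetical field names, and using the median as the summary is an assumption.

```python
from collections import defaultdict
from statistics import median


def acquisition_gap(year_created, year_acquired):
    """Years between creation and acquisition, or None if either is missing."""
    if year_created is None or year_acquired is None:
        return None
    return year_acquired - year_created


def median_gap_by_department(records):
    """Map each department to the median gap across its usable records.

    `records` is an iterable of dicts with the hypothetical keys
    "department", "year_created", and "year_acquired".
    Records missing either year are skipped.
    """
    gaps = defaultdict(list)
    for rec in records:
        gap = acquisition_gap(rec.get("year_created"), rec.get("year_acquired"))
        if gap is not None:
            gaps[rec["department"]].append(gap)
    return {dept: median(values) for dept, values in gaps.items()}
```

A small gap means a work entered the collection soon after it was made, while a large one points to older work acquired much later.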
I then created a dashboard to present my findings to the class. Their comments helped me pick out interesting bits:
- Year created was more varied than expected. If MoMA acquired something 100 years old, is it still modern?
- There were a few acquisition spikes. What did the MoMA acquire in (very) large bulk? Where did it come from?
- When data was broken down by department, there were some drastic differences I wanted to call out.
View the interactive dashboard here.
![Page 1 is an overview of the collection.](https://freight.cargo.site/t/original/i/71a7ebd42b4bf3a1fd26f78820b2cd88d15c2af9be367b46b62abc69af8c5da6/MoMA_Dashboard-1.png)
![Page 2 breaks down time trends by department.](https://freight.cargo.site/t/original/i/aa30f4ead717f157b765c704aee0880bcfb805f6f11a2f0096279a25fd90f458/MoMA_Dashboard-2.png)
Putting the story together
Starting with sketching helped me figure out what kind of story to tell.
- Write a general outline with a few questions to answer
- Draw some primary visualizations
- Pick a few ideas to develop into wireframes
- Add supporting visualizations and annotations
- Alternate between writing and visual design until it’s done.
![Starting with loose sketches.](https://freight.cargo.site/t/original/i/7835882f010ee1ca3a8d7a19d85c421fac1fafffb99885e2d628ee8418f3b303/MoMA_Sketches1.png)
![Adding structure and details.](https://freight.cargo.site/t/original/i/2a1bd3c60b242ab4acc64696b487fd457176a23f48232f440427fd041060d8fb/MoMA_Sketches-2.png)
The final design
I worked some of the potential ideas above into the final narrative, either as their own sections or as interesting annotations:
- Introduce MoMA, the data set, and the collection.
- What kind of art and design are in the collection? Which departments are represented?
- How did the collection change over time? Which departments came and went?
- How modern is the collection? How modern is each department?
View the project here.
![The final layout.](https://freight.cargo.site/t/original/i/c309a19800416191be685efdcc328b1c9fc37ef9eac26a34f283b5256e982955/MoMA_Final.png)