Artists’ bodies of work are like icebergs: We almost never see more than the 10% sticking above the water. For Edvard Munch, that means most folks are familiar with The Scream, and maybe a dozen or so of his other 1,800+ paintings, but the rest remain unseen. How much can we really know about an artist by seeing only a handful of his works? If you only saw The Scream you might think all of Munch's paintings were of genderless zombies doing Macaulay Culkin impressions, as I did. But as we will see, this is far from the case.
We put all of Edvard Munch's paintings into a searchable online database so that art lovers can see the entire iceberg, not just what shows above the water.
In this blog post, we share our experience using machine learning to tag objects within the paintings in an attempt to expand search capabilities beyond traditional metadata like title, dimensions, materials, etc.
What Should We Search For?
To help us decide which features we should search for within the paintings, we created a word cloud from the titles of Munch’s 1,800+ paintings to see if it would bring out any key themes.
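A word cloud is driven by word counts, so the counting step can be sketched in a few lines of Python. This is a minimal illustration, not the tool we actually used: the function name, the short stop-word list, and the four sample titles (taken from paintings mentioned in this post) are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "in", "on", "of", "and", "with", "by", "at", "to"}

def title_word_frequencies(titles):
    """Count how often each non-stop-word appears across painting titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and keep alphabetic words only.
        words = re.findall(r"[a-z]+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# A tiny sample; the real input would be all 1,800+ titles.
sample_titles = [
    "The Scream",
    "Two Women in White on the Beach",
    "Street Workers in Snow",
    "Helge Rode",
]
freqs = title_word_frequencies(sample_titles)
print(freqs.most_common(5))
```

The most frequent words are the ones a word cloud draws largest, which makes them natural candidates for thematic grouping.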
We then grouped the words with the highest frequencies into thematic clusters, narrowing it down to eight features for our machine learning model:
Person/No Person
Female/Male
Clothed/Nude
Standing/Seated
Interior/Exterior
Trees/Snow/Water
Buildings and/or Vehicles
And for fun: Hat/No Hat
Using Clarifai's General Model for Image Recognition
The folks at Clarifai have been very supportive of our early attempts at image recognition, so we decided to start with their general model for image recognition. The general model "recognizes over 11,000 different concepts including objects, themes, moods, and more."
Clarifai's general model is primarily designed for image recognition in photography and video as opposed to paintings, which led to some interesting results. The model produced a high frequency of tags like "mural" (199), "abstract" (352), "image" (338), "portrait" (279), and "texture" (270), which, while accurate, are not particularly useful to us. There were also a lot of false-positive tags around spirituality, like "God" (435), "religion" (974), and "veil" (381).
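Tallies like the ones above come from counting how many paintings received each tag, and the same pass can set aside tags we already know are unhelpful. The sketch below is only an illustration: it does not call Clarifai, and the data shape (title mapped to a list of tag and score pairs), the 0.9 score cutoff, and the sample scores are all assumptions.

```python
from collections import Counter

# Tags this post reports as accurate but unhelpful, or as false positives.
IGNORED_TAGS = {"mural", "abstract", "image", "portrait", "texture",
                "god", "religion", "veil"}

def tag_frequencies(predictions, min_score=0.9, ignored=IGNORED_TAGS):
    """Count how many paintings received each tag at or above min_score."""
    counts = Counter()
    for tags in predictions.values():
        for tag, score in tags:
            name = tag.lower()
            if score >= min_score and name not in ignored:
                counts[name] += 1
    return counts

# Hypothetical predictions: painting title -> list of (tag, score) pairs.
sample_predictions = {
    "Two Women in White on the Beach": [
        ("religion", 0.97), ("veil", 0.95), ("woman", 0.92),
    ],
    "Street Workers in Snow": [
        ("abstract", 0.96), ("people", 0.91), ("woman", 0.40),
    ],
}
useful = tag_frequencies(sample_predictions)
print(useful.most_common())
```

Filtering this way hides the noise but does not fix it, which is part of why we moved on to a custom model.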
Munch's Two Women in White on the Beach below is a good example of a painting that returned high prediction scores for art terms that were not useful and religious terms that were not accurate.
Side note, I once made this same mistake after not reading the homework in art history class. I suggested to the class that the moon with its reflection in the water was symbolic of the Crucifixion. The whole class laughed out loud as my professor corrected me, letting me know it is commonly understood to symbolize a phallus and that it was clear I did not do the assigned reading. Apparently the machine learning model did not do the reading, either.
We thought we might be able to improve the accuracy of the general model for our purposes and reduce the less relevant tags by building a custom model within Clarifai trained on paintings instead of photographs.
Building a Custom Image Recognition Model
Clarifai makes it easy to train a custom image recognition model using machine learning in a user-friendly interface. We started by dumping 700 of Edvard Munch's paintings into Clarifai's custom model builder. They recommend you tag 10 images (50 images for best results) with the concept you are looking for, as well as several images without the concept. Our first concept was trees, so we tagged paintings with and without them. This was quick and easy to do in the interface, and no programming was required. We saw positive results right away, with our custom Munch model correctly identifying many paintings as depicting trees.
Encouraged, we then went on to other concepts, like hats.
Recognizing Hats: A Hairy Problem
Using image recognition to identify hats turned out to be tricky. The problem with hats is differentiating what is hair from what is a hat. Not an easy task in many of Munch's paintings. Take his portrait of Helge Rode below from 1908 as an example.
Is he wearing a hat? Our custom model put it at about a 30% chance and to be honest we'd tend to agree. It was not until we looked up a photograph of Helge Rode (above right) that we realized Mr. Rode just had a very impressive and unusual head of hair and was likely not wearing a hat. Paintings like this that required manual intervention helped us sympathize with what the model was up against.
Below you can see the three paintings that scored highest for "hat" by our model.
Snow was another troublesome concept. As with hats, the snow concept produced a lot of false positives. Pretty much any painting with large patches of white, as well as early sketches and unfinished works, was thought to have snow. There were other cases where I could not tell whether there was snow using my own eyes. The model gave the painting below a 25% chance of containing snow. I couldn't say for sure just by looking. Fortunately, the title "Street Workers in Snow" helped to disambiguate. As my wife put it, the machine learning model was magic, but not "that" magic, meaning there will always be limitations.
For reference, the three highest-scoring paintings for snow are below.
In addition to tagging trees, hats, and snow, we had reasonable success detecting men, women, buildings, boats, and water.
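Finding the highest-scoring paintings for a concept like "hat" or "snow" is just a sort on the model's score, and borderline scores can be flagged for a human to check. Here is a rough sketch under stated assumptions: the score dictionary layout, the review band of 0.2 to 0.8, and the "Painting A/B/C" placeholder entries are invented for illustration. Only the 30% hat score for Helge Rode and the 25% snow score for Street Workers in Snow come from our results.

```python
import heapq

def top_paintings(scores, concept, n=3):
    """Return the n (title, score) pairs with the highest score for a concept."""
    ranked = ((title, s[concept]) for title, s in scores.items() if concept in s)
    return heapq.nlargest(n, ranked, key=lambda pair: pair[1])

def needs_review(score, low=0.2, high=0.8):
    """Flag scores too uncertain to trust without a human look (assumed band)."""
    return low <= score <= high

# Placeholder scores; "Painting A/B/C" are not real titles.
sample_scores = {
    "Helge Rode": {"hat": 0.30},
    "Street Workers in Snow": {"snow": 0.25, "hat": 0.10},
    "Painting A": {"snow": 0.98, "hat": 0.05},
    "Painting B": {"snow": 0.91},
    "Painting C": {"snow": 0.88},
}
top_snow = top_paintings(sample_scores, "snow")
print(top_snow)
print(needs_review(sample_scores["Helge Rode"]["hat"]))
```

With a review band like this, both the Helge Rode portrait and Street Workers in Snow would land in the pile we had to check by hand.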
Why Does This Matter?
Tagging objects within paintings makes it possible to curate similar works across the artist's entire output with a simple search. Let's say you wanted to see all the paintings by Munch depicting women. If you were to search based on titles only, here's what you'd get:
Frequency of Terms in Painting Titles
- 54 paintings including "woman"
- 17 paintings including "women"
- 19 paintings including "girl"
- 14 paintings including "female"
--------------------
- 104 total
With our added tags, searching for "female" brings up 542 paintings with women, more than five times as many results. As the model improves, we expect this number to keep growing.
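The difference between the two kinds of search, and the arithmetic behind the comparison, can be sketched as follows. This is an illustration only: the record layout, the function names, and the three sample records (including the made-up "Hypothetical Portrait") are assumptions, not our database schema.

```python
TITLE_TERMS = ("woman", "women", "girl", "female")

def title_matches(paintings, terms=TITLE_TERMS):
    """Paintings whose title contains any of the search terms."""
    return [p for p in paintings
            if any(t in p["title"].lower() for t in terms)]

def tag_matches(paintings, tag):
    """Paintings carrying a given machine-generated tag."""
    return [p for p in paintings if tag in p["tags"]]

# Illustrative records; the tags here are invented for the example.
sample = [
    {"title": "Two Women in White on the Beach", "tags": {"female", "water"}},
    {"title": "Hypothetical Portrait", "tags": {"female"}},
    {"title": "Helge Rode", "tags": {"male"}},
]
print(len(title_matches(sample)), len(tag_matches(sample, "female")))

# The counts reported above: title hits versus tag hits.
title_total = 54 + 17 + 19 + 14
ratio = 542 / title_total
print(title_total, round(ratio, 1))
```

The second record shows the gap: nothing in its title says a woman is depicted, so only the tag search finds it.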
Not only will this make it easier to locate works using common terms like "trees" or "water," but it also opens the door to new analytical comparisons. Did the ratio of men to women Munch painted shift over his career? Did he create more paintings of the indoors or the outdoors? How does his indoor/outdoor ratio compare to Van Gogh's? How many Munch paintings contain fruit? What type of fruit? How many paintings are of nudes vs. clothed people? Are there more nudes of women than men? If so, by how many?
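Once every painting carries tags and a date, a question like the men-to-women ratio over time becomes a simple grouping exercise. The sketch below assumes a hypothetical record layout with a "year" and a set of "tags"; the three sample records are placeholders for illustration, not real catalog data.

```python
from collections import defaultdict

def gender_counts_by_decade(paintings):
    """Count paintings tagged 'male' and 'female' in each decade."""
    counts = defaultdict(lambda: {"male": 0, "female": 0})
    for p in paintings:
        decade = (p["year"] // 10) * 10
        for tag in ("male", "female"):
            if tag in p["tags"]:
                counts[decade][tag] += 1
    return dict(counts)

# Placeholder records for illustration only.
sample_records = [
    {"year": 1893, "tags": {"female"}},
    {"year": 1895, "tags": {"male", "female"}},
    {"year": 1908, "tags": {"male"}},
]
by_decade = gender_counts_by_decade(sample_records)
for decade in sorted(by_decade):
    print(decade, by_decade[decade])
```

The same pattern would work for interior vs. exterior or nude vs. clothed by swapping the pair of tags being counted.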
Conclusion
It turns out my favorite paintings by Munch, an artist known for his psychologically painful images of death and suffering, are his brilliant sunrises. They make me happy every time I look at them, yet I'd have never known they exist without the opportunity to search deeper across all of Munch's paintings. Hopefully the database will help other people explore and discover new favorite paintings and develop a more nuanced appreciation for Edvard Munch and his artwork.
What features would you like to see tagged in the Munch paintings and in paintings by artists that will be added to the database in the future? Do you have recommendations on how to improve the accuracy of the model? Join the Artnome community to ask questions, weigh in on topics, or help us improve the database.