Studying cultural variation in recollections of sociopolitical events is crucial for achieving diverse understandings of such events. To date, most studies in this area have analyzed variation in the texts that describe events. Here, we analyze variation in image usage across Wikipedia language editions to understand whether, like text, visual narratives reflect distinct perspectives in articles about culturally tethered events. We focus on articles about coups d’état as an example of highly contextual sociopolitical events that are likely to display such variation. The key challenge in examining variation in images is that no existing framework provides a basis for comparison. To address this challenge, we use an iterative inductive coding process to arrive at a 46-item typology for categorizing the content of images relating to contested sociopolitical events, and a typology of network motifs that characterizes structural patterns of image use. We apply these typologies in a large-scale quantitative analysis that establishes clusters of image themes, two detailed qualitative case studies comparing Wikipedia articles on coups d’état in Soviet Russia and Egypt, and four quantitative analyses clustering image themes by language usage at the article level. These analyses document variation in imagery around particular events as well as variation in tendencies across cultures. We find substantial cultural variation in both image content and network structure. This study presents a novel methodological framework for uncovering culturally divergent perspectives on political crises through imagery on Wikipedia.