Data visualisation: what makes a good image?

Data visualisation is one of the fastest-growing fields in computer science, and for good reason: it’s a powerful tool for making sense of information, in both explanatory and exploratory ways. The rise of static and interactive infographics, for instance, is the direct result of a field striving to convey data in a way the average person can easily grasp, and to give insight into complex matters using visual principles that take into account how our brains process information. The science behind meaningful data visualisation lies in behavioural psychology and neuroscience.

One great example of explanatory data visualisation is Al Jazeera’s interactive map of the opposition forces behind the crisis in Syria. The map covers all the rebel opposition groups, describing the areas they operate in, the number of members they have, and the bigger groups they are affiliated with. The situation is so complex that it is very difficult to understand by looking at numbers alone. Placing it on a map makes it easier to grasp and sends a much more powerful message.

Understanding what your readers already know is the first crucial step in creating a good data visualisation. For instance, when looking at a heat map, most readers instantly assume that red means bad and green means good, while the intensity of a colour adds nuance about where a value sits in the range of the data set. The map below plots levels of inequality in the world using the Gini index of each country. Intuitively, the green areas are more equal (they have a lower Gini index) than the red areas, and different intensities of each colour are an extra way to differentiate regions. This approach is very easy to understand, as it conveys information about the whole map at a glance. From there, one can easily infer more, in this case that green areas tend to cluster together geographically, and so do red areas. This sort of pattern could not be easily spotted in a table of numbers, as there would be no visual hint of which regions are geographically close to each other.

 

 

Gini Index by country. Source: wikipedia.org.
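
The colour encoding used by a map like this can be sketched in a few lines of Python. This is only an illustration of the idea, not how the map above was produced: the function name `gini_to_rgb`, the assumed range of 25 to 65, and the sample values are all hypothetical.

```python
def gini_to_rgb(gini, lo=25.0, hi=65.0):
    """Map a Gini index to an (r, g, b) colour.

    Low values (more equal) come out green, high values (less equal) red.
    The lo/hi bounds are an assumed range chosen for illustration.
    """
    # Clamp the value to the assumed range, then rescale it to 0..1.
    t = (min(max(gini, lo), hi) - lo) / (hi - lo)
    red = round(255 * t)          # more red as inequality rises
    green = round(255 * (1 - t))  # more green as inequality falls
    return (red, green, 0)


# Hypothetical values for illustration only, not real country data.
sample = {"Country A": 27.0, "Country B": 45.0, "Country C": 63.0}
for name, gini in sample.items():
    print(name, gini_to_rgb(gini))
```

Because the colour is computed from the value itself, neighbouring countries with similar levels of inequality end up with similar colours, which is what makes geographic clusters visible at a glance.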

To understand how data can be meaningfully used for visualisations, it’s worth looking at how visual processing happens. Even before we have time to focus on an image, the brain takes in information about it – this is called pre-attentive visual processing. When viewing the map above, our brains process the colours, shades, sizes, and shapes even before we pay attention to what we are looking at. And we immediately assume that we are looking at a map that is trying to show information that is “positive” in some areas, and “negative” in others, because we learned the meaning of “red” and “green” at a young age. Once we begin paying attention to the content of the image, our brains start making different connections, seeing patterns, and discovering detail.

Pre-attentive visual processing relies on visual stimuli that we have learned so well that we recognise them without any conscious intention. Behind the scenes, learning happens through the strengthening or weakening of pathways between neurons. Every time you make coffee, for example, the connections between the neurons involved in that process grow stronger. Eventually, if the pathway becomes strong enough, the process becomes a reflex. This is why you don’t have to think through all the steps required to make your morning coffee: it just happens.

As we grow up, we are systematically taught what different signals mean: green and red are used everywhere to signify good and bad, respectively. A traffic light tells us that it’s OK to go when it’s green and that we should stop when it’s red. Teachers mark the mistakes on our papers in red, and hazard signs are almost always red as well. All of these situations incrementally entrench the ideas in our memory by making the pathways that process this information stronger and stronger. The same process occurs with shades of the same colour signifying the intensity of a certain piece of information, or with a tick (✓) signifying “right” and a cross (✗) signifying “wrong”.

Meaningful data visualisation comes about when those designing it understand the different expectations that our brains have and take advantage of them in order to convey more information in fewer words or numbers. At its best, data visualisation can help advance research in numerous fields. For instance, the UCL Genetics Institute is using powerful visualisation techniques to explore the question of “nature versus nurture”, i.e., looking at how much of our behaviour is determined by the environment we grow up in. According to the head of the institute, Dr Oliver Davis, one of the things that they have noticed in exploring the data is that “while in the rest of the UK, genetic differences between people were more important in explaining variation in classroom behaviour problems, London was an ‘environmental hotspot’ for this trait, with variation being largely explained by environmental differences”.

On the other hand, serious negative effects can arise if visualisation designers want to push their own agendas. Knowing how to manipulate data has to come with an ethical responsibility not to skew it. Take, for example, the infographic below about the US healthcare system. It drew criticism because critics believed the designers were trying to convince viewers that the system is unnecessarily complex, rather than to explain how it works. Notice how hard it is to find “patients” and “physicians” in the image, even though one would expect those to be central to a healthcare system. This kind of political bias should not come into play when presenting data, as it steers readers towards specific opinions rather than letting them decide for themselves what to make of it. As such, when creating visual data, it is important to keep objectivity as a top priority.

A representation of the US healthcare system. Source: visual.ly.

The questions that data visualisation can answer are countless, as it makes it possible to find correlations that could not have been seen otherwise. It is currently giving insight into human nature, the complexity of a humanitarian crisis, crime rates in cities and many other issues. In the future, it could be used as a tool for education, for research into treatment patterns in patients with specific diseases, or for strategic planning of humanitarian aid using maps like Al Jazeera’s. As long as there is data, and an understanding of what makes a visual powerful, the possibilities are endless.
