An Alternative for Better Visualization

In this part, we introduce two graphical tools that may help you tell the story in your data more clearly.

Tree Map

A tree map is a chart that shows how shares of some amount relate to the total. It belongs to the part-to-whole family of charts, whose most famous member is the pie chart.

The tree map was originally developed by Ben Shneiderman at the University of Maryland. It divides a square or rectangle into smaller rectangles to illustrate a hierarchy or a part-to-whole relationship among a large number of categories. In other words, the tree map can be thought of as a rectangular version of a pie chart.

Use a tree map when your data is hierarchical and precise comparisons between categories are not important. Today, tree maps are also often used for categorical data in general.

In a tree map, the rectangles are arranged from largest to smallest, and color can be used to encode a second dimension.
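To make the idea of encoding shares as rectangle areas concrete, here is a minimal Python sketch. It is not how Tableau lays out its tree maps; it is a simple recursive split that places the largest remaining category first and alternates between vertical and horizontal cuts. The function name `treemap_layout` and the category values are made up for illustration.

```python
def treemap_layout(values, x=0.0, y=0.0, width=100.0, height=100.0):
    """Return {name: (x, y, w, h)} with areas proportional to positive values.

    Simple split layout (an assumption for illustration, not Tableau's
    algorithm): the largest remaining item takes a strip along the longer
    side of the remaining rectangle.
    """
    items = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    remaining = sum(v for _, v in items)
    rects = {}
    for name, value in items:
        share = value / remaining  # share of the area that is still free
        if width >= height:
            w = width * share      # vertical cut: take a strip on the left
            rects[name] = (x, y, w, height)
            x += w
            width -= w
        else:
            h = height * share     # horizontal cut: take a strip on top
            rects[name] = (x, y, width, h)
            y += h
            height -= h
        remaining -= value
    return rects


# Hypothetical categories; the values are illustrative only.
layout = treemap_layout({"A": 50, "B": 30, "C": 20})
for name, (rx, ry, rw, rh) in layout.items():
    print(name, "position:", (rx, ry), "size:", (rw, rh), "area:", rw * rh)
```

Each rectangle's area is the category's share of the total canvas, which is exactly the part-to-whole reading described above.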

Example: The tree map looked similar to the one below. The intent was to compare returns activity (measured by volume of dollars claimed and number of returns) across the customer base. The % of returns is encoded by the size of the rectangles, while the % of dollars is encoded by color, through differences in both hue and intensity.

We can make a few general conclusions from this visual: three customers (A, B, and C) account for roughly half of the return activity (visually estimated from the size of the rectangles), and these same three are also pretty drastically different in terms of dollars claimed (judged by the intensity of color, although without a color legend it’s difficult to discern much more than that).

Note that this visual might work well in an exploratory phase of analysis, where the analyst is sorting through the data to find what might be interesting to stakeholders. However, it doesn’t work so well when communicating what’s interesting in this data.  

What if I asked you which are the top 10 customers by dollars claimed? Or what’s the exact difference between Customer A and B’s % of returns? We can see how easily a treemap breaks down when trying to make these comparisons. In this case, you can consider using a bar chart as a remedy. (Ref: https://www.storytellingwithdata.com/blog/2018/6/5/an-alternative-to-treemaps)

In Tableau, you need at least one dimension (categorical variable) and one measure (numerical variable). After selecting these two variables, you can draw your tree map using the Show Me button.

Consider the Forbes2000 data, where each observation is one of the world’s biggest companies, ranked by sales, profits, assets and market value in 2000.

Now, we would like to compare the market value of the companies, with both the color and the size of the rectangles determined by market value.

You can add more than one dimension or measure. For example, let’s split this hierarchical structure by country. In a tree map, the variables are dragged and dropped onto the Marks window.

To make the countries easier to tell apart, you can fill the rectangles with a different color for each country by dragging and dropping the country variable onto Color.

We can draw several conclusions from this map. For example, almost half of the total market value belongs to companies in the USA, where General Electric is the most valuable company in 2000. Of course, Apple did not have the iPhone that year :) In other words, the total market value of the companies in the remaining 60 countries is almost equal to the total market value of the companies in the USA.

You can also color your treemap by other numerical variables and split it into more categories if you wish. Other customization steps, such as labels and titles, apply here as well.

Exercise 1

Please connect the “co2.xlsx” data through the Tableau Connect panel. Then, visualize the latest CO2 emissions across countries (LOCATION column).

Slope Chart

In some cases, it may not be necessary to show all of the data in your time series. In such cases, a slope chart, first introduced by Edward Tufte in 1983, is a useful alternative.

A slope chart is a simple way to compare two, or sometimes more, values over a given period of time. When analyzing data generated over a period, a slope chart is a convenient way to represent the evolution of one or more dimensions from a starting point A to an end point B.

Source: Dairo Weitz

The figure above shows the evolution in the male obesity rate for different Latin American countries during the period 1985–2016. Clearly, in countries with lower per capita income (Bolivia, Haiti) a similar slope to that of the countries with higher GDP (Argentina, Uruguay) can be observed.

Note that the slope chart can be styled in many different ways. We can use two colors to denote the increase and decrease. We can include or exclude labels for levels and changes. We can even adjust the thickness of the line to correspond to a third variable.

Now, consider the gapminder data, which shows life expectancy, GDP per capita, and population by country over the years.

We would like to see how life expectancy changed in European countries between 1952 and 2007. Before starting the visualization, we can change the data type of the year variable from number to date to simplify the later steps.

Since we are interested only in European countries, we can filter the data by dragging and dropping the continent variable onto the Filters window.

To create a standard slope chart, add year to Columns, add the measure (life expectancy) to Rows, and add the category (country) to Detail.

Since we would like to show only two time points, apply another filter on the year variable.

Then, add country names by dragging and dropping the country variable onto Label. Here is a typical slope chart.

However, we are not done yet.

Then, drag and drop this new variable onto the left panel, where your variables are listed.

As you might guess, labels now appear at both the beginning and the end of each line. To adjust this, open the Label window again, select Line Ends, and unselect “Allow labels to overlap other marks”, since we have many countries here. If you have fewer categories, you can select this option instead.

To add circles to our slope chart, we need a dual axis in Tableau. First, drag and drop your numerical variable onto Rows a second time, then select Dual Axis.

Then, make sure that you synchronize both axes.

You now have two Marks cards on your left panel. Select Circle as the mark type of the second one to add circles.

Then, remove your second axis.

Oops, something is still wrong.

If you face such a ranking problem, reverse the axis and hide the header.

Here is the final result.
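The data preparation behind the steps above (keep one continent, keep two years, and pair the start and end value of each country) can also be sketched outside Tableau. The following Python snippet is only an illustration: the column names `country`, `continent`, `year` and `lifeExp` are assumed, and the sample rows contain placeholder values, not real gapminder figures.

```python
from collections import defaultdict


def slope_endpoints(rows, start_year, end_year, continent=None):
    """Return (country, start_value, end_value, change) for each country
    that has a value in both years, optionally filtered by continent."""
    by_country = defaultdict(dict)
    for row in rows:
        if continent is not None and row["continent"] != continent:
            continue  # same role as the continent filter
        if row["year"] in (start_year, end_year):  # same role as the year filter
            by_country[row["country"]][row["year"]] = row["lifeExp"]
    result = []
    for country, values in sorted(by_country.items()):
        if start_year in values and end_year in values:
            start, end = values[start_year], values[end_year]
            result.append((country, start, end, round(end - start, 2)))
    return result


# Placeholder rows for illustration only (not real data).
rows = [
    {"country": "Country A", "continent": "Europe", "year": 1952, "lifeExp": 60.0},
    {"country": "Country A", "continent": "Europe", "year": 1977, "lifeExp": 70.0},
    {"country": "Country A", "continent": "Europe", "year": 2007, "lifeExp": 78.0},
    {"country": "Country B", "continent": "Europe", "year": 1952, "lifeExp": 65.0},
    {"country": "Country B", "continent": "Europe", "year": 2007, "lifeExp": 80.0},
    {"country": "Country C", "continent": "Asia", "year": 1952, "lifeExp": 40.0},
    {"country": "Country C", "continent": "Asia", "year": 2007, "lifeExp": 65.0},
]

for country, start, end, change in slope_endpoints(rows, 1952, 2007, "Europe"):
    print(f"{country}: {start} -> {end} ({change:+})")
```

Each returned tuple corresponds to one line of the slope chart: its left end, its right end, and the change between them.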

Exercise 2

Please connect the co2_long.xlsx data and visualize the top 10 countries with the highest CO2 emissions in 2000 and 2017 based on their rankings. Your plot should look like this.

What Makes Your Visualization Look Better

The style of any graph can be improved in many respects, such as position, text, shape, color, and scale, so that you can tell the story of your data better. Today, we mainly discuss the use of color and text to produce a good plot.

Color

Choosing the right color for your visual is a challenging process, and the following tips will help you with it. Before we start, it helps to be familiar with the following three terms.

1. Hue ranges from 0° to 360°; that’s your typical color wheel

2. Saturation ranges from 0% (gray) to 100% (super duper colorful!!)

3. Brightness/Value ranges from 0% (black) to 100% (the actual color)
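If you would like to experiment with these three terms outside Tableau, the sketch below converts a hue, saturation and brightness triple into a hex color code using Python’s standard `colorsys` module. The helper name `hsv_to_hex` is ours and only serves as an illustration.

```python
import colorsys


def hsv_to_hex(hue_deg, saturation_pct, value_pct):
    """Convert hue (0-360 degrees), saturation (0-100%) and
    brightness/value (0-100%) to a hex color string."""
    r, g, b = colorsys.hsv_to_rgb(
        (hue_deg % 360) / 360.0,   # colorsys expects all inputs in [0, 1]
        saturation_pct / 100.0,
        value_pct / 100.0,
    )
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


print(hsv_to_hex(0, 100, 100))   # fully saturated, fully bright red: #ff0000
print(hsv_to_hex(0, 0, 50))      # saturation 0% gives a gray: #808080
print(hsv_to_hex(0, 100, 0))     # brightness 0% gives black: #000000
print(hsv_to_hex(210, 40, 80))   # a toned-down blue
```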

  • Use color when you should, not when you can: Most data should be in neutral colors like grey with bright colors reserved for directing attention to significant or atypical data points.

  • Broaden your understanding of colors: The color palette does not have only red, green, blue, yellow, orange or purple. There are more colors than that.

  • Utilize color to group related data points: Color can be used to group data points of similar value and to render the extent of this similarity using one of two kinds of color palette: sequential and divergent.

    A sequential color palette is composed of varying intensities of a single hue at uniform saturation. The difference in luminance between adjacent colors corresponds to the difference between the data values they render.

    A divergent color palette is made of two sequential color palettes (each of a different hue) placed next to each other with an inflection point in the middle. It is helpful when visualizing data that varies in two different directions.

  • Don’t dance all over the color wheel: When you use color for displaying unrelated categories, you should select a few hues and their neighbors. You can use Adobe Color or Color Calculator for this purpose.

  • Categorical colors have few easily discernible bins: While the use of different colors can help distinguish between different data points, a chart should comprise at most 6–8 distinct color categories for each of them to be readily distinguishable.

  • Use saturation and lightness to make your hues work: A change in saturation or brightness has a significant effect on your visuals and can even produce what looks like a new color. So, adjust the saturation and lightness of your colors, then check whether the result looks better.

  • Use warm colors and blue: Data visualization designers love to use yellow/orange/red and blue, since these warm colors and blue are super versatile for categories. So, when in doubt, use an orange/red with blue.

  • When using green, make it a yellow or blue one: Forest green, the green you are most familiar with, is rarely used in well-designed graphics. Instead, you can lighten and desaturate green considerably and make it a bit yellow or a bit blue. This also helps colorblind readers distinguish between categories.

  • Avoid pure colors: Visuals that use pure colors look unnatural and unpleasant to the reader’s eye. You can improve them by toning down the saturation and making your colors darker, that is, by reducing brightness.

  • Make your colors similarly colorful

  • Avoid too little and too much contrast with your background

  • Choose a background that’s desaturated enough: Coloring your background might be a good idea, but be aware that it easily distracts from your data and limits the number of colors you can use. That’s why you should use a desaturated color if you choose to do so.
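The sequential and divergent palettes described in the tips above can be sketched with Python’s standard `colorsys` module. This is a minimal illustration under our own assumptions: the function names, the default saturation and lightness range, and the light gray midpoint are choices made for this example, not values prescribed by Tableau or by any palette standard.

```python
import colorsys


def to_hex(rgb):
    """Format an (r, g, b) tuple with components in [0, 1] as a hex string."""
    return "#" + "".join(f"{round(c * 255):02x}" for c in rgb)


def sequential_palette(hue_deg, n, saturation=0.6, lightest=0.9, darkest=0.3):
    """n colors of one hue at fixed saturation, from light to dark."""
    if n < 2:
        raise ValueError("n must be at least 2")
    hue = (hue_deg % 360) / 360.0
    colors = []
    for i in range(n):
        # Lightness falls in equal steps, so equal data steps look equal.
        lightness = lightest + (darkest - lightest) * i / (n - 1)
        colors.append(to_hex(colorsys.hls_to_rgb(hue, lightness, saturation)))
    return colors


def divergent_palette(low_hue_deg, high_hue_deg, n_per_side, midpoint="#f2f2f2"):
    """Two sequential palettes joined at a neutral midpoint:
    dark low hue ... light | midpoint | light ... dark high hue."""
    low = sequential_palette(low_hue_deg, n_per_side)
    high = sequential_palette(high_hue_deg, n_per_side)
    return low[::-1] + [midpoint] + high


print(sequential_palette(210, 5))    # five blues, light to dark
print(divergent_palette(210, 25, 3)) # blue side, neutral middle, orange side
```

Keeping the saturation moderate (0.6 here) also follows the tip about avoiding pure colors.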

Text

Text may be the most underrated element in any data visualization, but it makes your visual better and nicer when used properly.

First of all, you should use text to show information where readers need it.

  • Label directly

  • Repeat the units your data is measured in

  • Remind people what they are looking at in tooltips

  • Move the axis ticks where they are needed

You can also use text objects in the visual to improve the readability of your plot.

  • Use a font that’s easy to read: People feel most comfortable reading text that is sans-serif, regular, in sentence case, neither overly narrow nor wide, larger than 12 px, and black.

  • Lead the eye with font sizes, styles, and colors: Design is all about deciding what readers should see first, second, third, and last. The biggest and boldest text with the highest contrast against the background should be reserved for the most important information. That text will be read first. Often, that’s the title of your visualization. Small, thin, and gray text should be reserved for less important information, like the description or source.

  • Don’t make your readers turn their heads

  • Choose a suitable number format: What’s true for your title and description is also true for numbers. Do not add unnecessary precision when showing numbers.
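As a small illustration of the last tip, the Python sketch below formats numbers for labels without unnecessary precision. The helper names `format_value` and `compact`, the thresholds, and the sample numbers are assumptions made for this example.

```python
def format_value(value, decimals=0, unit=""):
    """Format a number with thousands separators, a fixed number of
    decimals, and an optional unit."""
    text = f"{value:,.{decimals}f}"
    return f"{text} {unit}".strip()


def compact(value):
    """Abbreviate large numbers with K/M/B suffixes and one decimal."""
    for threshold, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= threshold:
            return f"{value / threshold:.1f}{suffix}"
    return f"{value:.0f}"


print(format_value(45678.912))            # 45,679
print(format_value(1234.5678, 1, "USD"))  # 1,234.6 USD
print(compact(1_300_000))                 # 1.3M
```

Repeating the unit in the label, as `format_value` allows, also follows the earlier tip about repeating the units your data is measured in.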

Consider that this bar plot represents the GDP of G20 countries based on the latest available data, but it does not look good, huh? Let’s improve it step by step.

In this redesign, our purpose is to compare the GDP of Turkey with the G20 average and the EU average, as well as with other countries.

Connect the “gdp.xlsx” dataset using the Connect panel in Tableau. Then, create a bar plot using the “LOCATION” and “Value” variables.

Sort the bars from highest to lowest, and don’t rotate the chart. Remember that you should not make readers turn their heads.

Make Turkey and the averages visible by highlighting. To highlight an element in Tableau, select the element, then right click and choose group.

Change the default colors to keep the color scale consistent. Assign the most eye-catching color to the element that you would like to highlight.

Remove the axis names.

Add a title that is comfortable to read. We choose Arial Rounded MT Bold in black at size 20, since it is a member of the sans-serif family. You can see the list of fonts in this family from here.

Make the country names more visible and add a label showing the unit of measurement. To change the font of the axis text, select Format, then Font. You can see the options on the left panel.

Remember that we don’t want unnecessary precision when we show numbers in a visual. However, the values here have 3 decimals. There are several ways to solve this; one of them is to copy the Value column and convert the copy to an integer.

When you add the values as labels, the axis that represents these values becomes redundant. In this case, removing the axis improves the appearance of the visual.

Removing grid lines generally makes your visual better. To remove grid lines in Tableau, select Format and then Lines.

As a last step, we fill the background with a desaturated color. To change the background, select Format and then Shading. You can fill the background from the Worksheet menu under the Sheet options in the pane that opens on the left.

Finally, you can also add a subtitle from the title menu and remove the axis lines from the Lines menu. (See the grid line step.)

Here is the final view.
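Two of the data-side steps of this redesign, sorting the bars from highest to lowest and giving only the highlighted elements an eye-catching color, can be sketched in Python as follows. The function name `prepare_bars`, the two hex colors, the labels, and the values are all placeholders chosen for illustration; they are not taken from gdp.xlsx.

```python
def prepare_bars(values, highlight, highlight_color="#d95f02", neutral_color="#bdbdbd"):
    """Sort categories from highest to lowest and assign a bright color
    only to the highlighted ones; everything else stays neutral gray."""
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (name, value, highlight_color if name in highlight else neutral_color)
        for name, value in ordered
    ]


# Placeholder labels and values for illustration only (not real GDP figures).
sample = {"Country X": 30, "TUR": 20, "EU average": 25, "Country Y": 10}

for name, value, color in prepare_bars(sample, {"TUR", "EU average"}):
    # A text stand-in for a horizontal bar, so labels are never rotated.
    print(f"{name:<12} {'#' * value:<30} {value:>3}  {color}")
```

Using gray for everything except the highlighted bars follows the tip to use color when you should, not when you can.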

Exercise 3

Please connect the gapminder.xlsx data. Consider 10 selected countries, including Turkey, and their life expectancy and GDP values. Design an appropriate plot to compare these 10 countries in terms of either their life expectancy or their GDP. It is your choice!

Sources