Thursday, December 16, 2010

Gnuplot - Stacked Histograms

Gnuplot has no built-in pie chart style, so stacked histograms are an alternative.
In fact, I think stacked histograms are even better: you can put the bars next to each other, which makes them easier to compare than a number of pie charts.
A single pie chart might make sense, but in practice the question is usually how the current chart compares to a previous one.

Here is a simple example of how to generate stacked histograms (available in gnuplot since version 4.1).
For fancier examples, see the gnuplot histogram demos.

Consider this data file (call it stackedhisto.dat):
year foo bar rest
1900 20 10 20
2000 20 30 10
2100 20 10 10
The file has 1 header row and 3 rows of data.
For each year we have measured 3 values (foo, bar and rest), which we want to show in two different graphs.

The first graph shows the stacked histogram with the nominal values of the data, i.e. the height of the first bar is 50 (=20+10+20).

The second graph shows the percentage distribution, i.e. every bar is scaled to a height of 100.
The same nominal value of 20 for foo in graph 1 becomes 40%, 33.3% and 50% in graph 2.
One bar of this type of graph is often drawn as a pie chart; rather than comparing 3 pie charts (one per year), here we have 3 bars in one graph, which is much easier to compare.
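
The percentages quoted above can be checked at the gnuplot prompt. This is only a sanity check with the foo values typed in by hand, not part of the plotting script. Note the 100.0: with plain integer constants gnuplot does integer division, which would truncate the result. In the plot commands this is not a problem, because column values such as $2 are read as real numbers.

```gnuplot
# Share of foo (20 in every year) in each year's total.
# 100.0 forces floating point arithmetic on the constants.
print sprintf("1900: %.1f", 100.0*20/(20+10+20))
print sprintf("2000: %.1f", 100.0*20/(20+30+10))
print sprintf("2100: %.1f", 100.0*20/(20+10+10))
```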

The gnuplot code

#
# Stacked histograms
#
set term png size 300,300
set output 'stackedhisto.png'
set title "Stacked histogram\nTotals"

# Where to put the legend
# and what it should contain
set key invert reverse Left outside
set key autotitle columnheader

set yrange [0:100]
set ylabel "total"

# Define plot style 'stacked histogram'
# with additional settings
set style data histogram
set style histogram rowstacked
set style fill solid border -1
set boxwidth 0.75

# We are plotting columns 2, 3 and 4 as y-values,
# the x-ticks are coming from column 1
plot 'stackedhisto.dat' using 2:xtic(1) \
    ,'' using 3 \
    ,'' using 4


# New graph
# We keep the settings from above except:
set output 'stackedhisto1.png'
set title "Stacked histogram\n% totals"
set ylabel "% of total"

# We are plotting columns 2, 3 and 4 as y-values,
# the x-ticks are coming from column 1
# Because the y-values are now expressions, we also need
# to specify the key titles explicitly via 't 2' and so on.
plot 'stackedhisto.dat' using (100*$2/($2+$3+$4)):xtic(1) t 2\
    ,'' using (100*$3/($2+$3+$4)) t 3\
    ,'' using (100*$4/($2+$3+$4)) t 4
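
The three nearly identical plot clauses above can also be written as a loop. The following is a minimal sketch, not the script used for the graphs in this post. It assumes a gnuplot version with `plot for` iteration (4.4 or later, as far as I know) and the stackedhisto.dat file from above. The output name stackedhisto2.png is made up for this illustration.

```gnuplot
# Sketch: the percentage graph again, looping over columns 2 to 4.
# Assumes gnuplot 4.4 or later and stackedhisto.dat in the current directory.
set term png size 300,300
set output 'stackedhisto2.png'   # hypothetical file name
set title "Stacked histogram\n% totals"

set key invert reverse Left outside
set key autotitle columnheader

set yrange [0:100]
set ylabel "% of total"

set style data histogram
set style histogram rowstacked
set style fill solid border -1
set boxwidth 0.75

# column(i) is the i-th value of the current row; ($2+$3+$4) is the row total.
# columnheader(i) takes the key title from the header row.
plot for [i=2:4] 'stackedhisto.dat' \
    using (100*column(i)/($2+$3+$4)):xtic(1) title columnheader(i)
```

With more data columns only the loop range and the row total need to change, which is less error-prone than copying one clause per column.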

The generated graphs
