Thursday, January 31, 2013

Pareto charts with Google charts

As a follow-up to my recent post about histograms in Google charts I thought it would be nice to have a Pareto chart too.

I often use Pareto charts only in the very simple sense of showing bars in descending order. If one also wants to include the ascending cumulative percentage line, there are two challenges:

  • for a given data set one needs to calculate the accumulated percentages
  • the chart needs to show two graphs (columns and line) and two axes (left for bars, right for percentages)

    The first challenge needs to be coded; the second can be handled with so-called Combo Charts.

    Assume we have the following data represented as a 2-dimensional array in JavaScript. The rows do not need to be ordered since they will be reordered anyway. The column titles are kept in a separate array, with an additional title for the percentages.

    var dataSet1 = [
        [ 'A', 25 ],
        [ 'B', 125 ],
        [ 'C', 35 ],
        [ 'D', 10 ],
        [ 'E', 12 ],
        [ 'F', 70 ],
        [ 'G', 60 ]
    ];
    var dataTitle = [ 'Category', 'Size', 'Pctg' ];

    The Pareto chart looks like this:

    How did I get there?

    There is a function which

  • first sorts the array according to column 2 in descending order (that is done by supplying the JavaScript built-in 'sort' with a sorting function of our own)
  • then calculates the total of column 2, calculates the accumulated percentages and puts them into a new column in each row
  • prepends the data array with its title row

    function paretorize() {
      // Sort the dataSet array using column 2 (descending)
      dataSet1.sort( function(a,b) {
        return b[1] - a[1];
      });
      // Calculate the total of column 2
      var sum = 0;
      for(var row=0; row<dataSet1.length; row++) {
        sum += dataSet1[row][1];
      }
      // Calculate the accumulating percentages
      // and add them into a new column in each row
      var accum = 0;
      for(row=0; row<dataSet1.length; row++) {
        dataSet1[row].push( accum+100*dataSet1[row][1]/sum );
        accum = dataSet1[row][2];
      }
      // Add the title row at the beginning of dataSet
      // ('unshift' is not supported in IE8 and earlier)
      dataSet1.unshift( dataTitle );
    }
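
The same three steps (sort, total, accumulate) can also be sketched as a function that takes the data as a parameter and returns a new array instead of changing dataSet1 in place. The name paretoRows and its two parameters are hypothetical and not part of the original script; this is only one possible variant.

```javascript
// Hypothetical helper, not from the original script: builds the Pareto
// table from `rows` ([label, value] pairs) and a `titles` row without
// modifying the input array.
function paretoRows(rows, titles) {
  // Copy every row so the caller's data stays untouched
  var sorted = [];
  for (var i = 0; i < rows.length; i++) {
    sorted.push(rows[i].slice());
  }

  // Sort by the value column in descending order
  sorted.sort(function (a, b) {
    return b[1] - a[1];
  });

  // Total of the value column
  var sum = 0;
  for (var j = 0; j < sorted.length; j++) {
    sum += sorted[j][1];
  }

  // Running total expressed as a percentage of the grand total
  var running = 0;
  for (var k = 0; k < sorted.length; k++) {
    running += sorted[k][1];
    sorted[k].push(100 * running / sum);
  }

  // Title row first, then the data rows
  return [titles].concat(sorted);
}
```

For the sample data the total is 337, so the first three accumulated percentages come out at roughly 37.1 (B), 57.9 (B+F) and 75.7 (B+F+G). Because the input is copied, calling this variant twice does not add a second percentage column or a second title row, which the in-place version would do.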

    Now that the dataSet array has been constructed it can be fed to Google Charts. A few options need to be set to take care of the second graph and axis: vAxes becomes an array of two axes with their own titles (and other attributes if needed), seriesType sets the standard chart type, and series overrides it for the line chart.

    function drawChart() {
      var data = google.visualization.arrayToDataTable( dataSet1 );
      var options = {
        title:  'Pareto chart',
        legend: { position: 'none' },         // no legend
        // Create two vertical axes taking their titles from the first row
        vAxes: [
           { title: dataSet1[0][1], minValue: 0 },
           { title: dataSet1[0][2], minValue: 0, maxValue: 100 }
        ],
        hAxis:  { title: dataSet1[0][0] },
        backgroundColor: {strokeWidth: 2 },   // to get a nice box
        seriesType: "bars",                   // the standard chart type
        // the second data column should be of type 'line' and should be associated with the second vertical axis
        series: {1: {type: "line", targetAxisIndex: 1 }}
      };
      // Note: this calls a ComboChart !!!
      var chart = new google.visualization.ComboChart(document.getElementById('chart_div'));
      chart.draw(data, options);
    }

    The same comments about chart width, number of data rows and column size apply as in the histogram post.

    Putting it all together

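    The code snippets above are JavaScript and need to be embedded in an HTML page as follows in order to display the chart.
    <script type="text/javascript" src=""></script>
    <script type="text/javascript">
    google.load("visualization", "1", {packages:["corechart"]});
    ... put here all three code snippets from above ...
    // Assumption: the call that starts everything is not shown above;
    // something like this runs both functions once the library has loaded
    google.setOnLoadCallback(function() { paretorize(); drawChart(); });
    </script>
    <div id="chart_div" style="width: 400px; height: 300px;"></div>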

    One thing to note here: if you create several charts on one HTML page, each of them needs its own id and its own drawing function, and each has to be set up separately, for example:

    function drawChartA() {
      // ComboChart or any other chart type of course
      var chart = new google.visualization.ComboChart(document.getElementById('chart_divA'));
    }
    function drawChartB() {
      var chart = new google.visualization.ComboChart(document.getElementById('chart_divB'));
    }
    function drawChartC() {
      var chart = new google.visualization.ComboChart(document.getElementById('chart_divC'));
    }
    and later place each chart's div wherever needed
    <div id="chart_divA" style="width: 400px; height: 300px;"></div>
    <div id="chart_divB" style="width: 400px; height: 300px;"></div>
    <div id="chart_divC" style="width: 400px; height: 300px;"></div>
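
As an alternative to writing one function per chart, the element id, data and options can be passed as parameters to a single function. This is only a sketch: drawComboChart is a hypothetical name not used in the original script, and it assumes that the Google Charts library has already been loaded and that the data arrays include their title row.

```javascript
// Hypothetical helper, not from the original script: one drawing
// function shared by all charts on the page.
// Assumes the global `google.visualization` namespace is loaded and
// that `dataRows` starts with a title row.
function drawComboChart(elementId, dataRows, options) {
  var data = google.visualization.arrayToDataTable(dataRows);
  var chart = new google.visualization.ComboChart(
    document.getElementById(elementId)
  );
  chart.draw(data, options);
  return chart;
}

// Possible usage once the library has loaded (dataSetA and optionsA
// are placeholders for your own data and options):
//   drawComboChart('chart_divA', dataSetA, optionsA);
//   drawComboChart('chart_divB', dataSetB, optionsB);
```

Each div still needs its own id; only the drawing code is shared.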

    1. Very useful post. But can you please post the complete code as an HTML file? I tried it and it is not working.

      1. Assuming that you are familiar with putting scripts into an HTML page I added the "script" wrappers in the 'Putting All Together' section, hope this helps.

    2. Worked for me, thanks. This is the only place where I found something about Pareto charts with Google Charts.
