Data visualization: bar plots of multiple columns for grouped data in a Pandas dataframe

We define our initial dataframe and import all necessary libraries with the following sample code:
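A minimal sketch of such a setup could look like the following. Only the column names Col1, Col4 and Col5 come from the text; the group labels, the extra columns Col2 and Col3, the number of rows and the random values are assumptions made for illustration. Col5 is deliberately given a much larger scale than Col4.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sample data are reproducible (assumption: random data).
rng = np.random.default_rng(0)
n_rows = 100  # hypothetical size

df = pd.DataFrame({
    "Col1": rng.choice(["A", "B", "C"], size=n_rows),  # group labels (assumed)
    "Col2": rng.integers(0, 10, size=n_rows),          # filler column (assumed)
    "Col3": rng.random(n_rows),                        # filler column (assumed)
    "Col4": rng.random(n_rows),                        # values in [0, 1)
    "Col5": rng.random(n_rows) * 1000,                 # values in [0, 1000)
})

# Show the first 10 rows.
print(df.head(10))
```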

First 10 elements of our dataframe:

Now we group our data using Col1 values as group label, selecting only columns Col4 and Col5 and taking the mean value of all values belonging to the same group:
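The grouping step can be sketched as below, using a tiny hand-written dataframe (the values are hypothetical) so that the group means are easy to check by hand.

```python
import pandas as pd

# Small illustrative dataframe; the values are made up for this sketch.
df = pd.DataFrame({
    "Col1": ["A", "A", "B", "B", "C"],
    "Col4": [0.2, 0.4, 0.6, 0.8, 0.5],
    "Col5": [100.0, 300.0, 500.0, 700.0, 900.0],
})

# Group by Col1, keep only Col4 and Col5, and average within each group.
grouped = df.groupby("Col1")[["Col4", "Col5"]].mean()
print(grouped)
```

The result is indexed by the Col1 labels, with one row per group and one column per selected variable.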

Resulting dataframe:

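If we want to plot it as a bar chart, we also have to consider the different scales of the columns: when one column has much larger values than the other, the bars of the smaller one become barely visible. Indeed, this would be the plot of the previous dataframe: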
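A sketch of a plotting call that produces this kind of chart is shown here. The grouped values are hypothetical, and the `Agg` backend is selected only so the snippet runs without a display; this is one possible way to draw the chart, not necessarily the code used for the original figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a window
import pandas as pd

# Hypothetical group means with very different scales per column.
grouped = pd.DataFrame(
    {"Col4": [0.3, 0.7, 0.5], "Col5": [200.0, 600.0, 900.0]},
    index=pd.Index(["A", "B", "C"], name="Col1"),
)

# One cluster of bars per group, one bar per column.
ax = grouped.plot(kind="bar")
ax.set_ylabel("mean value")
fig = ax.get_figure()
# Call fig.savefig("some_name.png") or matplotlib.pyplot.show() to view it.
```

With these numbers the y-axis extends to about 900, so the Col4 bars (all below 1) are practically invisible.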

The solution I’m going to introduce is quite simple and consists of a data transformation: each value is expressed as the ratio between the value of the current class and the sum of the values of all classes (in the same column):
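As a minimal sketch of this transformation, assuming the grouped means are stored in a dataframe called `grouped` (the name and the values are hypothetical), each column can be divided by its own sum:

```python
import pandas as pd

# Hypothetical group means, indexed by class label.
grouped = pd.DataFrame(
    {"Col4": [0.3, 0.7, 0.5], "Col5": [200.0, 600.0, 900.0]},
    index=pd.Index(["A", "B", "C"], name="Col1"),
)

# grouped.sum() gives one total per column; the division is aligned on
# column names, so every value is divided by the total of its own column.
normalized = grouped / grouped.sum()
print(normalized)
```

After this step both columns lie between 0 and 1, so `normalized.plot(kind="bar")` shows them on a comparable scale.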

Resulting in:

where the values of each column add up to 1 across all classes. Each value measures the contribution of a class to the column total, so classes can be compared on any column regardless of its original scale.

We plot the dataframe again using the same code as before and obtain:


