
Aggregate Groups In Python Pandas And Spit Out Percentage From A Certain Count

I am trying to figure out how to aggregate groups in a pandas DataFrame by computing a percentage and a sum in new columns. For example, in the following data frame, I have c

Solution 1:

You could use groupby/agg to perform the summing and counting:

result = df.groupby(['A']).agg({'C': lambda x: x.sum()/x.count(), 'D':'sum'})

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'A' : ['foo', 'foo', 'foo', 'foo',
            'bar', 'bar', 'bar', 'bar'],
     'B' : ['one', 'one', 'two', 'three',
            'two', 'two', 'one', 'three'],
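     'C' : [1, np.nan, 1, 2, np.nan, 1, 1, 2],
     'D' : [2, '', 1, 1, '', 2, 2, 1]})
# Treat empty strings in D as missing values; assign the result back
# rather than calling replace(..., inplace=True) on a column selection
df['D'] = df['D'].replace('', np.nan)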

result = df.groupby(['A']).agg({'C': lambda x: x.sum()/x.count(), 'D':'sum'})
print(result)

yields

            C  D
A               
bar  1.333333  5
foo  1.333333  4

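Because both sum() and count() skip missing values, the ratio in the lambda should equal the group mean of the non-missing entries. The following is a minimal sketch of that equivalence, assuming pandas 0.25 or later for named aggregation. The names by_lambda and by_mean are illustrative, column B is left out because it does not enter the aggregation, and D is built with NaN directly instead of empty strings.

```python
import numpy as np
import pandas as pd

# Same A, C and D values as the example above, with D already numeric
df = pd.DataFrame(
    {'A': ['foo', 'foo', 'foo', 'foo',
           'bar', 'bar', 'bar', 'bar'],
     'C': [1, np.nan, 1, 2, np.nan, 1, 1, 2],
     'D': [2, np.nan, 1, 1, np.nan, 2, 2, 1]})

# Lambda form from the answer: sum of non-missing values
# divided by the number of non-missing values
by_lambda = df.groupby('A').agg({'C': lambda x: x.sum() / x.count(),
                                 'D': 'sum'})

# Named aggregation: 'mean' also skips NaN by default
by_mean = df.groupby('A').agg(C=('C', 'mean'), D=('D', 'sum'))

print(by_lambda)
print(by_mean)
```

If the two frames agree on your data, the built-in 'mean' may be preferable, since string aggregation names generally run faster than a Python lambda.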