
How Do I Use Pandas Groupby Function To Apply A Formula Based On The Groupby Value

My question may be a little confusing, so let me explain. I have a dataframe of information that I would like to group by the unique order id that will produce the following column

Solution 1:

Writing a named function and using apply works:

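import pandas as pd

def func(group):
    # Total quantity for this order
    sum_ = group.qty.sum()
    # Sum of the per-row csv/qty ratios for this order
    es = (group.csv / group.qty).sum()
    return pd.Series([sum_, es], index=['qty', 'es'])

# One row per order, with columns 'qty' and 'es'
trades.groupby('ordrefno').apply(func)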

Result:

            qty     es
ordrefno               
983375   -10000 -0.0015
984702      100  0.0003
984842   -25100 -0.0008

Solution 2:

Assuming you want the ratio of the sums rather than the sum of the ratios (the wording of the question suggests this, but the function in your code would give the sum of the ratios if applied to the df), I think the cleanest way to do this is in two steps: first sum the two columns, then divide:

agg_td = trades.groupby('ordrefno')[['qty', 'csv']].sum()
agg_td.eval('es = csv/qty')
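To make the difference between the two quantities concrete, here is a minimal sketch on made-up data. The values in `trades` are hypothetical (the original data is not shown in the question); only the column names `ordrefno`, `qty` and `csv` come from the posts above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
trades = pd.DataFrame({
    "ordrefno": [1, 1, 2],
    "qty": [100, 400, 200],
    "csv": [50.0, 100.0, 50.0],
})

# Step 1: sum both columns per order.
agg_td = trades.groupby("ordrefno")[["qty", "csv"]].sum()

# Step 2: divide the sums. With an assignment expression, eval returns
# a new DataFrame that has the extra 'es' column; agg_td is unchanged.
ratio_of_sums = agg_td.eval("es = csv / qty")

# For comparison: the sum of the per-row ratios, grouped by order.
sum_of_ratios = (trades["csv"] / trades["qty"]).groupby(trades["ordrefno"]).sum()

print(ratio_of_sums)   # order 1: 150 / 500
print(sum_of_ratios)   # order 1: 50/100 + 100/400
```

For order 1 the two results differ (150/500 versus 50/100 + 100/400), so it is worth deciding which one you actually need. Note that `eval` does not modify `agg_td` unless you assign the result back.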

You could also write a function that combines the two columns and pass it to the groupby apply method:

es = trades.groupby('ordrefno').apply(lambda df: df.csv.sum() / df.qty.sum()) 

But this only gets you the 'es' column. The problem with using agg is that the functions in the dict are column-specific, whereas here you need to combine two columns.
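As a rough illustration of that limitation, the sketch below uses named aggregation (assuming pandas 0.25 or later): each output column is built from exactly one input column, so the division has to happen after the aggregation. The data and the intermediate name `csv_sum` are made up for this example.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
trades = pd.DataFrame({
    "ordrefno": [1, 1, 2],
    "qty": [100, 400, 200],
    "csv": [50.0, 100.0, 50.0],
})

# Named aggregation: every output column is tied to a single input column.
sums = trades.groupby("ordrefno").agg(
    qty=("qty", "sum"),
    csv_sum=("csv", "sum"),  # 'csv_sum' is an illustrative name
)

# The cross-column step (ratio of the sums) is done after aggregating.
result = sums.assign(es=sums["csv_sum"] / sums["qty"])[["qty", "es"]]

print(result)
```

This gives both 'qty' and 'es' in one frame without apply, at the cost of a separate step for the division.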
