How Can I Sum Two Different Columns At Once Where One Contains Decimal Objects In Pandas?

I have a dataframe in which I want to aggregate the sums of two different columns. Here is a df.head(5) of my original dataframe. price name quantity transaction_amo

Solution 1:

When I run

df.groupby(['pk', 'name']).sum()

I get

              price  quantity  transaction_amount
pk name                                          
48 Product 1    2.0         5                 5.0
63 Product 2    3.0         6                 6.0

All three columns are summed here, which suggests that in your dataframe price and transaction_amount have object dtype rather than a numeric dtype.
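One quick way to confirm this is to look at the dtypes. The sketch below uses made-up sample data (not the asker's actual frame) and assumes the price column holds decimal.Decimal values:

```python
from decimal import Decimal

import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
df = pd.DataFrame({
    "pk": [48, 48, 63],
    "name": ["Product 1", "Product 1", "Product 2"],
    "price": [Decimal("1.0"), Decimal("1.0"), Decimal("1.0")],
    "quantity": [1, 4, 2],
})

# Decimal values are stored as generic Python objects, not as a numeric dtype.
print(df.dtypes)

# Collect the columns whose dtype is object; a numeric sum may skip or reject these.
object_columns = [col for col in df.columns if df[col].dtype == object]
print(object_columns)
```

If price shows up as object here, that would explain why it goes missing from a plain groupby sum.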

Solution 2:

Since you are using decimal.Decimal objects, numpy.sum won't handle them in this aggregation. So, simply defer to the built-in sum for those columns:

In [18]: df
Out[18]:
   pk price       name  quantity transaction_amount
0  48   1.0  Product 1         1                1.0
1  48   1.0  Product 1         4                4.0
2  63   1.0  Product 2         2                2.0
3  63   1.0  Product 2         3                3.0
4  63   1.0  Product 2         1                1.0

In [19]: df.groupby(['pk', 'name']).aggregate({
    ...:     "quantity":np.sum,
    ...:     "price":sum,
    ...:     "transaction_amount":sum
    ...: })
Out[19]:
             price  quantity transaction_amount
pk name
48 Product 1   2.0         5                5.0
63 Product 2   3.0         6                6.0

Note, this will be slow, but it's the price you have to pay for using object dtype columns.
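For anyone who wants to run this outside an IPython session, here is a self-contained sketch of the same idea. The data mirrors the frame shown above; decimal_sum is a hypothetical helper name, not something from pandas. The second half shows one possible way around the slowness, converting to float first, which assumes you can live with binary floating-point rounding.

```python
from decimal import Decimal

import pandas as pd

df = pd.DataFrame({
    "pk": [48, 48, 63, 63, 63],
    "price": [Decimal("1.0")] * 5,
    "name": ["Product 1", "Product 1", "Product 2", "Product 2", "Product 2"],
    "quantity": [1, 4, 2, 3, 1],
    "transaction_amount": [Decimal(v) for v in ("1.0", "4.0", "2.0", "3.0", "1.0")],
})


def decimal_sum(values):
    """Hypothetical helper: add Decimal values with Python's built-in sum."""
    return sum(values, Decimal("0"))


# Exact Decimal arithmetic, evaluated group by group in Python (slow on big frames).
result = df.groupby(["pk", "name"]).agg({
    "quantity": "sum",
    "price": decimal_sum,
    "transaction_amount": decimal_sum,
})
print(result)

# Alternative (assumption: float precision is acceptable): convert, then sum natively.
as_float = df.assign(
    price=df["price"].astype(float),
    transaction_amount=df["transaction_amount"].astype(float),
)
fast = as_float.groupby(["pk", "name"])[["price", "quantity", "transaction_amount"]].sum()
print(fast)
```

The first result keeps Decimal values in the output; the second trades exactness for a numeric dtype.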

Solution 3:

You can specify the columns to sum like this.

df.groupby(['pk','name'])[['quantity','transaction_amount']].sum()
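Note that the column selection takes a list inside the brackets; recent pandas versions reject a bare tuple of column names there. A minimal sketch, using made-up data and assuming transaction_amount has already been converted to float:

```python
import pandas as pd

# Hypothetical sample data; transaction_amount is plain float here.
df = pd.DataFrame({
    "pk": [48, 48, 63, 63, 63],
    "name": ["Product 1", "Product 1", "Product 2", "Product 2", "Product 2"],
    "quantity": [1, 4, 2, 3, 1],
    "transaction_amount": [1.0, 4.0, 2.0, 3.0, 1.0],
})

# The inner list picks which columns to aggregate after grouping.
totals = df.groupby(["pk", "name"])[["quantity", "transaction_amount"]].sum()
print(totals)
```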