Include Missing Group Keys As NaN In Pandas Groupby Output
I have a dataframe in pandas. test_df = pd.DataFrame({'date': ['2018-12-28', '2018-12-28', '2018-12-29', '2018-12-29', '2018-12-30', '2018-12-30'], 'transact
Solution 1:
This is easy if you convert "transaction" to a categorical column before grouping:
df.transaction = pd.Categorical(df.transaction)
df.groupby(['date','transaction','ccy']).sum().unstack(2)
                             amt
ccy                          EUR       USD
date       transaction
2018-12-28 aa                NaN  0.404488
           bb           0.459295       NaN
           cc                NaN       NaN
2018-12-29 aa                NaN  0.439354
           bb                NaN       NaN
           cc           0.429269       NaN
2018-12-30 aa                NaN       NaN
           bb                NaN  1.542451
           cc                NaN       NaN
Categories that never occur for a given date still appear in the output, filled with NaN. This generally works when the aggregation is numeric, such as sum.
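Whether those rows appear depends on how groupby treats unobserved categories, which is controlled by its `observed` parameter. Here is a minimal sketch, assuming made-up sample data (the frame in the question is truncated) and passing `observed=False` explicitly rather than relying on the default, which has been changing across pandas releases:

```python
import pandas as pd

# Hypothetical sample data, not the values from the question.
df = pd.DataFrame({
    "date": ["2018-12-28", "2018-12-28", "2018-12-29", "2018-12-29"],
    "transaction": ["aa", "bb", "aa", "cc"],
    "ccy": ["USD", "EUR", "USD", "EUR"],
    "amt": [1.0, 2.0, 3.0, 4.0],
})
df["transaction"] = pd.Categorical(df["transaction"])

# observed=False asks groupby to keep every category of a categorical key,
# including combinations that never occur in the data.
out = (
    df.groupby(["date", "transaction", "ccy"], observed=False)["amt"]
    .sum()
    .unstack("ccy")
)
print(out)
```

Depending on the pandas version, the unobserved rows may print as 0.0 rather than NaN for sum, so it is worth checking the output on your installation.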
If you don't want to modify df, this will do:
u = pd.Series(pd.Categorical(df.transaction), name='transaction')
df.groupby(['date', u,'ccy']).sum().unstack(2)
                             amt
ccy                          EUR       USD
date       transaction
2018-12-28 aa                NaN  0.429134
           bb           0.852355       NaN
           cc                NaN       NaN
2018-12-29 aa                NaN  0.541576
           bb                NaN       NaN
           cc           0.994095       NaN
2018-12-30 aa                NaN       NaN
           bb                NaN  0.744587
           cc                NaN       NaN
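If the categorical route is not an option, a similar result can be built by aggregating normally and then reindexing onto every (date, transaction) pair. This is a swapped-in technique rather than part of the answer above, and the sample data is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data, not the values from the question.
df = pd.DataFrame({
    "date": ["2018-12-28", "2018-12-28", "2018-12-29", "2018-12-29"],
    "transaction": ["aa", "bb", "aa", "cc"],
    "ccy": ["USD", "EUR", "USD", "EUR"],
    "amt": [1.0, 2.0, 3.0, 4.0],
})

# Aggregate only the combinations that actually occur, then pivot ccy into columns.
wide = df.groupby(["date", "transaction", "ccy"])["amt"].sum().unstack("ccy")

# Build every (date, transaction) pair and reindex onto it;
# pairs absent from the data become all-NaN rows.
full_index = pd.MultiIndex.from_product(
    [sorted(df["date"].unique()), sorted(df["transaction"].unique())],
    names=["date", "transaction"],
)
result = wide.reindex(full_index)
print(result)
```

Because reindex fills labels it cannot find with NaN, the missing rows here do not depend on how groupby handles categoricals.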