Include Missing Group Keys As NaN In Pandas Groupby Output
I have a dataframe in pandas. test_df = pd.DataFrame({'date': ['2018-12-28', '2018-12-28', '2018-12-29', '2018-12-29', '2018-12-30', '2018-12-30'], 'transact
Solution 1:
This is easy if you convert "transaction" to a categorical column before grouping:
df.transaction = pd.Categorical(df.transaction)
df.groupby(['date','transaction','ccy']).sum().unstack(2)
                            amt
ccy                         EUR       USD
date       transaction
2018-12-28 aa               NaN  0.404488
           bb          0.459295       NaN
           cc               NaN       NaN
2018-12-29 aa               NaN  0.439354
           bb               NaN       NaN
           cc          0.429269       NaN
2018-12-30 aa               NaN       NaN
           bb               NaN  1.542451
           cc               NaN       NaN
Missing categories in the output are represented by NaNs. This is usually possible when performing numeric aggregation.
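For reference, here is a minimal self-contained sketch of the categorical approach. The sample rows are made up, because the frame in the question is cut off. Passing observed=False is my assumption, not part of the answer above: the default for that argument has changed across pandas releases, so stating it keeps the unobserved categories in the result. Depending on the pandas version, empty groups under sum may come back as 0 rather than NaN.

```python
import pandas as pd

# Hypothetical sample data; the question's frame is truncated, so these rows are invented.
df = pd.DataFrame({
    "date": ["2018-12-28", "2018-12-28", "2018-12-29", "2018-12-29", "2018-12-30"],
    "transaction": ["aa", "bb", "aa", "cc", "bb"],
    "ccy": ["USD", "EUR", "USD", "EUR", "USD"],
    "amt": [0.5, 0.25, 1.0, 2.0, 4.0],
})

# List the categories explicitly so a key is kept even if it is absent from the data.
df["transaction"] = pd.Categorical(df["transaction"], categories=["aa", "bb", "cc"])

# observed=False asks groupby to keep categories that never occur in a group.
result = (
    df.groupby(["date", "transaction", "ccy"], observed=False)["amt"]
      .sum()
      .unstack("ccy")
)
print(result)
```

Every date now has a row for each of aa, bb and cc, whether or not that transaction occurred on that date.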
If you don't want to modify df, this will do:
u = pd.Series(pd.Categorical(df.transaction), name='transaction')
df.groupby(['date', u,'ccy']).sum().unstack(2)
                            amt
ccy                         EUR       USD
date       transaction
2018-12-28 aa               NaN  0.429134
           bb          0.852355       NaN
           cc               NaN       NaN
2018-12-29 aa               NaN  0.541576
           bb               NaN       NaN
           cc          0.994095       NaN
2018-12-30 aa               NaN       NaN
           bb               NaN  0.744587
           cc               NaN       NaN
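Another way to leave df untouched is to skip categoricals, group as usual, and reindex the result against every key combination. This is a different technique from the answer above, sketched here on made-up sample rows (the question's frame is cut off). The list of transaction keys is assumed to be known in advance.

```python
import pandas as pd

# Hypothetical sample data; the question's frame is truncated, so these rows are invented.
df = pd.DataFrame({
    "date": ["2018-12-28", "2018-12-28", "2018-12-29", "2018-12-29", "2018-12-30"],
    "transaction": ["aa", "bb", "aa", "cc", "bb"],
    "ccy": ["USD", "EUR", "USD", "EUR", "USD"],
    "amt": [0.5, 0.25, 1.0, 2.0, 4.0],
})

# Ordinary groupby: only key combinations that actually occur are returned.
summed = df.groupby(["date", "transaction", "ccy"])["amt"].sum()

# Build every (date, transaction, ccy) combination; the transaction keys are assumed known.
full_index = pd.MultiIndex.from_product(
    [sorted(df["date"].unique()), ["aa", "bb", "cc"], sorted(df["ccy"].unique())],
    names=["date", "transaction", "ccy"],
)

# reindex fills combinations that never occurred with NaN.
filled = summed.reindex(full_index).unstack("ccy")
print(filled)
```

Because reindex always fills absent labels with NaN, this variant does not depend on how a given pandas version treats empty categorical groups.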