Condensing Pandas Dataframe By Dropping Missing Elements
Problem
I have a dataframe that looks like this:
Key   Var  ID_1  Var_1  ID_2  Var_2  ID_3  Var_3
  1  True   1.0   True   NaN    NaN   5.0   True
  2  True   NaN    NaN   4.0  False   7.
Solution 1:
One simple solution is to push the NaNs in each row to the right and then drop the columns that contain NaN (axis 1):
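# result_type='broadcast' keeps the original columns on pandas >= 0.23; dropna needs axis as a keyword on pandas >= 2.0
ndf = data.apply(lambda x: sorted(x, key=pd.isnull), axis=1, result_type='broadcast').dropna(axis=1)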
Output:
   Key    Var  ID_1  Var_1  ID_2  Var_2
0    1   True     1   True     5   True
1    2   True     4  False     7   True
2    3  False     2  False     5   True
Hope it helps.
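For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It is a plain-Python variant that avoids `apply`, not the exact one-liner above. The sample frame is an assumption: the first two rows follow the question, the third row is made up to match the output, and `push_left` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Sample data (assumed): rows 1-2 follow the question, row 3 is illustrative.
data = pd.DataFrame({
    "Key":   [1, 2, 3],
    "Var":   [True, True, False],
    "ID_1":  [1.0, np.nan, 2.0],
    "Var_1": [True, np.nan, False],
    "ID_2":  [np.nan, 4.0, np.nan],
    "Var_2": [np.nan, False, np.nan],
    "ID_3":  [5.0, 7.0, 5.0],
    "Var_3": [True, True, True],
})

def push_left(df):
    """Move the non-null values of each row to the left, keeping their order."""
    # sorted() is stable and pd.isnull gives False for values, True for NaN,
    # so values keep their relative order and NaNs end up last.
    rows = [sorted(row, key=pd.isnull) for row in df.to_numpy()]
    return pd.DataFrame(rows, index=df.index, columns=df.columns)

# Columns that still contain NaN after the shift are dropped.
ndf = push_left(data).dropna(axis=1)
print(ndf)
```

Note that `dropna(axis=1)` defaults to `how="any"`, so a column is dropped as soon as one row has NaN in it. If your rows hold different numbers of values, `how="all"` keeps the partially filled columns instead.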
For roughly a 10x speedup, here is a NumPy solution from Divakar:
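def mask_app(a):
    # Start from an all-NaN array with the same shape and dtype as the input
    out = np.full(a.shape, np.nan, dtype=a.dtype)
    # True wherever a value is present
    mask = ~np.isnan(a.astype(float))
    # Sorting the mask and reversing it packs the True slots to the left of each row
    out[np.sort(mask, 1)[:, ::-1]] = a[mask]
    return out

ndf = pd.DataFrame(mask_app(data.values), columns=data.columns).dropna(axis=1)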
   Key    Var  ID_1  Var_1  ID_2  Var_2
0    1   True     1   True     5   True
1    2   True     4  False     7   True
2    3  False     2  False     5   True
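To see why the mask trick works, it may help to run it step by step on a tiny array. This is only an illustrative sketch: the float array `a` is made-up data, not the frame from the question.

```python
import numpy as np

# Made-up example data, for illustration only.
a = np.array([[1.0, np.nan, 5.0],
              [np.nan, 4.0, 7.0],
              [2.0, np.nan, np.nan]])

# True where a value is present.
mask = ~np.isnan(a)

# np.sort puts False before True in each row; reversing the columns packs
# the True slots to the left while keeping the same count per row.
left = np.sort(mask, axis=1)[:, ::-1]

# Boolean indexing walks both arrays row by row, so the k-th value of a row
# lands in the k-th left-packed slot of that same row.
out = np.full(a.shape, np.nan)
out[left] = a[mask]

print(out)
# [[ 1.  5. nan]
#  [ 4.  7. nan]
#  [ 2. nan nan]]
```

The row-by-row assignment is only valid because `left` and `mask` have the same number of True entries in every row.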