How To Find The Complement Of Two Dataframes
Given two large dataframes, is there any concise and efficient code (avoiding explicit for loops) that allows me to obtain the complement of these two dataframes? the most st
Solution 1:
Starting with this:
import pandas as pd

df1 = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                    'key2': ['K0', 'K1', 'K0', 'K1'],
                    'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3']})
df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                    'key2': ['K0', 'K0', 'K0', 'K0'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']})
# Inner join keeps key pairs present in both frames; outer join keeps all key pairs
intersection = pd.merge(df1, df2, how='inner', on=['key1', 'key2'])
union = pd.merge(df1, df2, how='outer', on=['key1', 'key2'])
print(union)
     A    B key1 key2    C    D
0   A0   B0   K0   K0   C0   D0
1   A1   B1   K0   K1  NaN  NaN
2   A2   B2   K1   K0   C1   D1
3   A2   B2   K1   K0   C2   D2
4   A3   B3   K2   K1  NaN  NaN
5  NaN  NaN   K2   K0   C3   D3
print(intersection)
    A   B key1 key2   C   D
0  A0  B0   K0   K0  C0  D0
1  A2  B2   K1   K0  C1  D1
2  A2  B2   K1   K0  C2  D2
For the union minus the intersection (the complement), keep only the rows of union that contain a NaN:
union[union.isnull().any(axis=1)]
     A    B key1 key2    C    D
1   A1   B1   K0   K1  NaN  NaN
4   A3   B3   K2   K1  NaN  NaN
5  NaN  NaN   K2   K0   C3   D3
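Note that the NaN filter above would also pick up matched rows if the original data already contained missing values. As a possible alternative (not part of the original answer), here is a minimal sketch using the `indicator=True` option of `pd.merge`, which adds a `_merge` column recording where each row came from. The name `complement` is just an illustrative choice.

```python
import pandas as pd

df1 = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                    'key2': ['K0', 'K1', 'K0', 'K1'],
                    'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3']})
df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                    'key2': ['K0', 'K0', 'K0', 'K0'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']})

# indicator=True adds a `_merge` column holding 'left_only', 'right_only' or 'both'
merged = pd.merge(df1, df2, how='outer', on=['key1', 'key2'], indicator=True)

# Keep rows whose key pair appears in only one of the two frames,
# then drop the helper column
complement = merged[merged['_merge'] != 'both'].drop(columns='_merge')
print(complement)
```

Filtering on `_merge == 'left_only'` or `_merge == 'right_only'` instead should give the rows unique to `df1` or `df2` respectively. The row order and index labels of the result may differ between pandas versions.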