
How To Find The Complement Of Two Dataframes

Given two large dataframes, is there any concise and efficient code (avoiding any explicit for loop) that allows me to obtain the complement of these two dataframes? the most st

Solution 1:

Starting with this:

import pandas as pd

# Two sample frames sharing the key columns 'key1' and 'key2'
df1 = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                    'key2': ['K0', 'K1', 'K0', 'K1'],
                    'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3']})
df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                    'key2': ['K0', 'K0', 'K0', 'K0'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']})

# Inner join keeps key pairs present in both frames; outer join keeps all of them
intersection = pd.merge(df1, df2, how='inner', on=['key1', 'key2'])
union = pd.merge(df1, df2, how='outer', on=['key1', 'key2'])

print(union)

     A    B key1 key2    C    D
0   A0   B0   K0   K0   C0   D0
1   A1   B1   K0   K1  NaN  NaN
2   A2   B2   K1   K0   C1   D1
3   A2   B2   K1   K0   C2   D2
4   A3   B3   K2   K1  NaN  NaN
5  NaN  NaN   K2   K0   C3   D3

print(intersection)

    A   B key1 key2   C   D
0  A0  B0   K0   K0  C0  D0
1  A2  B2   K1   K0  C1  D1
2  A2  B2   K1   K0  C2  D2

To get the union minus the intersection (the rows whose key pair appears in only one of the two dataframes), try this:

union[union.isnull().any(axis=1)]

     A    B key1 key2    C    D
1   A1   B1   K0   K1  NaN  NaN
4   A3   B3   K2   K1  NaN  NaN
5  NaN  NaN   K2   K0   C3   D3
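
The null check above relies on the unmatched side being filled with NaN, so it could also pick up matched rows if either input already contains missing values. A possible alternative, sketched here on the same sample frames, is to ask pd.merge to label each row's origin with indicator=True and filter on the resulting '_merge' column. The variable names merged and complement are only illustrative, and the sketch assumes a pandas version recent enough to support indicator and drop(columns=...).

```python
import pandas as pd

df1 = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                    'key2': ['K0', 'K1', 'K0', 'K1'],
                    'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3']})
df2 = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                    'key2': ['K0', 'K0', 'K0', 'K0'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']})

# indicator=True adds a '_merge' column whose values are
# 'left_only', 'right_only' or 'both'
merged = pd.merge(df1, df2, how='outer', on=['key1', 'key2'], indicator=True)

# Keep only rows whose key pair occurs in exactly one of the two frames,
# then drop the helper column
complement = merged[merged['_merge'] != 'both'].drop(columns='_merge')
print(complement)
```

Filtering on merged['_merge'] == 'left_only' or 'right_only' instead would give the rows unique to df1 or to df2 respectively.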
