
Find Different Rows Between 2 Dataframes Of Different Size With Pandas

I have 2 dataframes df1 and df2 of different size. df1 = pd.DataFrame({'A':[np.nan, np.nan, np.nan, 'AAA','SSS','DDD'], 'B':[np.nan,np.nan,'ciao',np.nan,np.nan,np.nan]}) df2 = pd.D

Solution 1:

I believe you need isin with boolean indexing:

To also omit the rows where A is NaN, chain a second condition with isnull():

# df2 changed so that column C contains no NaN
df2 = pd.DataFrame({'C':[4, 5, 5, 'SSS','FFF','KKK','AAA'], 
                    'D':[np.nan,np.nan,np.nan,1,np.nan,np.nan,np.nan]})
print (df2)
     C    D
0    4  NaN
1    5  NaN
2    5  NaN
3  SSS  1.0
4  FFF  NaN
5  KKK  NaN
6  AAA  NaN

df = df1[~(df1['A'].isin(df2['C']) | (df1['A'].isnull()))]
print (df)
     A    B
5  DDD  NaN
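
The same filter can be written as a self-contained script with the two conditions named separately, which may make the logic easier to follow. This is only a sketch: df1 is taken from the question, df2 is the changed version shown above, and the names in_c, is_missing and result are illustrative.

```python
import numpy as np
import pandas as pd

# df1 as given in the question
df1 = pd.DataFrame({'A': [np.nan, np.nan, np.nan, 'AAA', 'SSS', 'DDD'],
                    'B': [np.nan, np.nan, 'ciao', np.nan, np.nan, np.nan]})

# df2 as changed above: no NaN in column C
df2 = pd.DataFrame({'C': [4, 5, 5, 'SSS', 'FFF', 'KKK', 'AAA'],
                    'D': [np.nan, np.nan, np.nan, 1, np.nan, np.nan, np.nan]})

# True where the value in A also appears somewhere in C
in_c = df1['A'].isin(df2['C'])

# True where A is missing
is_missing = df1['A'].isna()

# Keep only rows whose A value is present and has no match in C
result = df1[~(in_c | is_missing)]
print(result)
```

Only the row with 'DDD' in A should remain, matching the output above.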

If the rows with NaN in A should be kept when column C contains no NaN, use isin alone:

df = df1[~df1['A'].isin(df2['C'])]
print (df)
     A     B
0  NaN   NaN
1  NaN   NaN
2  NaN  ciao
5  DDD   NaN

If NaNs exist in both columns, the second solution is enough, because isin treats a NaN in A as matching a NaN in C:

(The input DataFrames are the ones from the question.)

df = df1[~df1['A'].isin(df2['C'])]
print (df)
     A    B
5  DDD  NaN
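
Because the df2 from the question is cut off above, here is a small sketch of the NaN-matching behaviour with a stand-in frame. df2_nan and its values are hypothetical, chosen only so that column C contains a NaN. They are not the data from the question.

```python
import numpy as np
import pandas as pd

# df1 as given in the question
df1 = pd.DataFrame({'A': [np.nan, np.nan, np.nan, 'AAA', 'SSS', 'DDD'],
                    'B': [np.nan, np.nan, 'ciao', np.nan, np.nan, np.nan]})

# Hypothetical stand-in for df2: column C contains a NaN
df2_nan = pd.DataFrame({'C': [np.nan, 5, 'SSS', 'FFF', 'KKK', 'AAA']})

# isin counts a NaN in A as found when C also holds a NaN,
# so no separate isnull() condition is needed here
mask = df1['A'].isin(df2_nan['C'])

result_nan = df1[~mask]
print(result_nan)
```

With this stand-in data the NaN rows of df1 are matched by the NaN in C, so only the 'DDD' row should be left.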
