
Pandas Filter List By Using Unique Python

I have a dataframe similar to the one below: df = pd.DataFrame.from_dict({'cat1':['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'], 'cat2':[['X','Y'], ['F'], ['X','Y'], ['Y'], ['Y'],

Solution 1:

In Python, lists are not hashable, so they must first be converted to tuples or strings. Then use GroupBy.transform with SeriesGroupBy.nunique to count the unique values in each group, and keep only the rows whose group count is not equal to 1, using Series.ne and boolean indexing:

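# Convert each list to a hashable tuple, count unique values per cat1 group,
# and keep only the rows whose group has more than one unique value
df = df[df['cat2'].apply(tuple).groupby(df['cat1']).transform('nunique').ne(1)]
# Alternative: compare the string representation of each list instead
#df = df[df['cat2'].astype('str').groupby(df['cat1']).transform('nunique').ne(1)]
print(df)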
  cat1    cat2
0    A  [X, Y]
1    A     [F]
2    A  [X, Y]
5    C     [Y]
6    C     [Z]
7    C  [P, W]
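
For readers who want to see the intermediate steps, here is a minimal, self-contained sketch of the same approach. The question's dataframe is truncated above, so the data here is an assumption: the first eight cat2 values are chosen to match the printed output, and the three values for group D are hypothetical placeholders. The names keys, n_unique and result are illustrative only.

```python
import pandas as pd

# Hypothetical data: the original dataframe is truncated in the question.
# The cat2 values for group D (['M']) are made up for illustration.
df = pd.DataFrame({
    "cat1": ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D", "D"],
    "cat2": [["X", "Y"], ["F"], ["X", "Y"], ["Y"], ["Y"],
             ["Y"], ["Z"], ["P", "W"], ["M"], ["M"], ["M"]],
})

# Lists cannot be hashed, which is why they are converted before counting.
try:
    hash(["X", "Y"])
    lists_hashable = True
except TypeError:
    lists_hashable = False

# Step 1: turn every list into a tuple, which is hashable.
keys = df["cat2"].apply(tuple)

# Step 2: for each row, count the distinct tuples within its cat1 group.
n_unique = keys.groupby(df["cat1"]).transform("nunique")

# Step 3: keep the rows whose group holds more than one distinct value.
result = df[n_unique.ne(1)]
print(result)
```

With this assumed data, groups B and D each contain a single distinct list and are dropped, while A and C are kept. Using tuples rather than astype('str') keeps the comparison on the list contents themselves instead of on their printed form.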
