Pandas Filter List By Using Unique Python
I have a DataFrame similar to the one below: df = pd.DataFrame.from_dict({'cat1':['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'], 'cat2':[['X','Y'], ['F'], ['X','Y'], ['Y'], ['Y'],
Solution 1:
In Python, lists are not hashable, so it is necessary to convert them to tuples or strings first. Then use GroupBy.transform with SeriesGroupBy.nunique to count the distinct values per cat1 group, and keep only the rows where that count is not equal to 1 with Series.ne and boolean indexing:
df = df[df['cat2'].apply(tuple).groupby(df['cat1']).transform('nunique').ne(1)]
#alternative
#df = df[df['cat2'].astype('str').groupby(df['cat1']).transform('nunique').ne(1)]
print (df)
  cat1    cat2
0    A  [X, Y]
1    A     [F]
2    A  [X, Y]
5    C     [Y]
6    C     [Z]
7    C  [P, W]
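For readers who want to see the intermediate values, the one-liner above can be split into steps. This is only a sketch: the sample data in the question is cut off, so the rows for groups C and D here are assumptions chosen to match the printed output (the three `['L']` rows for D in particular are hypothetical), and the names `as_tuples`, `n_unique`, and `result` are introduced purely for illustration.

```python
import pandas as pd

# Sample data: groups A and B follow the question; C is inferred from the
# printed output; the D rows are an assumption (the original is truncated).
df = pd.DataFrame({
    'cat1': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'],
    'cat2': [['X', 'Y'], ['F'], ['X', 'Y'], ['Y'], ['Y'],
             ['Y'], ['Z'], ['P', 'W'], ['L'], ['L'], ['L']],
})

# Step 1: lists cannot be hashed, so convert each one to a tuple.
as_tuples = df['cat2'].apply(tuple)

# Step 2: count distinct tuples per cat1 group, broadcast back to every row.
n_unique = as_tuples.groupby(df['cat1']).transform('nunique')

# Step 3: keep rows whose group has more than one distinct value.
result = df[n_unique.ne(1)]
print(result)
```

With this data, groups B and D each contain a single distinct list and are dropped, while A and C are kept.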