Transform Pandas Data Frame To Use For MultiLabelBinarizer

My question is: How can I transform a DataFrame like this so that it can eventually be used with scikit-learn's MultiLabelBinarizer: d1 = {'ID':[1,2,3,4], 'km':[80,90,90,100], 'weight':[10,20,20,30],

Solution 1:

I think you need this:

import pandas as pd

d1 = {'ID':[1,2,3,4], 'km':[80,90,90,100], 'weight':[10,20,20,30], 'label':['A','B','C','D']}
df1 = pd.DataFrame(data=d1)
# Group by km and weight, collecting each group's labels into a tuple
df2 = pd.DataFrame(df1.groupby(['km','weight'])['label'].apply(lambda x: tuple(x.values)))
df2.reset_index(inplace=True)

from sklearn.preprocessing import MultiLabelBinarizer

# Learn the set of labels, then encode each tuple as a row of 0/1 indicators
mlb = MultiLabelBinarizer()
mlb.fit(df2['label'])
mlb.transform(df2['label'])
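
To see what the binarizer actually returns, the steps above can be combined into one self-contained sketch. This is only one way to arrange it: the names `grouped`, `encoded`, `indicators` and `result` are made up for illustration, `fit_transform` is used in place of the separate `fit` and `transform` calls, and joining the indicator columns back onto the grouped frame is an extra step that the answer does not include.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'km': [80, 90, 90, 100],
    'weight': [10, 20, 20, 30],
    'label': ['A', 'B', 'C', 'D'],
})

# Collect the labels of each (km, weight) group into one tuple per row.
grouped = df1.groupby(['km', 'weight'])['label'].apply(tuple).reset_index()

# fit_transform learns the label set and encodes the tuples in one call.
mlb = MultiLabelBinarizer()
encoded = mlb.fit_transform(grouped['label'])

# Assumption for illustration: attach one 0/1 column per label to the groups.
indicators = pd.DataFrame(encoded, columns=mlb.classes_, index=grouped.index)
result = pd.concat([grouped.drop(columns='label'), indicators], axis=1)

print(mlb.classes_)
print(result)
```

With the sample data, the rows for ID 2 and 3 share km 90 and weight 20, so they collapse into a single group whose tuple is ('B', 'C'), and that row gets a 1 in both the B and C columns.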
