Transform Pandas Data Frame To Use For MultiLabelBinarizer
My question is: How can I transform a DataFrame like this so that I can eventually use it with scikit-learn's MultiLabelBinarizer: d1 = {'ID':[1,2,3,4], 'km':[80,90,90,100], 'weight':[10,20,20,30],
Solution 1:
I think you need this:
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

d1 = {'ID':[1,2,3,4], 'km':[80,90,90,100], 'weight':[10,20,20,30], 'label':['A','B','C','D']}
df1 = pd.DataFrame(data=d1)
# Group rows that share the same km and weight, and collect their labels into a tuple
df2 = df1.groupby(['km','weight'])['label'].apply(tuple).reset_index()
mlb = MultiLabelBinarizer()
# Learn the set of labels, then encode each tuple as a row of 0/1 indicators
mlb.fit(df2['label'])
mlb.transform(df2['label'])
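The array returned by transform has no column names, so it can be hard to tell which column belongs to which label. As a minimal sketch (the variable names grouped, encoded, indicators and result are only illustrative and not part of the original answer), the labels can be read from mlb.classes_ and attached back to the grouped km and weight columns:

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

df1 = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'km': [80, 90, 90, 100],
    'weight': [10, 20, 20, 30],
    'label': ['A', 'B', 'C', 'D'],
})

# One row per (km, weight) pair, with that pair's labels gathered into a tuple.
grouped = df1.groupby(['km', 'weight'])['label'].apply(tuple).reset_index()

mlb = MultiLabelBinarizer()
# fit_transform learns the label set and encodes in one step.
encoded = mlb.fit_transform(grouped['label'])

# mlb.classes_ holds the label for each indicator column, in column order.
indicators = pd.DataFrame(encoded, columns=mlb.classes_, index=grouped.index)
result = pd.concat([grouped[['km', 'weight']], indicators], axis=1)
print(result)
```

With the sample data, the row for km=90 and weight=20 has a 1 in both the B and C columns, because IDs 2 and 3 share those values.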