Python Pandas, A Function Will Be Applied To The Combinations Of The Elements In One Row Based On A Condition On The Other Row
There seem to be similar questions, but I couldn't find a proper answer. Let's say this is my dataframe, which holds observations for different brands of cars: df =
Solution 1:
UPDATE:
In [49]: x = pd.DataFrame(np.triu(squareform(pdist(df[['distance']], my_func))),
...: columns=df.Car.str.split('_').str[0],
...: index=df.Car.str.split('_').str[0]).replace(0, np.nan)
...:
In [50]: x[x.apply(lambda col: col.index != col.name)].max(1).max(level=0)
Out[50]:
Car
BMW 197.0
Fiat NaN
WW 221.0
dtype: float64
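On recent pandas versions (2.0 and later, as far as I know) the `level` argument of `max` has been removed, so `.max(1).max(level=0)` would need to become `.max(axis=1).groupby(level=0).max()`. Below is a minimal sketch of the same upper-triangle idea written that way. The brands and distances are assumptions, back-solved from the outputs shown in this answer rather than taken from the asker's frame, and the matrix is built with plain NumPy broadcasting instead of `pdist`.

```python
import numpy as np
import pandas as pd

# Assumed inputs: back-solved from the outputs in this answer, not the original df.
brand = pd.Index(["BMW", "BMW", "BMW", "WW", "WW", "Fiat", "Fiat"], name="Car")
d = np.array([10, 25, 22, 24, 37, 33, 49])

# my_func(x, y) = 2*x + 3*y for every ordered pair (row i, column j).
full = (2 * d[:, None] + 3 * d[None, :]).astype(float)

b = brand.to_numpy()
above_diagonal = np.triu(np.ones(full.shape, dtype=bool), k=1)  # each pair once
different_brand = b[:, None] != b[None, :]                      # skip same-brand pairs

# Everything that is not a cross-brand pair above the diagonal becomes NaN.
masked = pd.DataFrame(
    np.where(above_diagonal & different_brand, full, np.nan), index=brand
)

row_max = masked.max(axis=1)             # NaN for rows with no valid pair
result = row_max.groupby(level=0).max()  # replacement for .max(level=0)
print(result)
```

Fiat ends up as NaN here for the same reason as in the output above: in the upper triangle, the Fiat rows only pair with each other.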
OLD answer:
IIUC you can do something like the following:
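import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

def my_func(x, y):
    # pdist calls this with a pair of rows of df[['distance']]
    return 2*x + 3*y

# Square matrix of pairwise results, labelled by brand (the part of 'Car' before '_')
x = pd.DataFrame(
    squareform(pdist(df[['distance']], my_func)),
    columns=df.Car.str.split('_').str[0],
    index=df.Car.str.split('_').str[0])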
it produced:
In [269]: x
Out[269]:
Car     BMW    BMW    BMW     WW     WW   Fiat   Fiat
Car
BMW     0.0   95.0   86.0   92.0  131.0  119.0  167.0
BMW    95.0    0.0  116.0  122.0  161.0  149.0  197.0
BMW    86.0  116.0    0.0  116.0  155.0  143.0  191.0
WW     92.0  122.0  116.0    0.0  159.0  147.0  195.0
WW    131.0  161.0  155.0  159.0    0.0  173.0  221.0
Fiat  119.0  149.0  143.0  147.0  173.0    0.0  213.0
Fiat  167.0  197.0  191.0  195.0  221.0  213.0    0.0
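If SciPy is not at hand, the same matrix can be sketched with NumPy broadcasting. The frame below is an assumption: its `distance` values were back-solved from the matrix shown above, since the original `df` is not reproduced here. `pdist` evaluates the function only for pairs with i < j and `squareform` mirrors them, which is why the sketch keeps the upper triangle and reflects it.

```python
import numpy as np
import pandas as pd

# Assumed data, back-solved from the matrix above (not the asker's original frame).
df = pd.DataFrame({
    "Car": ["BMW_1", "BMW_2", "BMW_3", "WW_1", "WW_2", "Fiat_1", "Fiat_2"],
    "distance": [10, 25, 22, 24, 37, 33, 49],
})

def my_func(x, y):
    return 2 * x + 3 * y

d = df["distance"].to_numpy()
brand = pd.Index(df["Car"].str.split("_").str[0], name="Car")

# Evaluate my_func for every ordered pair (row i, column j) at once.
full = my_func(d[:, None], d[None, :]).astype(float)

# Keep only pairs with i < j, then mirror them to get a symmetric matrix
# with zeros on the diagonal.
upper = np.triu(full, k=1)
x = pd.DataFrame(upper + upper.T, index=brand, columns=brand)
print(x)
```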
excluding the same brand:
In [270]: x.apply(lambda col: col.index != col.name)
Out[270]:
Car     BMW    BMW    BMW     WW     WW   Fiat   Fiat
Car
BMW   False  False  False   True   True   True   True
BMW   False  False  False   True   True   True   True
BMW   False  False  False   True   True   True   True
WW     True   True   True  False  False   True   True
WW     True   True   True  False  False   True   True
Fiat   True   True   True   True   True  False  False
Fiat   True   True   True   True   True  False  False

In [273]: x[x.apply(lambda col: col.index != col.name)]
Out[273]:
Car     BMW    BMW    BMW     WW     WW   Fiat   Fiat
Car
BMW     NaN    NaN    NaN   92.0  131.0  119.0  167.0
BMW     NaN    NaN    NaN  122.0  161.0  149.0  197.0
BMW     NaN    NaN    NaN  116.0  155.0  143.0  191.0
WW     92.0  122.0  116.0    NaN    NaN  147.0  195.0
WW    131.0  161.0  155.0    NaN    NaN  173.0  221.0
Fiat  119.0  149.0  143.0  147.0  173.0    NaN    NaN
Fiat  167.0  197.0  191.0  195.0  221.0    NaN    NaN
selecting maximum per row:
In [271]: x[x.apply(lambda col: col.index != col.name)].max(1)
Out[271]:
Car
BMW 167.0
BMW 197.0
BMW 191.0
WW 195.0
WW 221.0
Fiat 173.0
Fiat 221.0
dtype: float64
max per brand:
In [276]: x[x.apply(lambda col: col.index != col.name)].max(1).max(level=0)
Out[276]:
Car
BMW 197.0
Fiat 221.0
WW 221.0
dtype: float64
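For reference, here is a compact sketch of the masking and aggregation steps on the full symmetric matrix, done on the underlying arrays so that the duplicate brand labels never have to be aligned. The inputs are assumptions (distances back-solved from the outputs above), and `my_func` is inlined as `2*x + 3*y`.

```python
import numpy as np
import pandas as pd

# Assumed inputs: back-solved from the outputs in this answer.
brand = pd.Index(["BMW", "BMW", "BMW", "WW", "WW", "Fiat", "Fiat"], name="Car")
d = np.array([10, 25, 22, 24, 37, 33, 49])

# Symmetric matrix as pdist + squareform would give it for 2*x + 3*y.
upper = np.triu(2 * d[:, None] + 3 * d[None, :], k=1).astype(float)
sym = upper + upper.T

# True where the row brand differs from the column brand.
b = brand.to_numpy()
different_brand = b[:, None] != b[None, :]

# Blank out same-brand cells, then take the maximum per row and per brand.
masked = np.where(different_brand, sym, np.nan)
row_max = pd.Series(np.nanmax(masked, axis=1), index=brand)
brand_max = row_max.groupby(level=0).max()
print(brand_max)
```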
Solution 2:
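import numpy as np
import pandas as pd

# Index pairs (i, j) on or below the first superdiagonal, i.e. j <= i + 1
i, j = np.tril_indices(len(df), 1)

def my_func(x, y):
    z = 2 * x + 3 * y
    return z

d = df.distance.values
c = df.Car.values
# One value per (i, j) pair, indexed by the two car names
s = pd.Series(my_func(d[i], d[j]), [c[i], c[j]])

def test_name(df):
    # Keep a pair only when the two cars belong to different brands
    name = df.index[0]
    n1, n2 = map(lambda x: x.split('_')[0], name)
    return n1 != n2
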
s.groupby(level=[0, 1]).filter(test_name).groupby(level=1).apply(list)
BMW_1 [78, 104, 96, 128]
BMW_2 [123, 149, 141, 173]
BMW_3 [114, 140, 132, 164]
Fiat_1 [173]
WW_1 [116, 138, 170]
WW_2 [177, 209]
dtype: object
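A plainer way to enumerate cross-brand pairs is `itertools.combinations`, which visits each unordered pair exactly once. This is only a sketch under assumptions: the frame is back-solved from the outputs on this page, the column names `first`, `second` and `value` are made up for illustration, and the pairing convention (earlier row as `x`, later row as `y`) differs from the `tril_indices` version above, so the numbers are not meant to match that output.

```python
from itertools import combinations

import pandas as pd

# Assumed data, back-solved from the outputs on this page.
df = pd.DataFrame({
    "Car": ["BMW_1", "BMW_2", "BMW_3", "WW_1", "WW_2", "Fiat_1", "Fiat_2"],
    "distance": [10, 25, 22, 24, 37, 33, 49],
})

def my_func(x, y):
    return 2 * x + 3 * y

# Every unordered pair of rows once, keeping only pairs of different brands.
records = [
    (a.Car, b.Car, my_func(a.distance, b.distance))
    for a, b in combinations(df.itertuples(index=False), 2)
    if a.Car.split("_")[0] != b.Car.split("_")[0]
]

# 'first', 'second' and 'value' are hypothetical column names.
pairs = pd.DataFrame(records, columns=["first", "second", "value"])
per_car = pairs.groupby("first")["value"].apply(list)
print(per_car)
```

Cars that only ever appear as the second element of a cross-brand pair (the two Fiats here) do not show up in `per_car`; grouping by `second` instead would list them.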