Python Pandas, A Function Will Be Applied To The Combinations Of The Elements In One Row Based On A Condition On The Other Row
There seem to be similar questions, but I couldn't find a proper answer. Let's say this is my dataframe, which holds observations for different brands of cars: df =
Solution 1:
UPDATE:
In [49]: x = pd.DataFrame(np.triu(squareform(pdist(df[['distance']], my_func))),
    ...:                  columns=df.Car.str.split('_').str[0],
    ...:                  index=df.Car.str.split('_').str[0]).replace(0, np.nan)
    ...:
In [50]: x[x.apply(lambda col: col.index != col.name)].max(1).max(level=0)
Out[50]:
Car
BMW     197.0
Fiat      NaN
WW      221.0
dtype: float64
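For anyone who wants to run the updated approach end to end, here is a minimal self-contained sketch. The question's dataframe is not shown, so the `cars` and `distance` values are an assumption: they are back-solved from the matrix printed in the old answer. The sketch also swaps `pdist`/`squareform` for plain NumPy broadcasting, so it illustrates the same idea rather than the exact code above.

```python
import numpy as np
import pandas as pd

# Assumed data: distances back-solved from the matrix printed in the old answer
cars = ["BMW_1", "BMW_2", "BMW_3", "WW_1", "WW_2", "Fiat_1", "Fiat_2"]
distance = np.array([10, 25, 22, 24, 37, 33, 49], dtype=float)
brand = np.array([car.split("_")[0] for car in cars])

def my_func(x, y):
    return 2 * x + 3 * y

# pair_values[i, j] = my_func(distance[i], distance[j]) for every ordered pair
pair_values = my_func(distance[:, None], distance[None, :])

# keep a pair only if j comes after i (upper triangle) and the brands differ
n = len(cars)
later = np.triu(np.ones((n, n), dtype=bool), k=1)
keep = later & (brand[:, None] != brand[None, :])
masked = pd.DataFrame(np.where(keep, pair_values, np.nan),
                      index=brand, columns=brand)

# max per row, then max per brand; groupby(level=0).max() stands in for .max(level=0)
per_brand = masked.max(axis=1).groupby(level=0).max()
print(per_brand)
```

Because only the upper triangle is kept, each pair is counted once and credited to the car that appears first. The Fiat rows come last, so they have no later car of another brand, which is why Fiat ends up as NaN.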
OLD answer:
IIUC you can do something like the following:
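import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

def my_func(x, y):
    return 2*x + 3*y

# pdist calls my_func once per pair of rows (i < j); squareform mirrors
# the result into a symmetric matrix with zeros on the diagonal
x = pd.DataFrame(
    squareform(pdist(df[['distance']], my_func)),
    columns=df.Car.str.split('_').str[0],
    index=df.Car.str.split('_').str[0])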
it produced:
In [269]: x
Out[269]:
Car     BMW    BMW    BMW     WW     WW   Fiat   Fiat
Car
BMW     0.0   95.0   86.0   92.0  131.0  119.0  167.0
BMW    95.0    0.0  116.0  122.0  161.0  149.0  197.0
BMW    86.0  116.0    0.0  116.0  155.0  143.0  191.0
WW     92.0  122.0  116.0    0.0  159.0  147.0  195.0
WW    131.0  161.0  155.0  159.0    0.0  173.0  221.0
Fiat  119.0  149.0  143.0  147.0  173.0    0.0  213.0
Fiat  167.0  197.0  191.0  195.0  221.0  213.0    0.0
excluding the same brand:
In [270]: x.apply(lambda col: col.index != col.name)
Out[270]:
Car     BMW    BMW    BMW     WW     WW   Fiat   Fiat
Car
BMW   False  False  False   True   True   True   True
BMW   False  False  False   True   True   True   True
BMW   False  False  False   True   True   True   True
WW     True   True   True  False  False   True   True
WW     True   True   True  False  False   True   True
Fiat   True   True   True   True   True  False  False
Fiat   True   True   True   True   True  False  False
In [273]: x[x.apply(lambda col: col.index != col.name)]
Out[273]:
Car     BMW    BMW    BMW     WW     WW   Fiat   Fiat
Car
BMW     NaN    NaN    NaN   92.0  131.0  119.0  167.0
BMW     NaN    NaN    NaN  122.0  161.0  149.0  197.0
BMW     NaN    NaN    NaN  116.0  155.0  143.0  191.0
WW     92.0  122.0  116.0    NaN    NaN  147.0  195.0
WW    131.0  161.0  155.0    NaN    NaN  173.0  221.0
Fiat  119.0  149.0  143.0  147.0  173.0    NaN    NaN
Fiat  167.0  197.0  191.0  195.0  221.0    NaN    NaN
selecting maximum per row:
In [271]: x[x.apply(lambda col: col.index != col.name)].max(1)
Out[271]:
Car
BMW     167.0
BMW     197.0
BMW     191.0
WW      195.0
WW      221.0
Fiat    173.0
Fiat    221.0
dtype: float64
max per brand:
In [276]: x[x.apply(lambda col: col.index != col.name)].max(1).max(level=0)
Out[276]:
Car
BMW     197.0
Fiat    221.0
WW      221.0
dtype: float64
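One caveat for newer installations: the `level=` argument of `Series.max` was removed in pandas 2.0, so the last step needs `groupby(level=0).max()` there. The sketch below walks through the same steps (symmetric matrix, same-brand mask, row max, brand max) in a form that should run on current pandas. The input values are an assumption, back-solved from the matrix printed above, and NumPy broadcasting is used in place of `pdist`/`squareform`.

```python
import numpy as np
import pandas as pd

# Assumed data: distances back-solved from the matrix printed above
df = pd.DataFrame({
    "Car": ["BMW_1", "BMW_2", "BMW_3", "WW_1", "WW_2", "Fiat_1", "Fiat_2"],
    "distance": [10, 25, 22, 24, 37, 33, 49],
})

def my_func(x, y):
    return 2 * x + 3 * y

d = df["distance"].to_numpy(dtype=float)
brand = np.array([car.split("_")[0] for car in df["Car"]])

# evaluate my_func for i < j and mirror it, like squareform(pdist(...)) does
upper = np.triu(my_func(d[:, None], d[None, :]), k=1)
full = upper + upper.T

# blank out pairs of the same brand (this also covers the diagonal)
different_brand = brand[:, None] != brand[None, :]
masked = pd.DataFrame(np.where(different_brand, full, np.nan),
                      index=brand, columns=brand)

row_max = masked.max(axis=1)                  # maximum per car
brand_max = row_max.groupby(level=0).max()    # maximum per brand
print(brand_max)
```

With these assumed inputs, `brand_max` matches `Out[276]`: 197.0 for BMW and 221.0 for both Fiat and WW.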
Solution 2:
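import numpy as np
import pandas as pd

# index pairs (i, j) with j <= i + 1: lower triangle, diagonal and first superdiagonal
i, j = np.tril_indices(len(df), 1)

def my_func(x, y):
    z = 2 * x + 3 * y
    return z

d = df.distance.values
c = df.Car.values
# one value per (i, j) pair, indexed by both car names
s = pd.Series(my_func(d[i], d[j]), [c[i], c[j]])

def test_name(df):
    # df is one (car_i, car_j) group; keep it only when the brands differ
    name = df.index[0]
    n1, n2 = map(lambda x: x.split('_')[0], name)
    return n1 != n2

s.groupby(level=[0, 1]).filter(test_name).groupby(level=1).apply(list)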
BMW_1       [78, 104, 96, 128]
BMW_2     [123, 149, 141, 173]
BMW_3     [114, 140, 132, 164]
Fiat_1                   [173]
WW_1           [116, 138, 170]
WW_2                [177, 209]
dtype: object
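The same result can be reached by filtering the index pairs before building the Series, which avoids the two-level groupby and the `filter` callback. This is only a sketch of that variant, not the answer's own code, and the input values are an assumption (the question's dataframe is not shown; the distances are back-solved from the output above).

```python
import numpy as np
import pandas as pd

# Assumed data: distances back-solved from the output printed above
cars = np.array(["BMW_1", "BMW_2", "BMW_3", "WW_1", "WW_2", "Fiat_1", "Fiat_2"])
d = np.array([10, 25, 22, 24, 37, 33, 49])
brand = np.array([car.split("_")[0] for car in cars])

def my_func(x, y):
    return 2 * x + 3 * y

# same index pairs as in the answer: every (i, j) with j <= i + 1
i, j = np.tril_indices(len(cars), 1)

# drop same-brand pairs up front (this also removes the diagonal)
keep = brand[i] != brand[j]
i, j = i[keep], j[keep]

# collect my_func(d[i], d[j]) under the name of car j
s = pd.Series(my_func(d[i], d[j]), index=cars[j])
result = s.groupby(level=0).apply(list)
print(result)
```

Note that `k=1` in `np.tril_indices` also lets through the pairs just above the diagonal, which is where the 116 under `WW_1` and the single 173 under `Fiat_1` come from. With `k=-1` only the strict lower triangle would be used and those two values would disappear.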