
How To Create A Python Dataframe Containing The Mean Of Some Rows Of Another Dataframe

I have a pandas DataFrame containing some values: id pair value subdir taylor_1e3c_1s_56C taylor 6_13 -0.398716 run1 taylor_1e3c_1s_56C taylo

Solution 1:

Use syntactic sugar: call groupby on the value Series, passing the index and the id and pair columns as keys, then aggregate with mean:

df = df['value'].groupby([df.index, df['id'], df['pair']]).mean().reset_index(level=[1,2])
print (df)
                        id  pair     value
taylor_1e3c_1s_56C  taylor  6_13 -0.392351
taylor_1e3c_1s_56C  taylor  8_11 -0.391376

Classic solution: first call reset_index to turn the index into a column named index, then group by column names and aggregate with mean:

df = df.reset_index().groupby(['index','id','pair'])['value'].mean().reset_index(level=[1,2])
print (df)
                        id  pair     value
index                                     
taylor_1e3c_1s_56C  taylor  6_13 -0.392351
taylor_1e3c_1s_56C  taylor  8_11 -0.391376
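Both approaches can be tried on a self-contained frame. The sketch below rebuilds the ten sample rows printed in the Detail section of this answer; the variable names by_series and by_names are only illustrative, and the results are kept in separate variables so that df is not overwritten.

```python
import pandas as pd

# Sample data rebuilt from the rows printed in this answer;
# the (unnamed) index repeats the same label for every row.
df = pd.DataFrame(
    {
        "id": ["taylor"] * 10,
        "pair": ["6_13"] * 5 + ["8_11"] * 5,
        "value": [-0.398716, -0.397820, -0.397310, -0.390520, -0.377390,
                  -0.393604, -0.392899, -0.392473, -0.389959, -0.387946],
        "subdir": ["run1", "run2", "run3", "run4", "run5"] * 2,
    },
    index=["taylor_1e3c_1s_56C"] * 10,
)

# Approach 1: group the 'value' Series by the index plus two columns.
by_series = (
    df["value"]
    .groupby([df.index, df["id"], df["pair"]])
    .mean()
    .reset_index(level=[1, 2])
)

# Approach 2: move the index into a column called 'index',
# then group by column names.
by_names = (
    df.reset_index()
    .groupby(["index", "id", "pair"])["value"]
    .mean()
    .reset_index(level=[1, 2])
)

print(by_series)
print(by_names)
```

The two results hold the same means; the only visible difference is that the second one carries the index name index, because reset_index gave that name to the new column.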

Detail (here df is the original DataFrame, before the reassignment above):

print (df.reset_index())
                index      id  pair     value subdir
0  taylor_1e3c_1s_56C  taylor  6_13 -0.398716   run1
1  taylor_1e3c_1s_56C  taylor  6_13 -0.397820   run2
2  taylor_1e3c_1s_56C  taylor  6_13 -0.397310   run3
3  taylor_1e3c_1s_56C  taylor  6_13 -0.390520   run4
4  taylor_1e3c_1s_56C  taylor  6_13 -0.377390   run5
5  taylor_1e3c_1s_56C  taylor  8_11 -0.393604   run1
6  taylor_1e3c_1s_56C  taylor  8_11 -0.392899   run2
7  taylor_1e3c_1s_56C  taylor  8_11 -0.392473   run3
8  taylor_1e3c_1s_56C  taylor  8_11 -0.389959   run4
9  taylor_1e3c_1s_56C  taylor  8_11 -0.387946   run5

After aggregating with mean, the result is a Series with a 3-level MultiIndex:

print (df.reset_index().groupby(['index','id','pair'])['value'].mean())
index               id      pair
taylor_1e3c_1s_56C  taylor  6_13   -0.392351
                            8_11   -0.391376
Name: value, dtype: float64

So reset_index is necessary to convert the second and third levels to columns:

print (df.reset_index()
        .groupby(['index','id','pair'])['value']
        .mean()
        .reset_index(level=[1,2]))
                        id  pair     value
index                                     
taylor_1e3c_1s_56C  taylor  6_13 -0.392351
taylor_1e3c_1s_56C  taylor  8_11 -0.391376
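To check which levels are being moved, the index of the aggregated Series can be inspected directly. This is a minimal sketch that uses only a four-row subset of the sample rows (so the means differ from the ones above); it also shows that the levels can be selected by name instead of by position, which may be easier to read.

```python
import pandas as pd

# Four-row subset of the sample data, same layout as the original frame.
df = pd.DataFrame(
    {
        "id": ["taylor"] * 4,
        "pair": ["6_13", "6_13", "8_11", "8_11"],
        "value": [-0.398716, -0.397820, -0.393604, -0.392899],
    },
    index=["taylor_1e3c_1s_56C"] * 4,
)

means = df.reset_index().groupby(["index", "id", "pair"])["value"].mean()

# The grouping keys become the levels of a MultiIndex.
print(means.index.nlevels)
print(list(means.index.names))

# Levels can be moved to columns by position or by name.
by_position = means.reset_index(level=[1, 2])
by_name = means.reset_index(level=["id", "pair"])
print(by_position.equals(by_name))
```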
