Skip to content Skip to sidebar Skip to footer

Concatenate Rows With Same Column Value In A Single Pandas Dataframe

I have a pandas dataframe like so : id code mean count 1 A 32 22 1 B 9 56 1 C 25 78 2 A 33 35 2 B 11 66 Basically, for every ID there

Solution 1:

Use GroupBy.cumcount for counter, then reshape by DataFrame.set_index and DataFrame.unstack, sorting second level of MultiIndex by DataFrame.sort_index and last flatten MultiIndex by join:

df = pd.DataFrame({'id': [1, 1, 1, 2, 2], 
                   'code': ['A', 'B', 'C', 'A', 'B'],
                   'mean': [32, 9, 25, 33, 11], 
                   'count': [22, 56, 78, 35, 66]})

print (df)
   id code  mean  count
0   1    A    32     22
1   1    B     9     56
2   1    C    25     78
3   2    A    33     35
4   2    B    11     66

print (df.columns)
Index(['id', 'code', 'mean', 'count'], dtype='object')


print (df.columns.tolist())
['id', 'code', 'mean', 'count']

df['g'] = df.groupby('id').cumcount().add(1)
df = (df.set_index(['id','g'])
        .unstack(fill_value=-1)
        .sort_index(level=1, axis=1))

df.columns = df.columns.map(lambda x: f'{x[0]}{x[1]}')

For convert id to column use reset_index:

df = df.reset_index()
print (df)
   id code1  count1  mean1 code2  count2  mean2 code3  count3  mean3
0   1     A      22     32     B      56      9     C      78     25
1   2     A      35     33     B      66     11    -1      -1     -1
df = df.reset_index()

Solution 2:

I think this will work for you. If you want to add numbers for the suffix just add a counter.

final=pd.DataFrame()
for i in df['code'].unique():
    final=pd.concat([final,df.query(f'code=="{i}"').set_index('id').add_suffix(f"_{i}")],axis=1).fillna(-1)


   code_A  mean_A  count_A code_B  mean_B  count_B code_C  mean_C  count_C
id                                                                        
1       A      32       22      B       9       56      C    25.0     78.0
2       A      33       35      B      11       66     -1    -1.0     -1.0

Post a Comment for "Concatenate Rows With Same Column Value In A Single Pandas Dataframe"