Concatenate Rows With Same Column Value In A Single Pandas Dataframe
I have a pandas dataframe like so:
id  code  mean  count
1   A     32    22
1   B     9     56
1   C     25    78
2   A     33    35
2   B     11    66
Basically, for every ID there
Solution 1:
Use GroupBy.cumcount for a counter, then reshape with DataFrame.set_index and DataFrame.unstack, sort by the second level of the column MultiIndex with DataFrame.sort_index, and finally flatten the MultiIndex by joining its two levels:
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                   'code': ['A', 'B', 'C', 'A', 'B'],
                   'mean': [32, 9, 25, 33, 11],
                   'count': [22, 56, 78, 35, 66]})
print (df)
   id code  mean  count
0   1    A    32     22
1   1    B     9     56
2   1    C    25     78
3   2    A    33     35
4   2    B    11     66
print (df.columns)
Index(['id', 'code', 'mean', 'count'], dtype='object')
print (df.columns.tolist())
['id', 'code', 'mean', 'count']
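# number the rows within each id: 1, 2, 3, ...
df['g'] = df.groupby('id').cumcount().add(1)
# one row per id, one column per (original column, counter) pair
df = (df.set_index(['id','g'])
        .unstack(fill_value=-1)
        .sort_index(level=1, axis=1))
# flatten the MultiIndex columns, e.g. ('code', 1) -> 'code1'
df.columns = df.columns.map(lambda x: f'{x[0]}{x[1]}')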
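To see what the counter from GroupBy.cumcount looks like on its own, here is a minimal sketch on the same ids and codes. The names sample and counter are only illustrative and are not part of the answer.

```python
import pandas as pd

# Same ids and codes as in the question (illustrative copy).
sample = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                       'code': ['A', 'B', 'C', 'A', 'B']})

# cumcount numbers the rows inside each id group starting at 0;
# add(1) shifts the counter so that it starts at 1.
counter = sample.groupby('id').cumcount().add(1)
print(counter.tolist())  # [1, 2, 3, 1, 2]
```

The counter restarts at 1 for every new id, which is what lets unstack line up the first, second and third row of each id in the same columns.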
To convert id back to a column, use reset_index:
df = df.reset_index()
print (df)
   id code1  count1  mean1 code2  count2  mean2 code3  count3  mean3
0   1     A      22     32     B      56      9     C      78     25
1   2     A      35     33     B      66     11    -1      -1     -1
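If the reshape is needed more than once, the same steps can be wrapped in a small function. This is only a sketch: widen_by_group, df_long and wide are hypothetical names, and the code column is built with object dtype as an assumption so that the -1 filler can sit next to the strings.

```python
import pandas as pd

def widen_by_group(frame, key, fill_value=-1):
    """Hypothetical helper: one row per `key`, columns numbered by position in the group."""
    out = frame.copy()  # leave the caller's frame untouched
    out['g'] = out.groupby(key).cumcount().add(1)
    out = (out.set_index([key, 'g'])
              .unstack(fill_value=fill_value)
              .sort_index(level=1, axis=1))
    # flatten ('code', 1) -> 'code1'
    out.columns = [f'{name}{pos}' for name, pos in out.columns]
    return out.reset_index()

df_long = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                        # object dtype is an assumption, see the note above
                        'code': pd.Series(['A', 'B', 'C', 'A', 'B'], dtype=object),
                        'mean': [32, 9, 25, 33, 11],
                        'count': [22, 56, 78, 35, 66]})

wide = widen_by_group(df_long, 'id')
print(wide.columns.tolist())
```

Working on a copy keeps the long-format frame available afterwards, for example for trying the second solution on the same data.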
Solution 2:
I think this will work for you; df here is the original dataframe from the question. If you want numbers instead of the code values as suffixes, just add a counter.
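final = pd.DataFrame()
for i in df['code'].unique():
    # rows for this code, indexed by id, columns suffixed with the code
    part = df.query(f'code=="{i}"').set_index('id').add_suffix(f"_{i}")
    final = pd.concat([final, part], axis=1).fillna(-1)  # ids without this code get -1
print (final)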
   code_A  mean_A  count_A code_B  mean_B  count_B code_C  mean_C  count_C
id
1       A      32       22      B       9       56      C    25.0     78.0
2       A      33       35      B      11       66     -1    -1.0     -1.0
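The counter variant mentioned in this answer could look roughly like the sketch below, which uses enumerate for the numbering. It is one possible reading, not the answerer's own code: df_long, parts and final are illustrative names, the pieces are collected in a list and concatenated once instead of inside the loop, and the code column uses object dtype as an assumption so that fillna(-1) can be applied to it.

```python
import pandas as pd

df_long = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                        # object dtype is an assumption, see the note above
                        'code': pd.Series(['A', 'B', 'C', 'A', 'B'], dtype=object),
                        'mean': [32, 9, 25, 33, 11],
                        'count': [22, 56, 78, 35, 66]})

parts = []
# n counts the distinct codes in order of first appearance: A -> 1, B -> 2, C -> 3
for n, code in enumerate(df_long['code'].unique(), start=1):
    part = df_long[df_long['code'] == code].set_index('id').add_suffix(f'_{n}')
    parts.append(part)

# align all pieces on id; ids that lack a given code are filled with -1
final = pd.concat(parts, axis=1).fillna(-1)
print(final.columns.tolist())
```

Note that the number here is tied to the code value, not to the position of the row within its id, so an id that has only codes A and C would get -1 in the _2 columns, unlike the cumcount approach in Solution 1.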