
Pandas Restacking Repeated Values To Columns

The DataFrame below needs to be restacked so that all values for each region are on one line. In the example below, the new df would have only 3 lines, one for each region.

Solution 1:

You could groupby on 'Area' and apply list:

In [75]:
df.groupby('Area')['value'].apply(list).reset_index()

Out[75]:
       Area     value
0  AMERICAS  [37, 24]
1      ASIA  [51, 22]
2    EUROPE  [47, 39]

This will handle a variable number of values per region.
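The question's DataFrame is not reproduced here, so the following is a minimal self-contained sketch of the step above. The six sample rows are an assumption, reconstructed from the outputs shown in this answer.

```python
import pandas as pd

# Sample data: assumed to match the question's DataFrame
# (reconstructed from the outputs shown in this answer).
df = pd.DataFrame({
    "Area": ["EUROPE", "ASIA", "AMERICAS", "EUROPE", "ASIA", "AMERICAS"],
    "value": [47, 51, 37, 39, 22, 24],
})

# Collect each region's values into one list, keeping their original row order.
# groupby sorts the group keys by default, so the regions come out alphabetically.
grouped = df.groupby("Area")["value"].apply(list).reset_index()
print(grouped)
```

Each list is as long as the number of rows its region has, which is why nothing needs to change when a region gains extra values.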

If you want to split the values out into separate columns, you can call apply and pass the pd.Series constructor:

In [90]:
df1 = df.groupby('Area')['value'].apply(lambda x: list(x)).reset_index()
df1[['val1', 'val2']] = df1['value'].apply(pd.Series)
df1

Out[90]:
       Area     value  val1  val2
0  AMERICAS  [37, 24]    37    24
1      ASIA  [51, 22]    51    22
2    EUROPE  [47, 39]    47    39
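As a possible alternative to apply(pd.Series), which constructs one Series per row, the list column can be expanded in a single DataFrame construction. This is a sketch rather than part of the original answer; the name expanded is just illustrative, and it assumes every list has exactly two values.

```python
import pandas as pd

# Grouped frame as produced by the groupby step above.
df1 = pd.DataFrame({
    "Area": ["AMERICAS", "ASIA", "EUROPE"],
    "value": [[37, 24], [51, 22], [47, 39]],
})

# Build both new columns at once from the list column.
# Reusing df1.index keeps the new rows aligned with the original ones.
expanded = pd.DataFrame(
    df1["value"].tolist(),
    index=df1.index,
    columns=["val1", "val2"],
)
result = df1.join(expanded)
print(result)
```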

EDIT

With a variable number of values per region, you can't assign the column names up front unless you know the maximum number of values, but you can still use the approach above:

In [94]:
import io
import pandas as pd

t="""index Area  value
0    EUROPE     47
1      ASIA     51
2  AMERICAS     37
3    EUROPE     39
4      ASIA     22
5  AMERICAS     24
5  AMERICAS     50"""
df = pd.read_csv(io.StringIO(t), sep=r'\s+')  # raw string avoids an invalid-escape warning
df

Out[94]:
   index      Area  value
0      0    EUROPE     47
1      1      ASIA     51
2      2  AMERICAS     37
3      3    EUROPE     39
4      4      ASIA     22
5      5  AMERICAS     24
6      5  AMERICAS     50

In [99]:
df1 = df.groupby('Area')['value'].apply(list).reset_index()
df1

Out[99]:
       Area         value
0  AMERICAS  [37, 24, 50]
1      ASIA      [51, 22]
2    EUROPE      [47, 39]

In [102]:
df1 = pd.concat([df1, df1['value'].apply(pd.Series).fillna(0)], axis=1)
df1

Out[102]:
       Area         value   0   1   2
0  AMERICAS  [37, 24, 50]  37  24  50
1      ASIA      [51, 22]  51  22   0
2    EUROPE      [47, 39]  47  39   0
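If the intermediate list column isn't needed, a different route (not the one used above) is to number each row within its region with cumcount and pivot on that number. This is a hedged sketch; the names position and wide are illustrative. Missing slots are NaN before filling, so the cast back to integers is explicit.

```python
import pandas as pd

# Same seven rows as the variable-length example above.
df = pd.DataFrame({
    "Area": ["EUROPE", "ASIA", "AMERICAS", "EUROPE", "ASIA", "AMERICAS", "AMERICAS"],
    "value": [47, 51, 37, 39, 22, 24, 50],
})

# Number each row within its region: 0 for the first value, 1 for the second, ...
position = df.groupby("Area").cumcount()

wide = (
    df.assign(position=position)
      .pivot(index="Area", columns="position", values="value")
      .fillna(0)      # regions with fewer values get 0 in the missing slots
      .astype(int)    # NaN forced floats; restore integers after filling
      .reset_index()
)
wide.columns.name = None  # drop the leftover "position" label on the columns
print(wide)
```

The number of output columns follows from the largest group, so it doesn't have to be known in advance.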
