Groupby And Append Lists And Strings
I am trying to group-by the values in my 'value_1' column. But my last column is made up of lists. When I try to group-by using my 'value_1' column, the column made up of lists dis
Solution 1:
You could group by value_1 and aggregate the columns containing strings with the following function:
def str_cat(x):
    return x.str.cat(sep=', ')
Then use GroupBy.sum to concatenate the lists in the list column:
df.replace('',None).groupby('value_1').agg({'list':'sum', 'value_2': str_cat,
                                            'value_3': str_cat})
                        list                       value_2  \
value_1                                                              
american  [supermarket, connivence, state]  california, nyc, texas   
canadian             [coffee, sipermarket]          toronto, texas   
                    value_3  
value_1                                 
american  walmart, kmart, dunkinDonuts  
canadian         dunkinDonuts, walmart  
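For a self-contained run of this approach, here is a minimal sketch. The DataFrame is hypothetical sample data (the question's actual frame is not shown), and concat_lists is an illustrative helper that spells out the list concatenation explicitly instead of relying on 'sum'; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; the question's actual DataFrame is not shown.
df = pd.DataFrame({
    "value_1": ["american", "american", "canadian"],
    "value_2": ["california", "nyc", "toronto"],
    "value_3": ["walmart", "kmart", "dunkinDonuts"],
    "list": [["supermarket"], ["state"], ["coffee"]],
})

def str_cat(x):
    # Join the strings of one group; str.cat skips missing values by default.
    return x.str.cat(sep=", ")

def concat_lists(x):
    # Illustrative stand-in for 'sum': chain the group's lists into one flat list.
    return [item for sub in x for item in sub]

result = df.groupby("value_1").agg(
    {"list": concat_lists, "value_2": str_cat, "value_3": str_cat}
)
print(result)
```

Each key of the dictionary passed to agg names a column, and each value is the function applied to that column within every group, so string columns and the list column can be reduced differently in one call.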
Solution 2:
Dynamically build a dictionary that maps every column except value_1 and list to a string-joining function, and use a lambda with a flattening list comprehension for the list column:
f1 = lambda x: ', '.join(x.dropna())
# alternative: join only the string values
# f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
f2 = lambda x: [z for y in x for z in y]
d = dict.fromkeys(df.columns.difference(['value_1','list']), f1)
d['list'] = f2 
df = df.groupby('value_1', as_index=False).agg(d)
print (df)
     value_1                 value_2                value_3  \
0   american  california, nyc, texas         walmart, kmart   
1   canadian                 toronto  dunkinDonuts, walmart   
                               list  
0  [supermarket, connivence, state]  
1             [coffee, supermarket]  
Explanation:
f1 and f2 are lambda functions.
Remove missing values (if any), then join the strings with a separator:
f1 = lambda x: ', '.join(x.dropna())
Alternatively, keep only the string values (this also drops missing values, because NaN is not a string), then join them with a separator:
f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
Or keep every value except empty strings, then join them with a separator:
f1 = lambda x: ', '.join([y for y in x if y != '']) 
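The filters behave differently when a group mixes missing values and empty strings. The sketch below uses a plain Python list as a stand-in for one group's values (a hypothetical input); the combined filter at the end is an assumption added for illustration and is not part of the original answer.

```python
# Hypothetical values from one group: strings, missing values, an empty string.
values = ["walmart", None, float("nan"), "", "kmart"]

# isinstance filter: drops None and NaN, but keeps the empty string.
only_strings = ", ".join([y for y in values if isinstance(y, str)])
print(only_strings)  # walmart, , kmart

# Combined filter (illustrative): drops missing values and empty strings.
clean = ", ".join([y for y in values if isinstance(y, str) and y != ""])
print(clean)  # walmart, kmart

# The y != '' filter on its own leaves None and NaN in the list,
# so str.join raises TypeError for this input.
try:
    ", ".join([y for y in values if y != ""])
except TypeError as exc:
    print("TypeError:", exc)
```

So the empty-string filter is only safe on its own when the column has no missing values; otherwise it likely needs to be combined with one of the other two.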
Function f2 flattens the lists, because the values collected for each group are nested lists like [['a','b'], ['c']]:
f2 = lambda x: [z for y in x for z in y]
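The flattening done by f2 can be checked on a plain nested list, without pandas. As a sketch, itertools.chain.from_iterable is shown next to it as an equivalent standard-library alternative; that alternative is not part of the original answer.

```python
from itertools import chain

# The nested shape a group's list column has before flattening.
nested = [["a", "b"], ["c"]]

# Same comprehension as f2: outer loop over sublists, inner loop over items.
f2 = lambda x: [z for y in x for z in y]
flat = f2(nested)
print(flat)  # ['a', 'b', 'c']

# Equivalent result with the standard library.
flat_chain = list(chain.from_iterable(nested))
print(flat_chain)  # ['a', 'b', 'c']
```

Both forms flatten exactly one level of nesting, which matches what the aggregation needs here.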