Python Pandas Running Totals With Resets

September 26, 2023 Post a Comment

I would like to perform the following task. Given a 2 columns (good and bad) I would like to replace any rows for the two columns with a running total. Here is an example of the

Solution 1:

The basic idea of my solution is to create a column from a cumsum over non-zero values in order to get the zero values with the next non zero value into one group. Then you can use groupby + sum to get your the desired values.

two_good = df.groupby((df['bad']!=0).cumsum().shift(1).fillna(0))['good'].sum()
two_bad = df.groupby((df['good']!=0).cumsum().shift(1).fillna(0))['bad'].sum()

two_good = two_good.loc[two_good!=0].reset_index(drop=True)
two_bad = two_bad.loc[two_bad!=0].reset_index(drop=True)

new_df = pd.concat([two_bad, two_good], axis=1).dropna()
print(new_df)
    bad  good
0     1  19.0
1     1  20.0
2     1  28.0
3     6  91.0
4     8  72.0
5    10  64.0
6     6  52.0
7     6  38.0
8    10  24.0
9     5  17.0
10    8  19.0
11    2  12.0
12    2   5.0
13    1   7.0
14    3   6.0
15    1   2.0

This code treats your etch case of trailing zeros different from your desired output, it simple cuts it off. You'd have to add some extra code to catch that one with a different logic.

Solution 2:

P.Tillmann. I appreciate your assistance with this. For the more advanced readers I would assume you to find this code appalling, as I do. I would be more than happy to take any recommendation which makes this more streamlined.

d={'AAA':range(0,20),
  'good':[3,3,13,20,28,32,59,72,64,52,38,24,17,19,12,5,7,6,2,0],
  'bad':[0,0,1,1,1,0,6,8,10,6,6,10,5,8,2,2,1,3,1,1]}
df=pd.DataFrame(data=d)
print(df)

row_good=0
row_bad=0
row_bad_zero_count=0
row_good_zero_count=0
row_out='NO'
crappy_fix=pd.DataFrame()
for index,row in df.iterrows():
    if row['good']==0or row['bad']==0:
        row_bad += row['bad']
        row_good += row['good']
        row_bad_zero_count += 1
        row_good_zero_count += 1
        output_ind='1'
        row_out='NO'elif index+1 < len(df) and (df.loc[index+1,'good']==0or    df.loc[index+1,'bad']==0):
        row_bad=row['bad']
        row_good=row['good'] 
        output_ind='2'
        row_out='NO'elif (row_bad_zero_count > 1or row_good_zero_count > 1) and row['good']!=0and row['bad']!=0:
        row_bad += row['bad']
        row_good += row['good']
        row_bad_zero_count=0
        row_good_zero_count=0    
        row_out='YES'
        output_ind='3'else:
        row_bad=row['bad']
        row_good=row['good']
        row_bad_zero_count=0
        row_good_zero_count=0
        row_out='YES'
        output_ind='4'if ((row['good']==0or row['bad']==0) 
        and (index > 0and (df.loc[index-1,'good']!=0or df.loc[index-1,'bad']!=0))
        and row_good != 0and row_bad != 0):
        row_out='YES'if row_out=='YES':
        temp_dict={'AAA':row['AAA'],
                   'good':row_good,
                   'bad':row_bad}
        crappy_fix=crappy_fix.append([temp_dict],ignore_index=True)
        print(str(row['AAA']),'-',
              str(row['good']),'-',
              str(row['bad']),'-',
              str(row_good),'-',
              str(row_bad),'-',
              str(row_good_zero_count),'-',
              str(row_bad_zero_count),'-',
              row_out,'-',
              output_ind) 

print(crappy_fix)

Python Playground

Python Pandas Running Totals With Resets

Solution 1:

Solution 2:

Post a Comment for "Python Pandas Running Totals With Resets"