
Cumulative Sum At Intervals

Consider this dataframe:

dfgg
Out[305]:
                     Parts_needed  output
Year Month PartId
2018 1     L27849            72      72
     2     L27849

Solution 1:

Basically, you want a running total over a fixed-width window, computed as if the start of the series were zero-padded. You can do that with convolution. Here is a simple NumPy example which you should be able to adapt to your pandas use case:

import numpy as np
a = np.array([10,20,3,4,5,6,7])
width = 4
kernel = np.ones(width)
np.convolve(a,kernel)

returning

array([10., 30., 33., 37., 32., 18., 22., 18., 13.,  7.])

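As you can see, the output is a cumulative sum up to 37 (the entry aligned with a[3]); after that, each value is the sum of a rolling 4-element window. The last three values (18, 13, 7) come from partial windows that run past the end of the input.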
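To make the zero-padding idea concrete, here is a minimal sketch that compares the convolution against explicit window sums over a padded copy of the same array. The names padded, windowed, and convolved are only illustrative.

```python
import numpy as np

a = np.array([10, 20, 3, 4, 5, 6, 7])
width = 4

# Put width-1 zeros in front so every position has a full window.
padded = np.concatenate([np.zeros(width - 1), a])

# Sum each length-`width` window of the padded array, one per input element.
windowed = np.array([padded[i:i + width].sum() for i in range(len(a))])

# The first len(a) values of the full convolution should match those sums.
convolved = np.convolve(a, np.ones(width))[:len(a)]

print(windowed)
print(convolved)
```

Both arrays should agree, which is why a kernel of ones behaves like a cumulative sum for the first width entries and like a rolling sum afterwards.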

This will work for you, assuming you always have 24 rows for each 2-year period, i.e. one row per month with no gaps.
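If some months might be missing, one way to satisfy that assumption is to reindex onto a full year/month grid before convolving. The following is only a sketch with made-up data: the column names and values are hypothetical, and it assumes a missing month means zero parts.

```python
import pandas as pd

# Hypothetical data: month 3 is missing for 2018.
df = pd.DataFrame({
    'year': [2018, 2018, 2018],
    'month': [1, 2, 4],
    'parts': [72, 10, 5],
})

# Build the complete (year, month) grid for the years of interest.
full = pd.MultiIndex.from_product(
    [[2018], range(1, 13)], names=['year', 'month']
)

# Reindex onto the grid; months without a row get parts = 0 (an assumption).
filled = (
    df.set_index(['year', 'month'])
      .reindex(full, fill_value=0)
      .reset_index()
)

print(filled)
```

After this step every year has exactly 12 rows, so a fixed window width corresponds to a fixed number of months.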

Here is a pandas example using only 2 months per year (so width is 4 instead of 24):

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'year':[18,18,19,19,20,20,21,21],'month':[1,2,1,2,1,2,1,2],'parts':[230,5,2,12,66,32,1,2]})
>>> df
   month  parts  year
0      1    230    18
1      2      5    18
2      1      2    19
3      2     12    19
4      1     66    20
5      2     32    20
6      1      1    21
7      2      2    21
>>> width = 4
>>> kernel = np.ones(width)
>>> # Drop the last width-1 elements, as you don't want the window to roll past the end
>>> np.convolve(df['parts'],kernel)[:-width+1]
array([230., 235., 237., 249.,  85., 112., 111., 101.])

Now you just assign that last array to a new column of your DataFrame.
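As a sketch of that final step, the snippet below stores the result in a column named output (the name is borrowed from the question's frame; any name works). It slices with [:len(df)] rather than [:-width+1], which gives the same result here and also behaves when width is 1. For comparison it also computes the same values with pandas' own rolling sum using min_periods=1; the column name output_rolling is just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'year': [18, 18, 19, 19, 20, 20, 21, 21],
    'month': [1, 2, 1, 2, 1, 2, 1, 2],
    'parts': [230, 5, 2, 12, 66, 32, 1, 2],
})
width = 4

# Keep only the first len(df) values of the full convolution.
df['output'] = np.convolve(df['parts'], np.ones(width))[:len(df)]

# pandas-native alternative: a rolling sum that accepts partial windows
# at the start, so the first rows act like a cumulative sum.
df['output_rolling'] = df['parts'].rolling(width, min_periods=1).sum()

print(df)
```

The two columns should be identical, so the rolling version may be the simpler choice if you are already working in pandas.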

