
Add Extra Column As The Cumulative Time Difference

How to add an extra column that is the cumulative value of the time differences for each course? For example, the initial table is: id_A course weight ts_

Solution 1:

You can chain the diff method with cumsum:

# convert ts_A to datetime type
df.ts_A = pd.to_datetime(df.ts_A)

# convert ts_A to seconds (assumes nanosecond-resolution datetimes), group by id_A,
# then use transform to calculate the cumulative difference
df['cum_delta_sec'] = df.ts_A.astype('int64').div(10**9).groupby(df.id_A).transform(lambda x: x.diff().fillna(0).cumsum())
df
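To see why chaining diff with cumsum gives the elapsed time since the start of a group, here is a minimal sketch on a single group. The `secs` values are made up for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical timestamps, already expressed in seconds, for one group
secs = pd.Series([100.0, 130.0, 160.0, 190.0])

# Consecutive gaps; the first row has no predecessor, so its NaN becomes 0
steps = secs.diff().fillna(0)

# Running total of the gaps
running = steps.cumsum()

# The running total telescopes to "current value minus first value"
direct = secs - secs.iloc[0]

print(running.tolist())
print(direct.tolist())
```

Both lists come out the same, which is why the diff/cumsum chain and a plain subtraction of the group's first timestamp should agree.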


Solution 2:

Use groupby, transform, and .iloc:

df['ts_A'] = pd.to_datetime(df.ts_A)
df['cum_delta_sec'] = (df.groupby('id_A')['ts_A']
                         .transform(lambda x: (x - x.iloc[0]).dt.total_seconds()))

Output:

  id_A      course  weight                 ts_A       value  cum_delta_sec
0  id1      cotton     3.5  2017-04-27 01:35:30  150.000000              0
1  id1      cotton     3.5  2017-04-27 01:36:00  416.666667             30
2  id1      cotton     3.5  2017-04-27 01:36:30  700.000000             60
3  id1      cotton     3.5  2017-04-27 01:37:00  950.000000             90
4  id2  cottonblue     5.0  2017-04-27 02:35:30  150.000000              0
5  id2  cottonblue     5.0  2017-04-27 02:36:00  450.000000             30
6  id2  cottonblue     5.0  2017-04-27 02:36:30  520.666667             60
7  id2  cottonblue     5.0  2017-04-27 02:37:00  610.000000             90

Within each group, subtract the first timestamp from the current one, then use the .dt accessor to convert the resulting timedelta to seconds.
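The same subtraction can also be written without a lambda. The sketch below is a variation, not the answer's own code: it broadcasts each group's first timestamp with `transform("first")` and subtracts it column-wise. The frame is rebuilt from the id_A and ts_A values shown in the output table; the other columns are left out for brevity.

```python
import pandas as pd

# id_A and ts_A values copied from the output table above
df = pd.DataFrame({
    "id_A": ["id1", "id1", "id1", "id1", "id2", "id2", "id2", "id2"],
    "ts_A": [
        "2017-04-27 01:35:30", "2017-04-27 01:36:00",
        "2017-04-27 01:36:30", "2017-04-27 01:37:00",
        "2017-04-27 02:35:30", "2017-04-27 02:36:00",
        "2017-04-27 02:36:30", "2017-04-27 02:37:00",
    ],
})
df["ts_A"] = pd.to_datetime(df["ts_A"])

# First timestamp of each group, repeated on every row of that group
first_ts = df.groupby("id_A")["ts_A"].transform("first")

# Elapsed seconds since the group's first timestamp
df["cum_delta_sec"] = (df["ts_A"] - first_ts).dt.total_seconds()

print(df)
```

Because the subtraction happens on whole columns, this form avoids calling a Python function once per group, which may matter on frames with many ids.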

Solution 3:

import csv
import datetime as dt

FMT = "%Y-%m-%d %H:%M:%S"

with open('path/to/input', newline='') as fin, open('path/to/output', 'w', newline='') as fout:
    infile = csv.DictReader(fin, delimiter='\t')
    outfile = csv.DictWriter(fout, delimiter='\t', fieldnames=infile.fieldnames + ['cum_delta_sec'])
    outfile.writeheader()

    cdt = 0
    last = None
    last_id = None
    for row in infile:
        ts = dt.datetime.strptime(row['ts_A'], FMT)
        # restart the running total on the first row and whenever id_A changes
        # (assumes rows with the same id_A are contiguous and sorted by ts_A)
        if last is None or row['id_A'] != last_id:
            cdt = 0
        else:
            # current minus previous, so the total grows as time moves forward
            cdt += (ts - last).total_seconds()
        last, last_id = ts, row['id_A']
        row['cum_delta_sec'] = cdt
        outfile.writerow(row)
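The csv-based approach can be tried without touching disk by wrapping it in a function that accepts file-like objects. The sketch below is a variation under stated assumptions: `add_cum_delta` is a hypothetical helper name, the sample rows are a trimmed subset of the question's data, and the first timestamp is stored per id_A in a dict so rows of one id need not be contiguous.

```python
import csv
import datetime as dt
import io

FMT = "%Y-%m-%d %H:%M:%S"


def add_cum_delta(fin, fout):
    """Hypothetical helper: copy tab-separated rows, appending cum_delta_sec."""
    reader = csv.DictReader(fin, delimiter="\t")
    writer = csv.DictWriter(
        fout,
        delimiter="\t",
        fieldnames=reader.fieldnames + ["cum_delta_sec"],
        lineterminator="\n",
    )
    writer.writeheader()

    first_seen = {}  # id_A -> first timestamp seen for that id
    for row in reader:
        ts = dt.datetime.strptime(row["ts_A"], FMT)
        # setdefault stores ts on the first visit and returns the stored value after
        start = first_seen.setdefault(row["id_A"], ts)
        row["cum_delta_sec"] = (ts - start).total_seconds()
        writer.writerow(row)


# Trimmed sample with only the columns the helper needs
sample = (
    "id_A\tts_A\n"
    "id1\t2017-04-27 01:35:30\n"
    "id1\t2017-04-27 01:36:00\n"
    "id2\t2017-04-27 02:35:30\n"
    "id2\t2017-04-27 02:36:30\n"
)

out = io.StringIO()
add_cum_delta(io.StringIO(sample), out)
result = out.getvalue()
print(result)
```

Swapping the `io.StringIO` objects for real file handles opened with `newline=''` should give the same behavior on actual files.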
