
Date Time Difference And Dataframe Filtering

I have a pandas DataFrame df with the following structure. Start Time and End Time are string values.

            Start Time             End Time
0  2007-07-24 22:00:00  2007-07-25 07:16:53
1

Solution 1:

Question 1: Convert the columns with pd.to_datetime, then subtract them. Dividing the total seconds by 3600 gives the difference in hours.

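import pandas as pd

# Convert every column from string to datetime
for c in df.columns:
    df[c] = pd.to_datetime(df[c])

# Difference between End Time and Start Time, in hours
(df['End Time'] - df['Start Time']).dt.total_seconds() / 3600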

0     9.281389
1     1.590000
2     0.735278
3     1.693889
4    14.733333
dtype: float64
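For a self-contained run, the sketch below rebuilds the five sample rows that appear in this answer's outputs and applies the same conversion. How the original df was constructed is an assumption, since the question's data is only partly visible.

```python
import pandas as pd

# Sample rows taken from the outputs shown in this answer; the way the
# original df was built is an assumption.
df = pd.DataFrame({
    "Start Time": [
        "2007-07-24 22:00:00",
        "2007-07-25 07:16:55",
        "2007-07-25 09:45:53",
        "2007-07-25 12:32:00",
        "2007-07-25 22:59:00",
    ],
    "End Time": [
        "2007-07-25 07:16:53",
        "2007-07-25 08:52:19",
        "2007-07-25 10:30:00",
        "2007-07-25 14:13:38",
        "2007-07-26 13:43:00",
    ],
})

# Convert only the two time columns from string to datetime
for c in ["Start Time", "End Time"]:
    df[c] = pd.to_datetime(df[c])

# Subtracting datetime columns yields timedeltas; convert them to hours
hours = (df["End Time"] - df["Start Time"]).dt.total_seconds() / 3600
print(hours.round(6))
```

Naming the two columns explicitly, rather than looping over df.columns, keeps the conversion from touching any other columns the frame might hold.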

Question 2: Just use a mask and filter:

v = (df['End Time'] - df['Start Time']).dt.total_seconds() / 3600
df[v < 1.5]

           Start Time            End Time
2 2007-07-25 09:45:53 2007-07-25 10:30:00

If I misunderstood, and you actually want to retain such rows, reverse the condition:

df[v >= 1.5]

           Start Time            End Time
0 2007-07-24 22:00:00 2007-07-25 07:16:53
1 2007-07-25 07:16:55 2007-07-25 08:52:19
3 2007-07-25 12:32:00 2007-07-25 14:13:38
4 2007-07-25 22:59:00 2007-07-26 13:43:00
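The two filters are complements of each other, so a single boolean mask can serve both. A minimal sketch, reusing the sample rows shown above; the names mask, short and long_ are illustrative and not from the original answer.

```python
import pandas as pd

# Sample rows from this answer's outputs (frame construction is assumed)
df = pd.DataFrame({
    "Start Time": ["2007-07-24 22:00:00", "2007-07-25 07:16:55",
                   "2007-07-25 09:45:53", "2007-07-25 12:32:00",
                   "2007-07-25 22:59:00"],
    "End Time": ["2007-07-25 07:16:53", "2007-07-25 08:52:19",
                 "2007-07-25 10:30:00", "2007-07-25 14:13:38",
                 "2007-07-26 13:43:00"],
})
df = df.apply(pd.to_datetime)  # both columns hold timestamps here

# Duration of each row in hours
v = (df["End Time"] - df["Start Time"]).dt.total_seconds() / 3600

mask = v < 1.5      # True where the row lasts less than 1.5 hours
short = df[mask]    # rows matching the condition
long_ = df[~mask]   # the complement of the mask

print(short)
print(long_)
```

One caveat: if either column contains missing timestamps, v is NaN for those rows, so ~mask would keep them while v >= 1.5 would drop them.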

Question 3: Again, use a mask and filter:

df[(1/3 <= v) & (v <= 2/3)]
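Since 1/3 and 2/3 of an hour are 20 and 40 minutes, the same inclusive range can also be written against the timedelta directly, without converting to fractional hours. This is a sketch under that reading: the helper name rows_with_duration_between is hypothetical, and the sample rows are the ones shown earlier in this answer.

```python
import pandas as pd

def rows_with_duration_between(df, low, high):
    """Hypothetical helper: keep rows whose End Time - Start Time
    lies in the inclusive range [low, high]."""
    duration = df["End Time"] - df["Start Time"]
    # Series.between is inclusive on both ends by default
    return df[duration.between(pd.Timedelta(low), pd.Timedelta(high))]

# Sample rows from this answer's outputs (frame construction is assumed)
df = pd.DataFrame({
    "Start Time": ["2007-07-24 22:00:00", "2007-07-25 07:16:55",
                   "2007-07-25 09:45:53", "2007-07-25 12:32:00",
                   "2007-07-25 22:59:00"],
    "End Time": ["2007-07-25 07:16:53", "2007-07-25 08:52:19",
                 "2007-07-25 10:30:00", "2007-07-25 14:13:38",
                 "2007-07-26 13:43:00"],
})
df = df.apply(pd.to_datetime)

# Same bounds as 1/3 <= v <= 2/3 hours
result = rows_with_duration_between(df, "20min", "40min")

# A wider, purely illustrative range
wider = rows_with_duration_between(df, "40min", "2h")

print(result)
print(wider)
```

None of the five sample rows falls between 20 and 40 minutes, so result is empty for this data. Writing the bounds as Timedelta values also sidesteps integer division under Python 2, where 1/3 and 2/3 both evaluate to 0.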
