
Extract Date From String Datetime Column In Pandas

I have a column cash_date in a pandas DataFrame whose dtype is object. I am not able to use the pandas to_datetime function on it. The shape of my data frame is (47654566, 5). My data frame looks l

Solution 1:

Specify a format=... argument.

pd.to_datetime(df['cash_date'], format='%d-%b-%y %H.%M.%S.%f %p', errors='coerce')

0   2013-01-02 12:00:00.000
1   2013-02-13 12:00:00.000
2   2013-03-09 12:00:00.000
3   2013-04-03 12:00:00.000
4   2013-01-02 06:26:02.438
5   2018-11-17 08:31:47.443
Name: cash_date, dtype: datetime64[ns]

Details about acceptable formats may be found at http://strftime.org.
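One caveat about the format string: the parsed output above shows 12:00:00 for the "12 AM" rows and 06:26:02 for a "PM" row, because %p only changes the hour when it is paired with the 12-hour directive %I. That does not matter if you only keep the date, but if you also need the time of day, something like the sketch below may be closer to what you want. The two sample strings are copied from the question's data; the variable names are just for illustration, and an English locale is assumed for the month and AM/PM names.

```python
import pandas as pd

# Two sample values in the same layout as the cash_date column.
s = pd.Series([
    "02-JAN-13 12.00.00.000000000 AM",
    "02-JAN-13 06.26.02.438000000 PM",
])

# %H reads the hour field as written; the AM/PM marker is matched but not applied.
h24 = pd.to_datetime(s, format="%d-%b-%y %H.%M.%S.%f %p", errors="coerce")

# %I reads a 12-hour clock and lets %p shift the hour.
h12 = pd.to_datetime(s, format="%d-%b-%y %I.%M.%S.%f %p", errors="coerce")

print(h24.dt.hour.tolist())
print(h12.dt.hour.tolist())  # [0, 18]

# Either way, the calendar date is the same.
print(h24.dt.floor("D").equals(h12.dt.floor("D")))
```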

From here, you can floor the datetimes using dt.floor:

df['date'] = pd.to_datetime(
    df['cash_date'], format='%d-%b-%y %H.%M.%S.%f %p', errors='coerce'
).dt.floor('D')

df
                         cash_date  amount  id       date
0  02-JAN-13 12.00.00.000000000 AM     100   1 2013-01-02
1  13-FEB-13 12.00.00.000000000 AM     200   2 2013-02-13
2  09-MAR-13 12.00.00.000000000 AM     300   3 2013-03-09
3  03-APR-13 12.00.00.000000000 AM     400   4 2013-04-03
4  02-JAN-13 06.26.02.438000000 PM     500   7 2013-01-02
5  17-NOV-18 08.31.47.443000000 PM     700   8 2018-11-17
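Since errors='coerce' silently turns anything that does not match the format into NaT, it is probably worth counting the failures before trusting the result on a frame this large. A minimal sketch on a small stand-in frame follows; the "not a date" row is a hypothetical malformed value added for illustration and is not from the question's data.

```python
import pandas as pd

FMT = "%d-%b-%y %H.%M.%S.%f %p"

df = pd.DataFrame({
    "cash_date": [
        "02-JAN-13 12.00.00.000000000 AM",
        "02-JAN-13 06.26.02.438000000 PM",
        "17-NOV-18 08.31.47.443000000 PM",
        "not a date",  # hypothetical malformed row, for illustration only
    ],
    "amount": [100, 500, 700, 0],
})

# Parse once, then derive the date column from the parsed values.
parsed = pd.to_datetime(df["cash_date"], format=FMT, errors="coerce")
df["date"] = parsed.dt.floor("D")  # still a datetime column, time set to midnight

# Rows that failed to parse come back as NaT.
n_bad = int(parsed.isna().sum())
print(df)
print("unparseable rows:", n_bad)
```

If n_bad is nonzero, df.loc[parsed.isna(), 'cash_date'] shows the offending strings.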

OTOH, if you are looking to extract the date component without parsing the date, there are a couple of options:

str.split

df['date'] = df['cash_date'].str.split(n=1).str[0]
df
                         cash_date  amount  id       date
0  02-JAN-13 12.00.00.000000000 AM     100   1  02-JAN-13
1  13-FEB-13 12.00.00.000000000 AM     200   2  13-FEB-13
2  09-MAR-13 12.00.00.000000000 AM     300   3  09-MAR-13
3  03-APR-13 12.00.00.000000000 AM     400   4  03-APR-13
4  02-JAN-13 06.26.02.438000000 PM     500   7  02-JAN-13
5  17-NOV-18 08.31.47.443000000 PM     700   8  17-NOV-18

Or, using a list comprehension.

df['date'] = [x.split(None, 1)[0] for x in df['cash_date']]
df
                         cash_date  amount  id       date
0  02-JAN-13 12.00.00.000000000 AM     100   1  02-JAN-13
1  13-FEB-13 12.00.00.000000000 AM     200   2  13-FEB-13
2  09-MAR-13 12.00.00.000000000 AM     300   3  09-MAR-13
3  03-APR-13 12.00.00.000000000 AM     400   4  03-APR-13
4  02-JAN-13 06.26.02.438000000 PM     500   7  02-JAN-13
5  17-NOV-18 08.31.47.443000000 PM     700   8  17-NOV-18

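I will wager the list comprehension is the faster of the two options, though I have not timed it on your data. Note that it assumes every value is a string; a missing value (NaN) would raise an AttributeError, whereas str.split passes it through.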
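If you want to check the speed on your own machine, a rough timing sketch could look like this. The 100,000-row series is synthetic (two sample strings repeated), the helper names are made up for the example, and the numbers will vary with pandas version and hardware, so treat it as a starting point rather than a benchmark.

```python
import timeit

import pandas as pd

# Synthetic stand-in for the real column: two sample strings repeated.
s = pd.Series(
    ["02-JAN-13 12.00.00.000000000 AM", "17-NOV-18 08.31.47.443000000 PM"] * 50_000
)

def with_str_split(col):
    # Vectorised string accessor: split on the first whitespace run, keep the first piece.
    return col.str.split(n=1).str[0]

def with_list_comp(col):
    # Plain Python loop over the values; assumes every value is a string.
    return [x.split(None, 1)[0] for x in col]

# Both approaches should give the same date strings.
same = with_str_split(s).tolist() == with_list_comp(s)

t_split = timeit.timeit(lambda: with_str_split(s), number=3)
t_comp = timeit.timeit(lambda: with_list_comp(s), number=3)

print("results match:", same)
print(f"str.split:          {t_split:.3f} s for 3 runs")
print(f"list comprehension: {t_comp:.3f} s for 3 runs")
```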
