
Merging Data On Date Time Column (POSIXct Format)

I want to merge two data frames on a Date_Time column (date-time dtype). The date-time columns contain both matching and differing values, but I am unable to merge them such that all unique date-time ...

Solution 1:

# all = TRUE requests a full outer join, keeping unmatched rows from both data frames
merge(df_so2, df_met, by = "Date_Time", all = TRUE)

        Date_Time X.x POC Datum         Date_GMT Sample.Measurement MDL X.y air_temp_set_1 dew_point_temperature_set_1
1 2015-01-01 1:00  NA  NA  <NA>             <NA>                 NA  NA   1           35.6                        35.6
2 2015-01-01 2:00  NA  NA  <NA>             <NA>                 NA  NA   2           35.6                        35.6
3 2015-01-01 3:00   1   2 WGS84 01/01/2015 09:00                2.3 0.2   3           35.6                        35.6
4 2015-01-01 4:00   2   2 WGS84 01/01/2015 10:00                2.5 0.2   4           33.8                        33.8
5 2015-01-01 5:00   3   2 WGS84 01/01/2015 11:00                2.1 0.2   5           33.2                        33.2
6 2015-01-01 6:00   4   2 WGS84 01/01/2015 12:00                2.3 0.2   6           33.8                        33.8
7 2015-01-01 7:00   5   2 WGS84 01/01/2015 13:00                1.1 0.2   7           33.8                        33.8

Solution 2:

An outer merge should keep all of them:

  • pandas.DataFrame.merge
  • how='outer': use the union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
  • Based on your comment, you want all the dates, not just those shown in the Expected Output.
  • Add the parameter sort=True if you want the result sorted by date.

df_exp = pd.merge(df_so2, df_met, on='Date_Time', how='outer')

 X_x  POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL  X_y  air_temp_set_1  dew_point_temperature_set_1
 1.0  2.0  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2    3            35.6                         35.6
 2.0  2.0  WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2    4            33.8                         33.8
 3.0  2.0  WGS84  2015-01-01 5:00  01/01/2015 11:00                 2.1  0.2    5            33.2                         33.2
 4.0  2.0  WGS84  2015-01-01 6:00  01/01/2015 12:00                 2.3  0.2    6            33.8                         33.8
 5.0  2.0  WGS84  2015-01-01 7:00  01/01/2015 13:00                 1.1  0.2    7            33.8                         33.8
 NaN  NaN    NaN  2015-01-01 1:00               NaN                 NaN  NaN    1            35.6                         35.6
 NaN  NaN    NaN  2015-01-01 2:00               NaN                 NaN  NaN    2            35.6                         35.6
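
The outer merge can be sketched on a reduced example. The two frames below are abbreviated stand-ins for df_so2 and df_met (only one value column each, with values taken from the tables above), and indicator=True is an optional extra, not part of the original answer, that adds a _merge column showing which frame each row came from.

```python
import pandas as pd

# Abbreviated stand-ins for the real frames (assumption: one value column each).
df_so2 = pd.DataFrame({
    "Date_Time": pd.to_datetime(["2015-01-01 03:00", "2015-01-01 04:00"]),
    "Sample.Measurement": [2.3, 2.5],
})
df_met = pd.DataFrame({
    "Date_Time": pd.to_datetime(
        ["2015-01-01 01:00", "2015-01-01 03:00", "2015-01-01 04:00"]
    ),
    "air_temp_set_1": [35.6, 35.6, 33.8],
})

# how="outer" keeps every Date_Time found in either frame;
# sort=True orders the result by the join key;
# indicator=True adds a "_merge" column (left_only / right_only / both).
df_exp = pd.merge(
    df_so2, df_met, on="Date_Time", how="outer", sort=True, indicator=True
)
print(df_exp)
```

Rows marked right_only are the timestamps that exist only in df_met, which is where the NaN values in the df_so2 columns come from.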

Without the columns from df_met:

df_exp.drop(columns=['X_y', 'air_temp_set_1', 'dew_point_temperature_set_1'], inplace=True)
df_exp.rename(columns={'X_x': 'X'}, inplace=True)

   X  POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL
 1.0  2.0  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2
 2.0  2.0  WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2
 3.0  2.0  WGS84  2015-01-01 5:00  01/01/2015 11:00                 2.1  0.2
 4.0  2.0  WGS84  2015-01-01 6:00  01/01/2015 12:00                 2.3  0.2
 5.0  2.0  WGS84  2015-01-01 7:00  01/01/2015 13:00                 1.1  0.2
 NaN  NaN    NaN  2015-01-01 1:00               NaN                 NaN  NaN
 NaN  NaN    NaN  2015-01-01 2:00               NaN                 NaN  NaN

Solution 3:

df_exp = pd.merge(df_so2, df_met, on='Date_Time', how='outer')

I got:

 POC   Datum        Date_Time           Date_GMT   Sample.Measurement   MDL   air_temp_set_1   dew_point_temperature_set_1   relative_humidity_set_1   wind_speed_set_1   cloud_layer_1_code_set_1   wind_direction_set_1   pressure_set_1d   weather_cond_code_set_1   visibility_set_1  wind_cardinal_direction_set_1d  weather_condition_set_1d
    2  WGS84   2015-01-01 3:00  01/01/2015 09:00                   2.3   0.2             35.6                          35.6                     100.0                0.0                       14.0                    0.0         29.943333                       9.0               0.25                              N                       Fog
    1  WGS84   2015-01-01 3:00  01/01/2015 09:00                   0.6   2.0             35.6                          35.6                     100.0                0.0                       14.0                    0.0         29.943333                       9.0               0.25                              N                       Fog
    1  WGS84   2015-01-01 3:00  01/01/2015 12:00                   7.4   0.2             35.6                          35.6                     100.0                0.0                       14.0                    0.0         29.943333                       9.0               0.25                              N                       Fog
    1  WGS84   2015-01-01 3:00  01/01/2015 10:00                   1.0   0.2             35.6                           NaN                       NaN                NaN                        NaN                    NaN               NaN                       NaN                NaN                             NaN                      NaN

Notes:

  • Check df_met.info() and df_so2.info() and verify that Date_Time is a non-null datetime64[ns] column in both frames.
  • If not, convert both columns before merging:
  • df_so2['Date_Time'] = pd.to_datetime(df_so2['Date_Time'])
  • df_met['Date_Time'] = pd.to_datetime(df_met['Date_Time'])
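
One reason the dtype check can matter: if Date_Time is stored as text, two spellings of the same instant are treated as different keys. The sketch below assumes a hypothetical case where one frame writes the hour without zero padding (as in the tables above) and the other pads it; the 10:00 rows and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical string timestamps: df_so2 omits the leading zero, df_met pads it.
df_so2 = pd.DataFrame({
    "Date_Time": ["2015-01-01 3:00", "2015-01-01 10:00"],
    "Sample.Measurement": [2.3, 2.5],
})
df_met = pd.DataFrame({
    "Date_Time": ["2015-01-01 03:00", "2015-01-01 10:00"],
    "air_temp_set_1": [35.6, 33.8],
})

# Merging on text: "3:00" and "03:00" do not match, so the hour appears twice.
as_text = pd.merge(df_so2, df_met, on="Date_Time", how="outer")

# Convert both key columns to datetime64 before merging.
for df in (df_so2, df_met):
    df["Date_Time"] = pd.to_datetime(df["Date_Time"])

# Merging on real timestamps: both spellings are the same instant.
as_time = pd.merge(df_so2, df_met, on="Date_Time", how="outer", sort=True)

print(len(as_text), len(as_time))
```

With text keys the merge returns three rows, with datetime keys only two, and sort=True then orders by time instead of character by character.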
