Merging Data On Date Time Column (POSIXct Format)
I want to merge two data frames on the Date_Time column. The date-time columns contain both shared and differing values, but I am unable to merge them such that all unique date-time values are retained.
Solution 1:
# R: all = TRUE keeps the rows from both data frames (full outer join)
merge(df_so2, df_met, by = "Date_Time", all = TRUE)
Date_Time X.x POC Datum Date_GMT Sample.Measurement MDL X.y air_temp_set_1 dew_point_temperature_set_1
1 2015-01-01 1:00 NA NA <NA> <NA> NA NA 1 35.6 35.6
2 2015-01-01 2:00 NA NA <NA> <NA> NA NA 2 35.6 35.6
3 2015-01-01 3:00 1 2 WGS84 01/01/2015 09:00 2.3 0.2 3 35.6 35.6
4 2015-01-01 4:00 2 2 WGS84 01/01/2015 10:00 2.5 0.2 4 33.8 33.8
5 2015-01-01 5:00 3 2 WGS84 01/01/2015 11:00 2.1 0.2 5 33.2 33.2
6 2015-01-01 6:00 4 2 WGS84 01/01/2015 12:00 2.3 0.2 6 33.8 33.8
7 2015-01-01 7:00 5 2 WGS84 01/01/2015 13:00 1.1 0.2 7 33.8 33.8
Solution 2:
An outer merge should get them all:
- pandas.DataFrame.merge with how='outer': use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
- Based upon your comment, you want all the dates, not just those shown in Expected Output.
- Add the parameter sort=True if you want them sorted by date.
df_exp = pd.merge(df_so2, df_met, on='Date_Time', how='outer')
X_x POC Datum Date_Time Date_GMT Sample.Measurement MDL X_y air_temp_set_1 dew_point_temperature_set_1
1.0 2.0 WGS84 2015-01-01 3:00 01/01/2015 09:00 2.3 0.2 3 35.6 35.6
2.0 2.0 WGS84 2015-01-01 4:00 01/01/2015 10:00 2.5 0.2 4 33.8 33.8
3.0 2.0 WGS84 2015-01-01 5:00 01/01/2015 11:00 2.1 0.2 5 33.2 33.2
4.0 2.0 WGS84 2015-01-01 6:00 01/01/2015 12:00 2.3 0.2 6 33.8 33.8
5.0 2.0 WGS84 2015-01-01 7:00 01/01/2015 13:00 1.1 0.2 7 33.8 33.8
NaN NaN NaN 2015-01-01 1:00 NaN NaN NaN 1 35.6 35.6
NaN NaN NaN 2015-01-01 2:00 NaN NaN NaN 2 35.6 35.6
Without columns from df_met:
df_exp.drop(columns=['X_y', 'air_temp_set_1', 'dew_point_temperature_set_1'], inplace=True)
df_exp.rename(columns={'X_x': 'X'}, inplace=True)
X POC Datum Date_Time Date_GMT Sample.Measurement MDL
1.0 2.0 WGS84 2015-01-01 3:00 01/01/2015 09:00 2.3 0.2
2.0 2.0 WGS84 2015-01-01 4:00 01/01/2015 10:00 2.5 0.2
3.0 2.0 WGS84 2015-01-01 5:00 01/01/2015 11:00 2.1 0.2
4.0 2.0 WGS84 2015-01-01 6:00 01/01/2015 12:00 2.3 0.2
5.0 2.0 WGS84 2015-01-01 7:00 01/01/2015 13:00 1.1 0.2
NaN NaN NaN 2015-01-01 1:00 NaN NaN NaN
NaN NaN NaN 2015-01-01 2:00 NaN NaN NaN
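As a minimal sketch of the outer merge described in this solution, the snippet below builds two small stand-in frames and merges them with sort=True. The miniature data and the indicator=True argument are assumptions added for illustration, not part of the original answer; indicator=True adds a _merge column showing which frame each row came from.

```python
import pandas as pd

# Hypothetical miniature versions of df_so2 and df_met (illustrative values only).
df_so2 = pd.DataFrame({
    "Date_Time": pd.to_datetime(["2015-01-01 03:00", "2015-01-01 04:00"]),
    "Sample.Measurement": [2.3, 2.5],
})
df_met = pd.DataFrame({
    "Date_Time": pd.to_datetime([
        "2015-01-01 01:00", "2015-01-01 02:00",
        "2015-01-01 03:00", "2015-01-01 04:00",
    ]),
    "air_temp_set_1": [35.6, 35.6, 35.6, 33.8],
})

# how="outer" keeps every Date_Time found in either frame;
# sort=True orders the result by the merge key;
# indicator=True records the origin of each row in a "_merge" column.
merged = pd.merge(
    df_so2, df_met, on="Date_Time", how="outer", sort=True, indicator=True
)
print(merged)
```

Rows that exist only in df_met carry NaN in the df_so2 columns, which matches the pattern in the output shown above.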
Solution 3:
df_exp = pd.merge(df_so2, df_met, on='Date_Time', how='outer')
I got:
POC Datum Date_Time Date_GMT Sample.Measurement MDL air_temp_set_1 dew_point_temperature_set_1 relative_humidity_set_1 wind_speed_set_1 cloud_layer_1_code_set_1 wind_direction_set_1 pressure_set_1d weather_cond_code_set_1 visibility_set_1 wind_cardinal_direction_set_1d weather_condition_set_1d
2 WGS84 2015-01-01 3:00 01/01/2015 09:00 2.3 0.2 35.6 35.6 100.0 0.0 14.0 0.0 29.943333 9.0 0.25 N Fog
1 WGS84 2015-01-01 3:00 01/01/2015 09:00 0.6 2.0 35.6 35.6 100.0 0.0 14.0 0.0 29.943333 9.0 0.25 N Fog
1 WGS84 2015-01-01 3:00 01/01/2015 12:00 7.4 0.2 35.6 35.6 100.0 0.0 14.0 0.0 29.943333 9.0 0.25 N Fog
1 WGS84 2015-01-01 3:00 01/01/2015 10:00 1.0 0.2 35.6 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Notes:
- Check df_met.info() and df_so2.info() and verify Date_Time is non-null datetime64[ns]
- If not, try the following:
df_so2.Date_Time = pd.to_datetime(df_so2.Date_Time)
df_met.Date_Time = pd.to_datetime(df_met.Date_Time)
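To illustrate why the dtype check matters, here is a small sketch under the assumption that Date_Time was read in as text in both frames (the original post does not confirm this). The frame names, column names, and explicit format string are hypothetical. As strings, "3:00" and "03:00" are different keys; once both are converted with pd.to_datetime they compare equal and the rows line up.

```python
import pandas as pd

# Hypothetical frames whose Date_Time columns are plain strings
# written with different zero-padding of the hour.
left = pd.DataFrame({"Date_Time": ["2015-01-01 3:00"], "so2": [2.3]})
right = pd.DataFrame({"Date_Time": ["2015-01-01 03:00"], "air_temp": [35.6]})

# Merged as text, the keys do not match, so the outer merge yields two rows.
as_text = pd.merge(left, right, on="Date_Time", how="outer")

# Convert both key columns to datetime (format assumed for this example).
fmt = "%Y-%m-%d %H:%M"
left["Date_Time"] = pd.to_datetime(left["Date_Time"], format=fmt)
right["Date_Time"] = pd.to_datetime(right["Date_Time"], format=fmt)

# Merged as datetimes, the two timestamps are identical and collapse to one row.
as_dt = pd.merge(left, right, on="Date_Time", how="outer")

print(len(as_text), len(as_dt))
```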