Using Geopy In A Dataframe To Get Distances
I am new to Geopy. I work at a transportation company and need to get the total kilometers that a truck has traveled. I have seen some answers here but they did not work f
Solution 1:
Create a point Series:
import pandas as pd
df = pd.DataFrame(
[
(-25.145439, -54.294871),
(-24.144564, -54.240094),
(-24.142564, -54.198901),
(-24.140093, 52.119021),
],
columns=['latitude', 'longitude']
)
from geopy import Point
from geopy.distance import distance
df['point'] = df.apply(lambda row: Point(latitude=row['latitude'], longitude=row['longitude']), axis=1)
In [2]: df
Out[2]:
    latitude  longitude                                point
0 -25.145439 -54.294871  25 8m 43.5804s S, 54 17m 41.5356s W
1 -24.144564 -54.240094  24 8m 40.4304s S, 54 14m 24.3384s W
2 -24.142564 -54.198901  24 8m 33.2304s S, 54 11m 56.0436s W
3 -24.140093  52.119021    24 8m 24.3348s S, 52 7m 8.4756s E
Add a new shifted point_next Series:
df['point_next'] = df['point'].shift(1)
df.loc[df['point_next'].isna(), 'point_next'] = None
In [4]: df
Out[4]:
    latitude  longitude                                point                           point_next
0 -25.145439 -54.294871  25 8m 43.5804s S, 54 17m 41.5356s W                                 None
1 -24.144564 -54.240094  24 8m 40.4304s S, 54 14m 24.3384s W  25 8m 43.5804s S, 54 17m 41.5356s W
2 -24.142564 -54.198901  24 8m 33.2304s S, 54 11m 56.0436s W  24 8m 40.4304s S, 54 14m 24.3384s W
3 -24.140093  52.119021    24 8m 24.3348s S, 52 7m 8.4756s E  24 8m 33.2304s S, 54 11m 56.0436s W
Calculate the distances:
df['distance_km'] = df.apply(lambda row: distance(row['point'], row['point_next']).km if row['point_next'] is not None else float('nan'), axis=1)
df = df.drop('point_next', axis=1)
In [6]: df
Out[6]:
    latitude  longitude                                point   distance_km
0 -25.145439 -54.294871  25 8m 43.5804s S, 54 17m 41.5356s W           NaN
1 -24.144564 -54.240094  24 8m 40.4304s S, 54 14m 24.3384s W    111.003172
2 -24.142564 -54.198901  24 8m 33.2304s S, 54 11m 56.0436s W      4.192654
3 -24.140093  52.119021    24 8m 24.3348s S, 52 7m 8.4756s E  10449.661388
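The question asks for the total kilometers, which this answer stops short of. A minimal sketch of that last step, assuming the per-leg values in distance_km are the ones printed in the output (the name total_km is only illustrative):

```python
import pandas as pd

# Per-leg distances copied from the distance_km column in the output;
# the first row has no previous point, so its distance is NaN.
distance_km = pd.Series([float('nan'), 111.003172, 4.192654, 10449.661388])

# Series.sum() skips NaN by default (skipna=True), so the missing
# first leg does not affect the total.
total_km = distance_km.sum()
print(total_km)
```

With the real DataFrame this would simply be df['distance_km'].sum().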
Solution 2:
Be aware that calling geopy's distance() through .apply(..., axis=1) will be very slow if you are working with a large amount of data (hundreds of thousands of rows).
One workaround is the Haversine formula, which can be vectorized efficiently with pandas/numpy, though it is less precise because it treats the Earth as a sphere. Another option is geopandas, if you are OK with external packages.
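A rough sketch of the vectorized Haversine approach, using the same sample coordinates as Solution 1. The function name haversine_km and the Earth radius constant are assumptions made for illustration, not part of geopy or pandas. The results will differ slightly from geopy's distance, which works on an ellipsoid.

```python
import numpy as np
import pandas as pd

# Mean Earth radius in km (an assumed constant for the spherical model).
EARTH_RADIUS_KM = 6371.0088

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between arrays of points given in degrees."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(x, dtype=float)) for x in (lat1, lon1, lat2, lon2)
    )
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    # Clip guards against tiny floating-point overshoot above 1; NaN stays NaN.
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

df = pd.DataFrame(
    [
        (-25.145439, -54.294871),
        (-24.144564, -54.240094),
        (-24.142564, -54.198901),
        (-24.140093, 52.119021),
    ],
    columns=['latitude', 'longitude'],
)

# Previous row's coordinates; the first row gets NaN, so its distance is NaN.
prev = df[['latitude', 'longitude']].shift(1)

df['distance_km'] = haversine_km(
    prev['latitude'], prev['longitude'], df['latitude'], df['longitude']
)

# sum() skips the NaN in the first row.
total_km = df['distance_km'].sum()
print(df)
print(total_km)
```

All rows are computed in a single pass of numpy array operations, with no Python-level loop over rows, which is where the speedup over .apply comes from.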