Groupby Max Value And Return Corresponding Row In Pandas Dataframe
My dataframe consists of students, dates, and test scores. I want to find the max date for each student and return the corresponding row (ultimately, I am most interested in the st
Solution 1:
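You can sort the data frame by Date (parsed with pd.to_datetime, so the order is chronological rather than alphabetical) and then use groupby(...).tail(1) to get the most recent record for each student: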
df.iloc[pd.to_datetime(df.Date, format='%m/%d/%y').argsort()].groupby('Student_id').tail(1)
#   Student_id      Date  Score
# 2       Lia1  12/13/16  0.845
# 0      Tina1   1/17/17  0.950
# 3      John2   1/25/17  0.975
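For anyone who wants to run the sort-then-tail approach end to end, here is a minimal sketch. The original data frame is not shown in full, so the sample below is an assumption: rows 0, 2 and 3 are taken from the output above, and row 1 is invented so that one student has two records.

```python
import pandas as pd

# Hypothetical sample data: rows 0, 2 and 3 mirror the output above;
# row 1 is made up so that John2 has an older record to discard.
df = pd.DataFrame({
    "Student_id": ["Tina1", "John2", "Lia1", "John2"],
    "Date": ["1/17/17", "1/18/17", "12/13/16", "1/25/17"],
    "Score": [0.950, 0.850, 0.845, 0.975],
})

# Parse the strings into datetimes; sorting the raw strings would put
# "12/13/16" after "1/25/17".
parsed = pd.to_datetime(df["Date"], format="%m/%d/%y")

# Positions that order the rows from oldest to newest.
order = parsed.to_numpy().argsort(kind="stable")

# After sorting, the last row of each group is that student's latest record.
latest = df.iloc[order].groupby("Student_id").tail(1)
print(latest)
```

Note that tail(1) keeps the rows in the sorted order of the frame, so the result is ordered by date rather than by student.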
Or avoid sorting and use idxmax, which returns the index label of the latest date for each student (this works only if the index has no duplicate labels):
df.loc[pd.to_datetime(df.Date, format='%m/%d/%y').groupby(df.Student_id).idxmax()]
#   Student_id      Date  Score
# 3      John2   1/25/17  0.975
# 2       Lia1  12/13/16  0.845
# 0      Tina1   1/17/17  0.950
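The idxmax approach, including one possible way to handle the duplicated-index caveat, can be sketched as follows. The sample data is again an assumption (row 1 is invented), and resetting the index is just one workaround, not something the original answer prescribes.

```python
import pandas as pd

# Hypothetical sample data: rows 0, 2 and 3 mirror the output above;
# row 1 is made up for illustration.
df = pd.DataFrame({
    "Student_id": ["Tina1", "John2", "Lia1", "John2"],
    "Date": ["1/17/17", "1/18/17", "12/13/16", "1/25/17"],
    "Score": [0.950, 0.850, 0.845, 0.975],
})

parsed = pd.to_datetime(df["Date"], format="%m/%d/%y")

# For each student, idxmax gives the index label of the row with the latest date.
labels = parsed.groupby(df["Student_id"]).idxmax()
latest = df.loc[labels]
print(latest)

# With duplicated index labels, .loc would return every row that shares a
# label. Resetting to a fresh unique index first sidesteps the problem.
dup = df.copy()
dup.index = [0, 0, 1, 1]  # hypothetical duplicated labels
safe = dup.reset_index(drop=True)
safe_parsed = pd.to_datetime(safe["Date"], format="%m/%d/%y")
latest_safe = safe.loc[safe_parsed.groupby(safe["Student_id"]).idxmax()]
```

Here the result is ordered by the group key (Student_id), because groupby sorts the keys by default.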