
Can I Show The Different Methods Of Death Penalties As Well As Predict Future Years

I would like to be able to predict the rise/fall in death penalties for the dataset below. This is USA 1976 death penalty data, found at: https://www.kaggle.com/usdpic/execution-dat

Solution 1:

It sounds like you want to reshape your data so that you have one time series per "Method", which you can then use in a predictive model. It's worth pointing out that the distribution of "Method" is heavily skewed (the values are from 1999 onwards), so forecasting most of the methods will be very difficult, if not impossible:

df['Method'].value_counts()

# Lethal Injection    923
# Electrocution        17
# Gas Chamber           1
# Firing Squad          1
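To put the skew in numbers, here is a small sketch that turns the counts printed above into shares. The dictionary is typed in by hand from that output rather than read from the dataset, and the variable names are only illustrative.

```python
# Counts copied from the value_counts() output above.
counts = {
    "Lethal Injection": 923,
    "Electrocution": 17,
    "Gas Chamber": 1,
    "Firing Squad": 1,
}

total = sum(counts.values())

# Fraction of all rows that each method accounts for.
shares = {method: n / total for method, n in counts.items()}

for method, share in shares.items():
    print(f"{method:<17}{share:7.2%}")
```

Lethal injection accounts for roughly 98% of the rows, which leaves the other three methods with far too few observations to model on their own.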

Here is a solution that will help you reshape your data to get time series data for each "Method" (I've added a bit more of an explanation at the end):

import pandas as pd

# df is assumed to already hold the Kaggle CSV as a DataFrame
df['Date'] = pd.to_datetime(df['Date'])

df = df[df['Date'].dt.year >= 1999]

df = df.set_index('Date')

df2 = df.groupby('Method').resample('1M').agg('count')['Name'].to_frame()

df2 = df2.reset_index().pivot(index='Date',columns='Method',values='Name').fillna(0)

df2.plot()


We can check that the new shape of the data gives us the correct number of "Method" counts:

df2.sum()

# Method
# Electrocution        17.0
# Firing Squad          1.0
# Gas Chamber           1.0
# Lethal Injection    923.0
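The same check can be tried without the Kaggle file. The sketch below is an alternative route to the same kind of month-by-method table, using monthly periods with groupby, size and unstack instead of resample and pivot. The four rows are made up for illustration; only the column names 'Date', 'Method' and 'Name' are taken from the code above.

```python
import pandas as pd

# Hypothetical rows standing in for the Kaggle CSV.
df = pd.DataFrame({
    "Date": ["1999-01-05", "1999-01-20", "1999-03-02", "1999-03-15"],
    "Method": ["Lethal Injection", "Lethal Injection",
               "Electrocution", "Lethal Injection"],
    "Name": ["A", "B", "C", "D"],
})
df["Date"] = pd.to_datetime(df["Date"])

# Label every row with its calendar month.
month = df["Date"].dt.to_period("M").rename("Month")

# Count rows per (month, method), then spread the methods into columns.
counts = df.groupby([month, "Method"]).size().unstack(fill_value=0)

# Add months that have no rows at all, so each series has no gaps.
all_months = pd.period_range(month.min(), month.max(), freq="M", name="Month")
counts = counts.reindex(all_months, fill_value=0)

# The column totals should match the raw value_counts().
assert counts.sum().to_dict() == df["Method"].value_counts().to_dict()
print(counts)
```

Because the counts stay integers here, no fillna(0) step is needed, and the empty month (February 1999 in the toy data) shows up as a row of zeros.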

Explained

df['Date'] = pd.to_datetime(df['Date'])

# Filter out rows where the year is earlier than 1999
df = df[df['Date'].dt.year >= 1999]

# Set the index to be the datetime
df = df.set_index('Date')

# This bit gets interesting - we're grouping by each method and then resampling
# within each group so that we get a row per month, where each month now holds a
# count of the original rows that fall in that month. As every column of the
# dataframe now contains the same count, we arbitrarily take the first one,
# which is 'Name'.
# Note: you can change the resampling frequency to any time period you want,
# I've just chosen month as it is granular enough to cover the whole period.
# (pandas 2.2 and later deprecate the 'M' alias in favour of 'ME' for month end.)
 
df2 = df.groupby('Method').resample('1M').agg('count')['Name'].to_frame()

#                              Name
# Method           Date
# Electrocution    1999-06-30     1
#                  1999-07-31     1
#                  1999-08-31     1
#                  1999-09-30     0
#                  1999-10-31     0
# ...                           ...
# Lethal Injection 2016-08-31     0
#                  2016-09-30     0
#                  2016-10-31     2
#                  2016-11-30     1
#                  2016-12-31     2

df2 = df2.reset_index().pivot(index='Date',columns='Method',values='Name').fillna(0)

# Method      Electrocution  Firing Squad  Gas Chamber  Lethal Injection
# Date
# 1999-01-31            0.0           0.0          0.0              10.0
# 1999-02-28            0.0           0.0          0.0              12.0
# 1999-03-31            0.0           0.0          1.0               7.0
# 1999-04-30            0.0           0.0          0.0              10.0
# 1999-05-31            0.0           0.0          0.0               6.0
# ...                   ...           ...          ...               ...
# 2016-08-31            0.0           0.0          0.0               0.0
# 2016-09-30            0.0           0.0          0.0               0.0
# 2016-10-31            0.0           0.0          0.0               2.0
# 2016-11-30            0.0           0.0          0.0               1.0
# 2016-12-31            0.0           0.0          0.0               2.0
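The reshape is where this answer stops. As a rough illustration of how one of these columns could feed a predictive model, the sketch below fits a straight-line trend to yearly totals with numpy.polyfit. Everything in it is an assumption made for the example: the monthly series is made up (it merely stands in for df2['Lethal Injection']), predict_year is a hypothetical helper, and a linear trend is just the simplest possible choice, not a recommendation.

```python
import numpy as np
import pandas as pd

# Made-up monthly counts standing in for df2['Lethal Injection'];
# these values are illustrative and NOT taken from the Kaggle dataset.
months = pd.date_range("2010-01-01", "2015-12-01", freq="MS")
monthly = pd.Series(
    [10 - (ts.year - 2010) for ts in months],  # one fewer per month each year
    index=months,
    name="Lethal Injection",
)

# Collapse the monthly series to one total per calendar year.
yearly = monthly.groupby(monthly.index.year).sum()

# Least-squares straight line: count = slope * year + intercept.
slope, intercept = np.polyfit(yearly.index.to_numpy(), yearly.to_numpy(), deg=1)


def predict_year(year):
    """Extrapolate the fitted line, clamped at zero since counts cannot be negative."""
    return max(0.0, slope * year + intercept)


for year in (2016, 2017):
    print(year, round(predict_year(year), 1))
```

With so few observations for the rarer methods, a trend like this is only defensible for the dominant column, and even there a proper time-series model with held-out validation would be the more careful route.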
