
How To Create A New Column In A Dataframe, Which Will Be A Function Of Other Columns And Conditionals, Without Iterating Over The Rows With A For Loop?

I have a relatively large data frame (8737 rows and 16 columns of all variable types, strings, integers, booleans etc.) and I want to create a new column based on an equation and s

Solution 1:

You can do that really easily in two steps:

df.loc[1:, 'S'] = df.loc[1:, "D"] * 0.5 * df.loc[1:, "C"].abs()  # Computes the numerical expression you want
df["S"] = df["S"].cumsum()  # Add the previous to the current item of S
# Then compute your `if` condition
df.loc[df["S"] < 5, 'S'] = 5
df.loc[df["S"] > 10, 'S'] = 10

==> no for loop.
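
The two boundary assignments can also be written with `Series.clip`. Here is a minimal sketch on a made-up frame; the values in `D` and `C` and the starting value of 0 are assumptions for illustration, only the column names come from the answer above.

```python
import pandas as pd

# Hypothetical sample data (not from the question)
df = pd.DataFrame({"D": [1.0, 2.0, -3.0, 4.0], "C": [2.0, -4.0, 6.0, -8.0]})
df["S"] = 0.0  # assumed starting value in the first row

# Per-row increment for every row after the first
df.loc[1:, "S"] = df.loc[1:, "D"] * 0.5 * df.loc[1:, "C"].abs()

# Running total, then limit the finished total to the range [5, 10]
df["S"] = df["S"].cumsum().clip(lower=5, upper=10)
print(df)
```

Note that this limits the finished running total. If the limit has to be applied at every step before the next row is added, the result can differ.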

Solution 2:

This is untested, and I'm not sure what you want for values between 5 and 10:

import numpy as np

df_test.loc[df_test.index[0], 'S'] = 5  # set the first value without chained assignment
df_test['S'] = df_test['S'].shift() + df_test['D'] * abs(df_test['C'])*0.5
df_test['S'] = np.where(df_test['S'] < 5, 5, df_test['S'])
df_test['S'] = np.where(df_test['S'] > 10, 10, df_test['S'])
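
One thing worth checking before relying on `shift()`: it looks exactly one row back at the values already stored in the column, so it does not accumulate the way `cumsum()` does, and it leaves a NaN in the first row. A small sketch with made-up numbers (both series are assumptions for illustration):

```python
import pandas as pd

s = pd.Series([5.0, 0.0, 0.0, 0.0])    # hypothetical starting column S
inc = pd.Series([0.0, 1.0, 1.0, 1.0])  # hypothetical per-row increments

# Previous stored value plus increment: no accumulation, first row is NaN
one_step = s.shift() + inc

# Running total starting from the first value of S
running = s.iloc[0] + inc.cumsum()

print(one_step.tolist())
print(running.tolist())
```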

Solution 3:

If your transformation did not have an if-condition, it could be handled with scipy.signal.lfilter.

First we calculate the exogenous part:

exo = 0.5 * df['D'].multiply(df['C'].abs())

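After that we use lfilter:

import numpy as np
from scipy.signal import lfilter

start = df['S'].iloc[0]
s = lfilter(np.array([1]), np.array([1, -1]), exo.shift(-1), zi=np.array([start]))[0]
df.iloc[1:, df.columns.get_loc('S')] = s[:-1]  # avoids chained assignment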
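
To see what the filter call computes: with numerator [1] and denominator [1, -1], lfilter evaluates y[n] = x[n] + y[n-1], and zi supplies the value before the first sample, so the output is a running sum offset by the start value. A small check on made-up numbers (the array `x` and the start value are assumptions for illustration):

```python
import numpy as np
from scipy.signal import lfilter

x = np.array([1.0, 2.0, 3.0])  # hypothetical increments
start = 5.0                    # hypothetical initial value of S

# With zi given, lfilter returns a tuple (output, final filter state)
y, _ = lfilter([1.0], [1.0, -1.0], x, zi=np.array([start]))

# The filter output matches an offset cumulative sum
print(np.allclose(y, start + np.cumsum(x)))
```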

On my computer this is about 70 times faster than the loop solution.

Sadly, it won't help you here, because lfilter cannot express the if-condition.
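
If the limit really has to be applied at every step, one option that at least avoids writing an explicit for statement is `itertools.accumulate` with a clamping step function. This is not part of the answer above. It is a sketch under the assumption that each new value is the previous clamped value plus the increment, and it still iterates in Python, so it is not vectorized. The sample data, the start value and the helper `step` are hypothetical; the `initial` argument needs Python 3.8 or later.

```python
from itertools import accumulate

import pandas as pd

# Hypothetical sample data (not from the question)
df = pd.DataFrame({"D": [1.0, 2.0, -3.0, 4.0], "C": [2.0, -4.0, 6.0, -8.0]})
start = 5.0  # assumed value of S in the first row

# Increment for every row after the first
inc = (0.5 * df["D"] * df["C"].abs()).iloc[1:]

def step(prev, x):
    """Add the increment, then keep the result within [5, 10]."""
    return min(max(prev + x, 5.0), 10.0)

# accumulate yields the start value followed by one clamped value per increment
df["S"] = list(accumulate(inc, step, initial=start))
print(df)
```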

Solution 4:

You can add and subtract columns directly in pandas, e.g.:

df['S'] = df.A + df.B - df.C + df.C.abs()**2  # take abs() of a single column, not the whole frame

If you want to change values based on a condition, use .loc. Usage:

>>> df.loc[condition(row), (column value to be changed)] = value
>>> df.loc[df.S < 5, 'S'] = 5
>>> df.loc[df.S > 10, 'S'] = 10

Then use the cumulative sum function .cumsum() on column "S" to add each previous value to the current one.

df['S'] = df.S.cumsum()
