Assignment Of Pandas Dataframe With Float32 And Float64 Slow

Assigning to a Pandas DataFrame with mixed float32 and float64 datatypes is rather slow for some combinations, at least the way I do it. The code below sets up a DataFrame, makes a Nu

Solution 1:

Single-column assignment does not cast the assigned values: each column ends up with the dtype of the array assigned to it, as the printed dtypes in the result show. Iterating with a for-loop over the columns seems reasonably fast for non-type-casting assignments, for both float32 and float64. For assignments involving type casting, the performance is usually twice as bad as the worst performance for multiple-column assignment.

import pandas as pd
import numpy as np
from scipy.signal import lfilter

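N = 1000  # number of rows
M = 1000  # number of numeric columns


def f(dtype1, dtype2):
    # Column labels of interest: '0', '1', ..., str(M - 1)
    coi = [str(m) for m in range(M)]
    df = pd.DataFrame([[m for m in range(M)] + ['Hello', 'World'] for n in range(N)],
                      columns=coi + ['A', 'B'], dtype=dtype1)
    # Note: .ix was removed in pandas 1.0; on current versions use .loc instead.
    Y = lfilter([1], [0.5, 0.5], df.ix[:, coi])
    Y = Y.astype(dtype2)
    new = df.copy()
    print(new.iloc[0, 0].dtype)  # dtype of the frame before assignment
    print(Y.dtype)               # dtype of the array being assigned
    for n, column in enumerate(coi):  # For-loop over columns new!
        new.ix[:, column] = Y[:, n]
    print(new.iloc[0, 0].dtype)  # dtype of the frame after assignment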

from time import time

dtypes = [np.float32, np.float64]
for dtype1 in dtypes:
    for dtype2 in dtypes:
        print('-' * 10)
        start_time = time()
        f(dtype1, dtype2)
        print(time() - start_time)

The result is:

----------
float32
float32
float32
0.809890985489
----------
float32
float64
float64
21.4767119884
----------
float64
float32
float32
20.5611870289
----------
float64
float64
float64
0.765362977982
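
The timings above were measured with .ix, which no longer exists in current pandas. As a rough sketch only (not the code that produced those numbers), the same per-column idea can be written with plain column assignment. The helper name assign_columns and the tiny 4x3 frame are made up for illustration. The sketch assumes that `new[col] = array` replaces the whole column, so the column takes the dtype of the assigned array. It also shows the option of casting Y once, up front, to the frame's dtype so that the loop itself involves no type casting. I have not timed this variant, so whether it avoids the slow cases on your pandas version is something to measure.

```python
import numpy as np
import pandas as pd


def assign_columns(df, cols, values):
    """Return a copy of df where each column in cols is replaced by a column of values."""
    new = df.copy()
    for n, col in enumerate(cols):
        # Plain column assignment replaces the column, so it takes the dtype of values
        new[col] = values[:, n]
    return new


# Small hypothetical frame: three float32 columns plus one string column
cols = ["0", "1", "2"]
df = pd.DataFrame(np.arange(12, dtype=np.float32).reshape(4, 3), columns=cols)
df["A"] = "Hello"

Y = df[cols].to_numpy() * 0.5   # stays float32
Y64 = Y.astype(np.float64)      # float64 copy, as in the mixed-dtype cases

same = assign_columns(df, cols, Y)      # float32 into float32, no cast
mixed = assign_columns(df, cols, Y64)   # columns become float64
# Cast once, up front, to the dtype already in the frame
matched = assign_columns(df, cols, Y64.astype(df[cols[0]].dtype))

print(same["0"].dtype, mixed["0"].dtype, matched["0"].dtype)
```

Casting Y once before the loop keeps the dtype decision in one place. If you would rather keep the dtype of Y, leave the astype call out and let the columns change type.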
