
Dataframe String Manipulation

I have a dataframe that has one column with data that looks like this: AAH. AAH. AAR.UN AAR.UN AAR.UN AAR.UN AAV. AAV. AAV. I think I need to use the apply method to trim the colu

Solution 1:

Since there's only one column, you can take advantage of vectorized string operations via .str (docs):

>>> df
        0
0    AAH.
1    AAH.
2  AAR.UN
3  AAR.UN
4  AAR.UN
5  AAR.UN
6    AAV.
7    AAV.
8    AAV.
>>> df[0] = df[0].str.rstrip('.')
>>> df
        0
0     AAH
1     AAH
2  AAR.UN
3  AAR.UN
4  AAR.UN
5  AAR.UN
6     AAV
7     AAV
8     AAV

With more than one column, you'd have to do something like df.applymap(lambda x: x.rstrip(".")) (applymap was deprecated in favor of DataFrame.map in pandas 2.1), or drop down to numpy char methods.
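
For the multi-column case, here is a minimal sketch of two ways it could look. The two-column frame and its column names (sym, alt) are made up for illustration and are not from the question. The first option reuses the vectorized .str method one column at a time instead of applymap; the second uses numpy's char routines on a plain string array.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame; the column names are assumptions for illustration.
df = pd.DataFrame({"sym": ["AAH.", "AAR.UN", "AAV."],
                   "alt": ["AAH.", "AAV.", "AAR.UN"]})

# Option 1: run the vectorized .str method on each column in turn.
stripped = df.apply(lambda col: col.str.rstrip("."))

# Option 2: convert to a numpy string array and use np.char.rstrip,
# then rebuild a frame with the original labels.
arr = df.to_numpy().astype(str)
stripped_np = pd.DataFrame(np.char.rstrip(arr, "."),
                           index=df.index, columns=df.columns)

print(stripped)
print(stripped_np)
```

Both options strip every trailing dot and leave the dot inside AAR.UN alone.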

Solution 2:

You can also use a lambda function to do this:

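>>> import pandas as pd
>>> L = [['AAH.'],
...      ['AAR.UN'],
...      ['AAR.UN'],
...      ['AAV.'],
...      ['AAV.']]
>>> df = pd.DataFrame(L)
>>> # Drop the last character only when it is a dot
>>> M = lambda x: x[0][:-1] if x[0][-1] == '.' else x[0]
>>> df = df.apply(M, axis=1)
>>> df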
0       AAH
1    AAR.UN
2    AAR.UN
3       AAV
4       AAV
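
Two details of the lambda approach are worth knowing: indexing with [-1] raises an IndexError on an empty string, and it removes at most one trailing dot, whereas rstrip('.') removes all of them. The sketch below shows the difference; the empty string and the double-dot value are added purely for illustration, and drop_one_dot is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical sample; the empty string and 'AAV..' are assumptions for illustration.
s = pd.Series(['AAH.', 'AAR.UN', '', 'AAV..'])

def drop_one_dot(value):
    # endswith() is safe on empty strings, unlike indexing with [-1].
    return value[:-1] if value.endswith('.') else value

one_dot = s.apply(drop_one_dot)   # removes at most one trailing dot
all_dots = s.str.rstrip('.')      # removes every trailing dot

print(one_dot.tolist())
print(all_dots.tolist())
```

Which behavior is right depends on whether repeated trailing dots can occur in the data and should all be dropped.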

Solution 3:

def change_to_date(string):
    # Split into three pieces (first 2 chars, next 3, the rest) and join them with dashes
    seq = (string[:2], string[2:5], string[5:])
    return '-'.join(seq)

pt['DATE'] = pt['DATE'].apply(change_to_date)

I applied a simple function to the column to manipulate all of its string values, for a somewhat similar problem.
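
The same slicing can probably be done without apply by using the vectorized .str accessor. This is only a sketch: the sample values below assume DATE strings shaped like DDMMMYYYY, which the answer does not actually state.

```python
import pandas as pd

# Hypothetical data; the DDMMMYYYY layout is an assumption, not from the answer.
pt = pd.DataFrame({"DATE": ["01JAN2014", "15FEB2014"]})

# Slice each piece with .str[...] and concatenate with dashes, element-wise.
s = pt["DATE"]
pt["DATE"] = s.str[:2] + "-" + s.str[2:5] + "-" + s.str[5:]

print(pt)
```

If the values really are dates, converting the result with pd.to_datetime and an explicit format may be a better end point than keeping them as strings.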
