Dataframe String Manipulation
I have a dataframe that has one column with data that looks like this: AAH. AAH. AAR.UN AAR.UN AAR.UN AAR.UN AAV. AAV. AAV. I think I need to use the apply method to trim the colu
Solution 1:
Since there's only one column, you can take advantage of vectorized string operations via the .str accessor:
>>> df
        0
0    AAH.
1    AAH.
2  AAR.UN
3  AAR.UN
4  AAR.UN
5  AAR.UN
6    AAV.
7    AAV.
8    AAV.
>>> df[0] = df[0].str.rstrip('.')
>>> df
        0
0     AAH
1     AAH
2  AAR.UN
3  AAR.UN
4  AAR.UN
5  AAR.UN
6     AAV
7     AAV
8     AAV
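Otherwise (with more than one column) you'd have to do something like df.applymap(lambda x: x.rstrip(".")), or drop down to the numpy char methods. Note that applymap is deprecated as of pandas 2.1 in favor of DataFrame.map, which takes the same kind of element-wise function.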
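If the frame has several string columns, another option is to keep using .str and apply it one column at a time. This is only a sketch: the column names "sym" and "alt" and the sample values are made up for illustration, and it assumes every column holds strings.

```python
import pandas as pd

# Hypothetical two-column frame; names and values are illustrative only.
df = pd.DataFrame({"sym": ["AAH.", "AAR.UN", "AAV."],
                   "alt": ["X.", "Y.Z", "W"]})

# DataFrame.apply passes each column to the function as a Series,
# so the vectorized .str method can be reused per column.
stripped = df.apply(lambda col: col.str.rstrip("."))

print(stripped)
```

This avoids calling a Python function once per cell, which is what the element-wise approach does.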
Solution 2:
You can also use a lambda function to do this:
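>>> import pandas as pd
>>> L = [['AAH.'],
...      ['AAR.UN'],
...      ['AAR.UN'],
...      ['AAV.'],
...      ['AAV.']]
>>> df = pd.DataFrame(L)
>>> # Drop the last character only when it is a dot
>>> M = lambda x: x[0][:-1] if x[0][-1] == '.' else x[0][:]
>>> df = df.apply(M, axis=1)
>>> df
0       AAH
1    AAR.UN
2    AAR.UN
3       AAV
4       AAV
dtype: object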
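One difference worth noting: the lambda removes at most one trailing dot, while rstrip('.') removes all of them. A rough vectorized equivalent of the "one dot only" behaviour could look like the sketch below. The sample values, including the double-dot and empty entries, are invented to show the edge cases.

```python
import pandas as pd

# Illustrative values; "AAV.." and "" are edge cases added for this sketch.
s = pd.Series(["AAH.", "AAR.UN", "AAV..", ""])

# Remove exactly one trailing dot, using a regex anchored at the end.
one_dot = s.str.replace(r"\.$", "", regex=True)

# Remove every trailing dot.
all_dots = s.str.rstrip(".")

print(one_dot.tolist())
print(all_dots.tolist())
```

Both versions handle the empty string, whereas indexing with [-1] in the lambda raises an IndexError on an empty value.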
Solution 3:
def change_to_date(string):
    seq = (string[:2], string[2:5], string[5:])
    return '-'.join(seq)

pt['DATE'] = pt['DATE'].apply(change_to_date)
I applied a simple function to the column to manipulate all of its string values, for a somewhat similar problem.
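To make the slicing concrete, here is a small self-contained run of that function. The answer does not show what the DATE column contained, so the "01JAN2014"-style values below are purely an assumption. The second expression is a possible vectorized version built from .str slicing.

```python
import pandas as pd

def change_to_date(string):
    # Split into the first 2 chars, the next 3, and the rest, joined by '-'.
    seq = (string[:2], string[2:5], string[5:])
    return "-".join(seq)

# Hypothetical data; the real contents of pt['DATE'] are not given.
pt = pd.DataFrame({"DATE": ["01JAN2014", "15FEB2014"]})

applied = pt["DATE"].apply(change_to_date)

# Same result without a Python-level function call per row.
d = pt["DATE"]
vectorized = d.str[:2] + "-" + d.str[2:5] + "-" + d.str[5:]

print(applied.tolist())
```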