Vectorized Format Function For Pandas Series
Say I start with a Series of unformatted phone numbers (as strings), and I would like to format them as (XXX) YYY-ZZZZ. I can get the sub-components of my input using regular exp
Solution 1:
You can do this directly with Series.str.replace():
In [46]: import pandas
In [47]: s = pandas.Series(["1234567890", "5552348866", "13434"])
In [49]: s
Out[49]:
0    1234567890
1    5552348866
2         13434
dtype: object
In [50]: s.str.replace(r"(\d{3})(\d{3})(\d{4})", r"(\1) \2-\3", regex=True)
Out[50]:
0    (123) 456-7890
1    (555) 234-8866
2             13434
dtype: object
You could also imagine doing another transformation first to remove any non-digit characters.
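As a rough sketch of that extra cleanup step, the non-digit characters can be stripped with a first str.replace() call before the grouping pattern is applied. The sample values and the variable names raw, digits and formatted are made up for illustration, and regex=True is passed explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

# Hypothetical sample data: phone numbers with assorted punctuation.
raw = pd.Series(["123-456-7890", "(555) 234 8866", "13434"])

# Step 1: drop every character that is not a digit.
digits = raw.str.replace(r"\D", "", regex=True)

# Step 2: regroup the digits; values that do not match the pattern stay unchanged.
formatted = digits.str.replace(r"(\d{3})(\d{3})(\d{4})", r"(\1) \2-\3", regex=True)

print(formatted)
```

The pattern is not anchored, so a value with more than ten digits would be only partly reformatted; wrapping it in ^ and $ is one way to leave such values alone.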
Solution 2:
Why don't you try this:
import pandas as pd
ser = pd.Series(data=['1234567890', '2345678901', '3456789012'])
def f(val):
    # Slice the 10-character string into area code, prefix and line number.
    return '({0}) {1}-{2}'.format(val[:3], val[3:6], val[6:])
print(ser.apply(f))
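Note that f slices by position, so a value that is not exactly ten characters long (such as "13434" from Solution 1) would come out malformed. One possible guard is sketched below. The name format_phone and the ten-digit check are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

ser = pd.Series(["1234567890", "2345678901", "13434"])

def format_phone(val):
    # Hypothetical guard: only reformat strings made of exactly ten digits.
    if len(val) == 10 and val.isdigit():
        return "({0}) {1}-{2}".format(val[:3], val[3:6], val[6:])
    # Anything else is returned unchanged.
    return val

result = ser.apply(format_phone)
print(result)
```

Keep in mind that apply calls the Python function once per element, so it is not vectorized in the way the str.replace() approach is.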