
Vectorized Format Function For Pandas Series

Say I start with a Series of unformatted phone numbers (as strings), and I would like to format them as (XXX) YYY-ZZZZ. I can get the sub-components of my input using regular exp

Solution 1:

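You can do this directly with Series.str.replace(). Pass regex=True so the pattern is treated as a regular expression; since pandas 2.0 the default is a literal string match:

In [46]: import pandas

In [47]: s = pandas.Series(["1234567890", "5552348866", "13434"])

In [49]: s
Out[49]: 
0    1234567890
1    5552348866
2         13434
dtype: object

In [50]: s.str.replace(r"(\d{3})(\d{3})(\d{4})", r"(\1) \2-\3", regex=True)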
Out[50]: 
0    (123) 456-7890
1    (555) 234-8866
2             13434
dtype: object

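Note that the third value is left unchanged because it does not match the ten-digit pattern. If the input may contain punctuation or spaces, you could apply another transformation first to remove any non-digit characters.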
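As a rough sketch of that clean-up step: the sample values and the names raw, digits, and formatted below are made up for illustration and are not from the question. The idea is to strip everything except digits, then apply the same substitution, anchored so that only values of exactly ten digits are reformatted:

```python
import pandas as pd

# Hypothetical sample data with mixed punctuation (an assumption for illustration).
raw = pd.Series(["123-456-7890", "(555) 234 8866", "13434"])

# Step 1: drop every non-digit character.
digits = raw.str.replace(r"\D", "", regex=True)

# Step 2: reformat values that are exactly ten digits; anything else passes through.
formatted = digits.str.replace(
    r"^(\d{3})(\d{3})(\d{4})$", r"(\1) \2-\3", regex=True
)

print(formatted)
```

Anchoring with ^ and $ is a design choice here: without it, a longer digit string would have only its first ten digits rewritten, which is probably not what you want.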


Solution 2:

Why don't you try this:

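import pandas as pd
ser = pd.Series(data=['1234567890', '2345678901', '3456789012'])
def f(val):
    # Slice the ten-digit string into area code, prefix, and line number.
    return '({0}) {1}-{2}'.format(val[:3], val[3:6], val[6:])
# apply() calls f once per element, so this is a Python-level loop.
print(ser.apply(f))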
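
If you would rather avoid calling a Python function per element, one possible variant is to slice through the .str accessor and concatenate the pieces. This is only a sketch, and it assumes every value is already a clean ten-digit string; the name formatted is illustrative:

```python
import pandas as pd

ser = pd.Series(["1234567890", "2345678901", "3456789012"])

# Slice each string element-wise via the .str accessor, then join the parts.
formatted = "(" + ser.str[:3] + ") " + ser.str[3:6] + "-" + ser.str[6:]

print(formatted)
```

Unlike the regex approach in Solution 1, this does not check the length of each value, so shorter strings would produce malformed output rather than being left alone.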
