Identify Letter/number Combinations Using Regex And Storing In Dictionary
import pandas as pd
df = pd.DataFrame({'Date':['This 1-A16-19 person is BL-17-1111 and other', 'dont Z-1-12 do here but NOT 12-24-1981',
Solution 1:
This regex might do the trick.
(?=.*[a-zA-Z])(\S+-\S+-\S+)
It matches any run of non-space characters that contains at least two hyphens. The lookahead also requires a letter somewhere between the start of the match and the end of the string, so a trailing all-digit token such as 12-24-1981 is not matched.
For the input you provided, only 1-A16-19, BL-17-1111, Z-1-12 and 1A-256-29Q88 are returned.
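To check this behaviour outside pandas, here is a minimal sketch using Python's re module on the two sample strings that are visible in the question (the rest of the original data is cut off, so it is left out). The variable names and the token-scoped variant `(?=\S*[a-zA-Z])` are assumptions of mine, not part of the answer. The variant limits the letter check to the candidate token instead of the rest of the line.

```python
import re

# The two sample strings visible in the question.
rows = [
    'This 1-A16-19 person is BL-17-1111 and other',
    'dont Z-1-12 do here but NOT 12-24-1981',
]

# Pattern from the answer: the lookahead scans the rest of the string.
line_scoped = re.compile(r'(?=.*[a-zA-Z])(\S+-\S+-\S+)')
# Assumed variant: the lookahead only scans the current run of non-space characters.
token_scoped = re.compile(r'(?=\S*[a-zA-Z])(\S+-\S+-\S+)')

matches = [m for row in rows for m in line_scoped.findall(row)]
print(matches)  # ['1-A16-19', 'BL-17-1111', 'Z-1-12']

# A word after the all-digit token makes the difference visible.
probe = 'NOT 12-24-1981 here'
print(line_scoped.findall(probe))   # ['12-24-1981']
print(token_scoped.findall(probe))  # []
```

On the sample data both patterns give the same result. They only differ when an all-digit token is followed by more text on the same line.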
Solution 2:
You could try:
# Extract every string that matches the pattern and collect the matches in a list.
vals = df['Date'].str.extractall(r'(\S+-\S+-\S+)')[0].tolist()
# Make a list with the index range of your matches.
nums = []
for x, y in enumerate(vals):
    nums.append(x)
Pass both lists into a dictionary:
my_dict = dict(zip(nums,vals))
print(my_dict)
{0: '1-A16-19',
1: 'BL-17-1111',
2: 'Z-1-12',
3: '12-24-1981',
4: '1A-256-29Q88'}
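Note that this output still contains 12-24-1981, because the pattern in this solution has no letter check. One possible way to drop such entries, assuming the goal is to keep only tokens with at least one letter as in Solution 1, is to filter the list before building the dictionary. The vals list is hard-coded from the output above so the sketch runs on its own, and with_letters is a name I made up for illustration.

```python
import re

# Matches as printed in the output above, hard-coded for a standalone example.
vals = ['1-A16-19', 'BL-17-1111', 'Z-1-12', '12-24-1981', '1A-256-29Q88']

# Keep only tokens that contain at least one ASCII letter.
with_letters = [v for v in vals if re.search(r'[a-zA-Z]', v)]

# Pair each remaining token with a zero-based index.
my_dict = dict(zip(range(len(with_letters)), with_letters))
print(my_dict)
# {0: '1-A16-19', 1: 'BL-17-1111', 2: 'Z-1-12', 3: '1A-256-29Q88'}
```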
If you want the index to start at one, you can pass a start value to the enumerate function.
# Reset the list so the earlier zero-based indices are not kept.
nums = []
for x, y in enumerate(vals, 1):
    nums.append(x)
print(nums)
[1, 2, 3, 4, 5]
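The separate index list is not strictly needed. As a shorter sketch of the same idea, dict accepts the (index, value) pairs that enumerate produces, so the dictionary can be built in one step. The vals list is hard-coded from the output shown earlier, and my_dict_from_one is a name chosen here for illustration.

```python
# Matches as printed in the earlier output, hard-coded for a standalone example.
vals = ['1-A16-19', 'BL-17-1111', 'Z-1-12', '12-24-1981', '1A-256-29Q88']

# enumerate yields (index, value) pairs, which dict consumes directly.
my_dict_from_one = dict(enumerate(vals, 1))
print(my_dict_from_one)
# {1: '1-A16-19', 2: 'BL-17-1111', 3: 'Z-1-12', 4: '12-24-1981', 5: '1A-256-29Q88'}
```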