
Identify Letter/number Combinations Using Regex And Storing In Dictionary

import pandas as pd
df = pd.DataFrame({'Date':['This 1-A16-19 person is BL-17-1111 and other', 'dont Z-1-12 do here but NOT 12-24-1981',

Solution 1:

This regex might do the trick.

(?=.*[a-zA-Z])(\S+-\S+-\S+)

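It matches any run of non-whitespace characters that contains at least two hyphens. The lookahead (?=.*[a-zA-Z]) requires a letter somewhere between the start of the match and the end of the line, so a digits-only token such as 12-24-1981 is skipped only when no letter follows it on that line, as is the case in your data.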

regex101 example

For the input you provided, only 1-A16-19, BL-17-1111, Z-1-12 and 1A-256-29Q88 are returned.
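The lookahead in the pattern above scans the rest of the line, not just the candidate token. Below is a minimal sketch of a token-scoped variant that swaps `.*` for `\S*` so the letter has to sit inside the token itself. This variant is an assumption on my part, not part of the original answer, and the third sample string is made up because the question's data is cut off.

```python
import re

# Pattern from the answer: the lookahead may find its letter anywhere later on the line.
line_scoped = re.compile(r'(?=.*[a-zA-Z])(\S+-\S+-\S+)')

# Assumed variant: \S* keeps the lookahead inside the current token.
token_scoped = re.compile(r'(?=\S*[a-zA-Z])(\S+-\S+-\S+)')

texts = [
    'This 1-A16-19 person is BL-17-1111 and other',
    'dont Z-1-12 do here but NOT 12-24-1981',
    'hypothetical 1A-256-29Q88 line',  # made-up row, not from the question
]

# Collect every token-scoped match across all strings, in order.
matches = [m for text in texts for m in token_scoped.findall(text)]
print(matches)

# A date followed by more text shows the difference between the two patterns.
tricky = 'NOT 12-24-1981 here'
print(line_scoped.findall(tricky))   # the date is matched because 'here' follows it
print(token_scoped.findall(tricky))  # the date is skipped
```

On the two strings visible in the question both patterns give the same result, since the date sits at the very end of its string.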

Solution 2:

You could try:

# Extract the strings matching the condition above and collect them in a list.
vals = df['Date'].str.extractall(r'(\S+-\S+-\S+)')[0].tolist()
# Make a list with the index range of the matches.
nums = []
for x, y in enumerate(vals):
    nums.append(x)

Pass both lists into a dictionary.

my_dict = dict(zip(nums,vals))
print(my_dict)
 {0: '1-A16-19',
 1: 'BL-17-1111',
 2: 'Z-1-12',
 3: '12-24-1981',
 4: '1A-256-29Q88'}
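The pattern in this solution has no letter check, which is why 12-24-1981 appears in the dictionary. If that entry is unwanted, one possible approach is to add a lookahead that requires a letter inside the token. The sketch below assumes such a pattern and a made-up third row (the question's DataFrame is truncated); `filtered` is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame({'Date': [
    'This 1-A16-19 person is BL-17-1111 and other',
    'dont Z-1-12 do here but NOT 12-24-1981',
    'hypothetical 1A-256-29Q88 line',  # made-up row, not from the question
]})

# Assumed pattern: the lookahead demands a letter within the token,
# and the single capture group is what extractall returns in column 0.
pattern = r'(?=\S*[a-zA-Z])(\S+-\S+-\S+)'
filtered = df['Date'].str.extractall(pattern)[0].tolist()

# dict(enumerate(...)) pairs each match with its position, starting at 0.
my_dict = dict(enumerate(filtered))
print(my_dict)
```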

If you want the index to start at one, you can specify this in the enumerate function.

nums = []  # start from an empty list again
for x, y in enumerate(vals, 1):
    nums.append(x)
print(nums)
[1, 2, 3, 4, 5]
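Since enumerate already yields (index, value) pairs, the separate key list can be skipped. A short sketch, using the values from the printed dictionary above as a plain list:

```python
# Values copied from the dictionary printed earlier in this answer.
vals = ['1-A16-19', 'BL-17-1111', 'Z-1-12', '12-24-1981', '1A-256-29Q88']

# dict() accepts the (index, value) pairs from enumerate directly;
# start=1 makes the keys begin at one.
one_based = dict(enumerate(vals, start=1))
print(one_based)

# Cross-check against the two-list approach: build the keys, then zip.
nums = [i for i, _ in enumerate(vals, start=1)]
same_as_zip = one_based == dict(zip(nums, vals))
print(same_as_zip)
```

Both routes give the same dictionary; the one-liner just avoids keeping `nums` around.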
