Get All Values Before A Decimal Number / An Integer From A List Of Strings In Python
I have a list of strings and I want to split each string on a floating point number. If there is no floating point number in the string, I want to split it on a number. It should o
Solution 1:
Your regex is almost correct, but you have to take into account that the . and the digits after it might not be there. This can be achieved like this:
\s+(\d+(?:\.\d+)?)\s+
The difference is that \.\d+ goes inside a non-capturing group, (?:xxxx), which is made optional by putting a question mark after the group: (?:xxxx)?
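To see the optional group in action, here is a minimal sketch that applies the pattern above with re.split. The helper name split_on_number is made up for illustration, and the sample strings are taken from the demo data further down.

```python
import re

# Pattern from this answer: digits with an optional fractional part,
# surrounded by whitespace. The capturing group keeps the number in the output.
NUMBER = re.compile(r'\s+(\d+(?:\.\d+)?)\s+')

def split_on_number(text):
    """Hypothetical helper: split text on the first whitespace-delimited number."""
    # maxsplit=1 stops after the first match, so later numbers stay in the tail
    return NUMBER.split(text, maxsplit=1)

print(split_on_number('Naproxen 500 Active ingredient Ph Eur'))
print(split_on_number('Silica, colloidal anhydrous 2.62 Glidant Ph Eur'))
print(split_on_number('Water, purifieda'))  # no number: one-element list
```

Note that this splits on whichever number comes first, integer or decimal, so it does not by itself prefer a decimal that appears after an integer.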
Solution 2:
I suggest using
res = re.match(r'^(?:(?!.*\d\.\d)(.*?)\s*\b(\d+(?:\s*mg)?)\b\s*(.*)|((?:(?!\d+\.\d).)*?)\s*\b(\d+\.\d+(?:\s*mg)?)\b\s*(.*))$', i)
if res:
    all_extract.append(list(filter(None, res.groups())))
See the regex demo.
Full Python demo without commented code:
import re

def show():
    newresult = ['Naproxen 500 Active ingredient Ph Eur','Croscarmellose sodium 22.0 mg Disintegrant Ph Eur','Povidone K90 11.0 Binder 56 Ph Eur','Water, purifieda','Silica, colloidal anhydrous 2.62 Glidant Ph Eur','Water purified 49 Solvent Ph Eur','Magnesium stearate 1.38 Lubricant Ph Eur']
    all_extract = []
    for i in newresult:
        # First alternative: integer, used only when the line has no decimal number.
        # Second alternative: the first decimal number in the line.
        res = re.match(r'^(?:(?!.*\d\.\d)(.*?)\s*\b(\d+(?:\s*mg)?)\b\s*(.*)|((?:(?!\d+\.\d).)*?)\s*\b(\d+\.\d+(?:\s*mg)?)\b\s*(.*))$', i)
        if res:
            # Drop the groups of the alternative that did not take part in the match
            all_extract.append(list(filter(None, res.groups())))
        else:
            # Fallback: split on the first whitespace-delimited number;
            # a line without one comes back as a single-element list
            print("ONLY INTEGER")
            regex_integer_part = re.split(r'\s+(\d+(?:\.\d+)?)\s+', i, maxsplit=1)
            all_extract.append(regex_integer_part)
    return all_extract

print(show())
yields
[['Naproxen', '500', 'Active ingredient Ph Eur'], ['Croscarmellose sodium', '22.0 mg', 'Disintegrant Ph Eur'], ['Povidone K90', '11.0', 'Binder 56 Ph Eur'], ['Water, purifieda'], ['Silica, colloidal anhydrous', '2.62', 'Glidant Ph Eur'], ['Water purified', '49', 'Solvent Ph Eur'], ['Magnesium stearate', '1.38', 'Lubricant Ph Eur']]
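The long pattern in Solution 2 can be easier to follow when it is laid out with re.VERBOSE. The sketch below is the same pattern with comments added; the names SPLIT_RE and split_line are made up for illustration, and returning [line] when nothing matches is an assumption that stands in for the fallback branch of the demo.

```python
import re

# Same pattern as in Solution 2, re-laid-out in verbose mode (hypothetical name).
SPLIT_RE = re.compile(r'''
    ^(?:
        (?!.*\d\.\d)                    # branch 1: only when the line has no decimal number
        (.*?)                           # text before the number
        \s*\b(\d+(?:\s*mg)?)\b\s*       # first integer, optionally followed by mg
        (.*)                            # everything after it
      |
        ((?:(?!\d+\.\d).)*?)            # branch 2: text that does not cross a decimal number
        \s*\b(\d+\.\d+(?:\s*mg)?)\b\s*  # first decimal number, optionally followed by mg
        (.*)                            # everything after it
    )$
''', re.VERBOSE)

def split_line(line):
    """Hypothetical helper: return [before, number, after], or [line] if no number matches."""
    m = SPLIT_RE.match(line)
    if m is None:
        return [line]
    # Groups of the branch that did not match are None; empty strings are dropped too
    return [g for g in m.groups() if g]

print(split_line('Povidone K90 11.0 Binder 56 Ph Eur'))
print(split_line('Water purified 49 Solvent Ph Eur'))
```

The negative lookahead at the start of the first branch is what gives decimals priority: the integer branch is only tried on lines that contain no digit-dot-digit sequence anywhere.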