Python String Split Using Regex
I need to parse lines like these:
foo, bar > 1.0, baz = 2.0
foo bar > 1.0 baz = 2.0
foo, bar, baz
foo bar baz
Each element can be $string (>|<|<=|>=|
Solution 1:
You can split on every run of non-alphabetic characters:
re.split("[^a-zA-Z]+", input)
This assumes that your $string contains only letters.
You can remove the empty results with filter (in Python 3, wrap it in list() to get a list back):
filter(None, str_list)
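Putting the two steps together, a minimal sketch might look like the following. The function name split_names is made up for illustration, and it keeps the same assumption that names contain only letters.

```python
import re

def split_names(line):
    """Split on runs of non-letters, then drop the empty strings."""
    parts = re.split(r"[^a-zA-Z]+", line)
    # re.split leaves '' at the ends when the line starts or ends
    # with non-letters (e.g. the trailing " = 2.0"), so filter them out.
    return list(filter(None, parts))

print(split_names("foo, bar > 1.0, baz = 2.0"))  # ['foo', 'bar', 'baz']
print(split_names("foo bar baz"))                # ['foo', 'bar', 'baz']
```

Note that the operators and numbers are thrown away here, so this only helps if you need the names alone.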
Solution 2:
You can just extract all the letter groups:
import re

s = """
foo, bar > 1.0, baz = 2.0
foo bar > 1.0 baz = 2.0
foo, bar, baz
foo bar baz
"""

regex = re.compile(r'([a-z]+)', re.I)  # re.I (ignore case flag)
for line in s.splitlines():
    if not line:
        continue  # skip empty lines
    print(regex.findall(line))
>>>
['foo', 'bar', 'baz']
['foo', 'bar', 'baz']
['foo', 'bar', 'baz']
['foo', 'bar', 'baz']
Solution 3:
This one checks for the syntax also:
import re
with open("input") as f:
    for line in f:
        line = line.strip()
        # chop a line into expressions of the form: str [OP NUMBER]
        exprs = re.split(r'(\w+\s*(?:[!<>=]=?\s*[\d.]*)?\s*,?\s*)', line)
        for expr in exprs:
            # chop each expression into tokens and get the str part
            tokens = re.findall(r'(\w+)\s*(?:[!<>=]=?\s*[\d.]*)?,?', expr)
            if tokens:
                print(tokens)
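If you also want to keep the operator and the number rather than only the name, a variation on the pattern in Solution 3 could capture all three parts. This is only a sketch: the names EXPR and parse_line are hypothetical, the operator set is the one used in the regex of Solution 3, and unlike that regex it requires a number after the operator.

```python
import re

# Hypothetical pattern: a name, optionally followed by an operator and a
# number, optionally followed by a comma.
EXPR = re.compile(r"(\w+)\s*(?:([!<>=]=?)\s*([\d.]+))?\s*,?")

def parse_line(line):
    """Return (name, operator, value) tuples for one line.

    operator and value are None when the element is a bare name.
    """
    return [(m.group(1), m.group(2), m.group(3))
            for m in EXPR.finditer(line.strip())]

print(parse_line("foo, bar > 1.0, baz = 2.0"))
# [('foo', None, None), ('bar', '>', '1.0'), ('baz', '=', '2.0')]
```

The value is left as a string; convert it with float() if you need a number.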