
Python String Split Using Regex

I need to parse lines like these:

foo, bar > 1.0, baz = 2.0
foo bar > 1.0 baz = 2.0
foo, bar, baz
foo bar baz

for each element it can be $string (>|<|<=|>=|

Solution 1:

You can split on every run of non-alphabetic characters:

re.split("[^a-zA-Z]+",input)

Note that this assumes your $string values contain only letters.


You can remove the empty strings from the result with filter (in Python 3, wrap it in list() to get a list back):

list(filter(None, str_list))
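Putting the two steps above together, a minimal sketch might look like the following. The helper name `split_names` is hypothetical and not part of the original answer, and the code still assumes the names contain only letters.

```python
import re

def split_names(line):
    # Hypothetical helper: split on runs of non-letters.
    parts = re.split(r"[^a-zA-Z]+", line)
    # A line ending in a number leaves a trailing '' in parts,
    # so drop the empty strings.
    return [p for p in parts if p]

print(split_names("foo, bar > 1.0, baz = 2.0"))  # ['foo', 'bar', 'baz']
```

The list comprehension does the same job as `list(filter(None, parts))`. A name such as `foo1` or `foo_bar` would be cut at the digit or underscore, which is the limitation mentioned above.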

Solution 2:

You can just extract all the letter groups:

s = """
foo, bar > 1.0, baz = 2.0
foo  bar > 1.0  baz = 2.0
foo, bar, baz
foo  bar  baz
"""

import re

regex = re.compile(r'([a-z]+)', re.I)  # re.I (ignore case flag)
for line in s.splitlines():
    if not line:
        continue  # skip empty lines
    print(regex.findall(line))

>>> 
['foo', 'bar', 'baz']
['foo', 'bar', 'baz']
['foo', 'bar', 'baz']
['foo', 'bar', 'baz']

Solution 3:

This one also checks the syntax:

import re

with open("input") as f:
    for line in f:
        line = line.strip()
        # chop a line into expressions of the form: str [OP NUMBER]
        exprs = re.split(r'(\w+\s*(?:[!<>=]=?\s*[\d.]*)?\s*,?\s*)', line)
        for expr in exprs:
            # chop each expression into tokens and get the str part
            tokens = re.findall(r'(\w+)\s*(?:[!<>=]=?\s*[\d.]*)?,?', expr)
            if tokens:
                print(tokens)
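If you also want the operator and the number, not just the name, one possible variation is sketched below. This is not the method above: the function name `parse_line` is hypothetical, the operator pattern `[!<>=]=?` and the number pattern `[\d.]+` are borrowed from the expressions in this solution, and it works on a string rather than on a file.

```python
import re

# name, then an optional "OP NUMBER" part (assumed shape of one element)
ELEMENT = re.compile(r'(\w+)\s*(?:([!<>=]=?)\s*([\d.]+))?')

def parse_line(line):
    # Hypothetical helper: returns (name, op, number) tuples.
    # op and number are '' when the element is a bare name.
    return ELEMENT.findall(line)

print(parse_line("foo, bar > 1.0, baz = 2.0"))
# [('foo', '', ''), ('bar', '>', '1.0'), ('baz', '=', '2.0')]
```

Capturing the three parts in one pass keeps each name together with its operator and value, at the cost of a pattern that is only as strict as the borrowed sub-patterns. For example, `[\d.]+` would also accept `1..2`.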
