Removing From A String All The Characters Included Between Two Specific Characters In Python
What's a fast way in Python to take all the characters included between two specific characters out of a string?
Solution 1:
You can use this regular expression: \(.*?\). Demo here: https://regexr.com/3jgmd
Then you can remove the part with this code:
import re
test_string = 'This is a string (here is a text to remove), and here is a text not to remove'
new_string = re.sub(r" \(.*?\)", "", test_string)
This regular expression (regex) matches any text in parentheses that is preceded by a space, as long as the text contains no line break; re.sub() then replaces each match with an empty string.
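Since the question asks about two arbitrary characters rather than parentheses specifically, the same idea can be generalized. The following is a minimal sketch, not part of the original answer: the helper name `remove_between` is hypothetical, and it assumes that spans may cross line breaks (hence `re.DOTALL`) and that surrounding whitespace should be left alone.

```python
import re

def remove_between(text, start, end):
    """Remove every span from `start` to the nearest following `end`, delimiters included."""
    # re.escape() makes delimiters such as '(' or '[' literal instead of regex syntax.
    # re.DOTALL lets '.' match newlines too, so a span may cross line breaks.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    return re.sub(pattern, "", text, flags=re.DOTALL)

print(remove_between("keep [drop this] keep", "[", "]"))
# keep  keep
print(remove_between("a <b\nc> d", "<", ">"))
# a  d
```

Note that the whitespace on both sides of the removed span is kept, so two spaces remain in each result.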
Solution 2:
You will most probably use a regular expression like
\s*\([^()]*\)\s*
for that (see a demo on regex101.com).
The expression removes everything in parentheses along with any surrounding whitespace.
In Python this could be:
import re
test_string = 'This is a string (here is a text to remove), and here is a text not to remove'
new_string = re.sub(r'\s*\([^()]*\)\s*', '', test_string)
print(new_string)
# This is a string, and here is a text not to remove
However, for learning purposes, you could as well go with the builtin methods:
test_string = 'This is a string (here is a text to remove), and here is a text not to remove'
left = test_string.find('(')
right = test_string.find(')', left)
# str.find() returns -1 when nothing is found, so compare against -1 explicitly
if left != -1 and right != -1:
    new_string = test_string[:left] + test_string[right+1:]
    print(new_string)
# This is a string , and here is a text not to remove
Problem with the latter: it does not account for multiple occurrences and does not remove whitespace, but it was faster when measured.
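The multiple-occurrence limitation can be lifted with a loop over str.find(). This is only a sketch under a few assumptions: the function name `remove_spans` is hypothetical, whitespace directly before each opening delimiter is dropped, and an unmatched opening delimiter leaves the rest of the string untouched. It is not the variant that was timed.

```python
def remove_spans(text, start="(", end=")"):
    """Remove every start...end span using only str.find() and slicing."""
    pieces = []
    pos = 0
    while True:
        left = text.find(start, pos)
        if left == -1:
            break  # no further opening delimiter
        right = text.find(end, left + len(start))
        if right == -1:
            break  # unmatched opener: keep the remainder as-is
        # Keep the text before the span, minus trailing whitespace.
        pieces.append(text[pos:left].rstrip())
        pos = right + len(end)
    pieces.append(text[pos:])
    return "".join(pieces)

s = 'This is a string (here is a text to remove), and here is a (second one) text not to remove'
print(remove_spans(s))
# This is a string, and here is a text not to remove
```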
Executing the regex solution and the non-regex solution above 100k times each, the measurements yield:
0.578398942947 # regex solution
0.121736049652 # non-regex solution
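The answer does not say how these numbers were obtained. One way to take a comparable measurement is the standard-library timeit module, sketched below; the wrapper functions `with_regex` and `with_find` are hypothetical names, and absolute timings will differ by machine and Python version.

```python
import re
import timeit

test_string = 'This is a string (here is a text to remove), and here is a text not to remove'

def with_regex():
    return re.sub(r'\s*\([^()]*\)\s*', '', test_string)

def with_find():
    left = test_string.find('(')
    right = test_string.find(')', left)
    if left != -1 and right != -1:
        return test_string[:left] + test_string[right + 1:]
    return test_string

# number=100_000 mirrors the "100k times" above; timeit returns total seconds.
regex_seconds = timeit.timeit(with_regex, number=100_000)
find_seconds = timeit.timeit(with_find, number=100_000)
print(f"regex: {regex_seconds:.3f} s, find: {find_seconds:.3f} s")
```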
Solution 3:
To remove all text between ( and ), you can use the findall() method from re and then remove each match using replace():
import re
test_string = 'This is a string (here is a text to remove), and here is a (second one) text not to remove'
remove = re.findall(r" \(.*?\)", test_string)
for r in remove:
    test_string = test_string.replace(r, '')
print(test_string)
# result: This is a string, and here is a text not to remove
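For comparison, a single re.sub() call with the same pattern removes all matches in one pass, without the findall()/replace() loop. The sketch below also shows a caveat of the non-greedy `.*?`: with nested parentheses the match stops at the first closing parenthesis. The nested example string is made up for illustration.

```python
import re

test_string = 'This is a string (here is a text to remove), and here is a (second one) text not to remove'

# Compile once if the pattern is reused; the leading space is part of each match.
pattern = re.compile(r" \(.*?\)")
print(pattern.sub("", test_string))
# This is a string, and here is a text not to remove

# Caveat: nested parentheses are not balanced by this pattern.
nested = pattern.sub("", "a (b (c) d) e")
print(nested)
# a d) e
```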