How Do I Count All Occurrences Of A Phrase In A Text File Using Regular Expressions?
Solution 1:
You can get rid of the regex entirely: the count method of string objects is enough, and much of the other code can be simplified as well.
You're also not changing data to lower case, only printing a lowercased copy of it. Note how I use data = data.lower() to actually change the variable.
Try this code:
import glob
import os

# Raw string so the backslashes in the Windows path are taken literally
path = r'c:\script\lab\Tests'
k = 0
substring = ' at least '
for filename in glob.glob(os.path.join(path, '*.txt')):
    if filename.endswith('.txt'):
        # The with block closes the file automatically
        with open(filename) as f:
            data = f.read()
        data = data.lower()
        S = data.count(substring)
        if S:
            k = k + S  # add this file's matches to the running total
            print("'{}' match".format(filename), S)
        else:
            print("'{}' no match".format(filename))
print("Total number of matches", k)
If anything is unclear feel free to ask!
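To illustrate the point about data.lower() above, here is a minimal sketch. The sample sentence and the variable names are made up for illustration, and the phrase is written without the surrounding spaces used in the answer so that the match at the end of the sentence is counted too.

```python
text = "At least once. AT LEAST twice, at least."

# str.lower() returns a new string; calling it without assigning
# the result leaves the original untouched.
text.lower()
unchanged = text

# Binding the result to a name is what actually keeps the lowercased copy.
lowered = text.lower()

phrase = "at least"
count_original = text.count(phrase)    # case-sensitive count on the original
count_lowered = lowered.count(phrase)  # count after lowercasing
print(count_original, count_lowered)
```

Note that str.count counts non-overlapping occurrences, which is what you want for a phrase like this one.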
Solution 2:
You make multiple mistakes in your code. data.split() and data.lower() have no effect at all, since neither of them modifies data; each returns a modified copy. However, you don't assign the return value to anything, so it is lost.
Also, you should always close a resource (e.g. a file) when you don't need it anymore.
Also, you append every string you find using re.search to a list S, which you don't use for anything afterwards. It would also be pointless, because it would just contain the string you are looking for, repeated once per match. You can just take the list that is returned by re.findall and compute its length. This gives you the number of times the phrase occurs in the text. Then you just increase your counter variable k by that amount and move on to the next file. You can still have your print statements by simply printing the temporary num_found variable.
import re
import glob
import os

path = 'D:/Test'
k = 0
for filename in glob.glob(os.path.join(path, '*.txt')):
    if filename.endswith('.txt'):
        f = open(filename)
        text = f.read()
        f.close()
        # findall returns one list entry per match, so its length is the count
        num_found = len(re.findall(r' at least ', text, re.MULTILINE))
        k += num_found
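If you want the regex version to ignore case and to close files automatically, the same idea could be packaged roughly as follows. This is only a sketch: count_phrase and count_phrase_in_dir are hypothetical helper names, and using re.IGNORECASE together with re.escape is an assumption about what you want, not part of the code above.

```python
import glob
import os
import re


def count_phrase(text, phrase):
    """Count non-overlapping, case-insensitive matches of phrase in text."""
    # re.escape makes sure the phrase is matched literally,
    # even if it contains characters such as '.' or '?'.
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return len(pattern.findall(text))


def count_phrase_in_dir(directory, phrase):
    """Return a dict mapping each .txt file in directory to its match count."""
    counts = {}
    for filename in glob.glob(os.path.join(directory, "*.txt")):
        # The with block closes the file even if reading fails.
        with open(filename) as f:
            counts[filename] = count_phrase(f.read(), phrase)
    return counts
```

Calling sum(count_phrase_in_dir('D:/Test', 'at least').values()) would then give the total over all files, which plays the role of k in the loop above.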