
Python Count The Number Of Substring In List From Other String List Without Duplicates

I have two lists: main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42'] master_list = ['Smith', 'Roger'] I want to count the number of times I find a string from master_list

Solution 1:

A one liner

>>> sum(any(m in L for m in master_list) for L in main_list)
4

Iterate over main_list and check whether any of the values from master_list occurs in each string. This yields a sequence of bool values (a generator rather than a list). any() stops at the first match, so each string adds at most one to the count. Conveniently, sum treats each True as 1, which gives you the count.
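The one-liner can be wrapped in a small function for reuse. This is a minimal sketch; the name count_matching is a hypothetical helper and not part of the original answer.

```python
def count_matching(main_list, master_list):
    """Count strings in main_list that contain at least one string from master_list."""
    # any() short-circuits on the first hit, so each string contributes at most 1.
    return sum(any(m in s for m in master_list) for s in main_list)


main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']
print(count_matching(main_list, master_list))  # 4
```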

Solution 2:

You can use pandas (which provides fast vectorized operations) with str.contains and sum(). Note that str.contains treats the joined pattern as a regular expression by default.

import pandas as pd
main_list = pd.Series(['Smith', 'Smith', 'Roger', 'Roger-Smith', '42'])
master_list = ['Smith', 'Roger']
count = main_list.str.contains('|'.join(master_list)).sum()
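Because '|'.join(master_list) is used as a regular expression, names containing characters such as '.' or '+' would not be matched literally. The sketch below shows the same alternation idea with each name escaped. It uses only the standard library's re module instead of pandas, so it is an illustration of the escaping step and not the answer's pandas code.

```python
import re

main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

# Escape each name so regex metacharacters are matched literally.
pattern = re.compile('|'.join(re.escape(name) for name in master_list))

# search() returns a match object (truthy) or None, so each string counts at most once.
count = sum(1 for s in main_list if pattern.search(s))
print(count)  # 4
```

If master_list can be empty, the joined pattern is an empty string and matches every input, so that case likely needs a guard.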

Solution 3:

You can do it the other way around: create a list that contains only the elements of main_list that have a substring from master_list.

temp_list = [string for string in main_list if any(substring in string for substring in master_list)]

Now temp_list looks like this:

['Smith', 'Smith', 'Roger', 'Roger-Smith']

So the length of temp_list is your answer.

Solution 4:

What about this:

main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

print(len([word for word in main_list if any(mw in word for mw in master_list)]))

Solution 5:

This would do it:

main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

i = 0
for elem in main_list:
    if elem in master_list:
        i += 1
        continue
    for master_elem in master_list:
        if master_elem in elem:
            i += 1
            break
print(i) # i = 4

The code above counts 'Roger-Smith' as 1. If you want it to count once for each matching name, remove the break.
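For the variant without the break, a compact sketch is to count every (string, name) pair where the name occurs in the string. This is a simplified take and not the loop above verbatim: it has no special case for exact matches, which makes no difference for the sample data.

```python
main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']

# Each pair with a substring hit adds 1, so 'Roger-Smith' contributes 2.
total = sum(m in s for s in main_list for m in master_list)
print(total)  # 5
```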
