
How To Query Documents In Mongodb (pymongo) Where All Keywords Exist In A Field?

I have a list of keywords:

keywords = ['word1', 'word2', 'word3']

For now I query for only one keyword like this:

collection.find({'documenttextfield': {'$regex': ' '+keyword+' '}})

Solution 1:

Consider using a text index with a $text search. It may be a far better solution than regular expressions. However, $text matches documents that contain any of the search terms and ranks them by a relevance score, so you may get some results that don't contain all the keywords you are looking for.
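As a rough sketch of the text-index route in pymongo: the helper names below are hypothetical, `collection` is assumed to be a pymongo Collection, and the index only needs to be created once. Because $text ORs the search terms, the results are ordered by relevance score rather than restricted to documents that contain every keyword.

```python
def text_search_filter(keywords):
    """Build a $text filter; MongoDB ORs the space-separated terms."""
    return {'$text': {'$search': ' '.join(keywords)}}


def find_by_text(collection, keywords):
    """Hypothetical helper: `collection` is assumed to be a pymongo Collection."""
    # Create the text index on the field (only needed once per collection).
    collection.create_index([('documenttextfield', 'text')])
    cursor = collection.find(
        text_search_filter(keywords),
        {'score': {'$meta': 'textScore'}},  # expose the relevance score
    )
    # Best matches first; lower-scored hits may lack some of the keywords.
    return cursor.sort([('score', {'$meta': 'textScore'})])


print(text_search_filter(['word1', 'word2', 'word3']))
```

If every keyword must be present, the results of such a search would still need a second check, for example with the $regex approach described in this answer.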

If you can't or don't want to add a text index to this field, using a single regular expression would be quite a pain, because you don't know the order in which these words appear. I don't claim it is impossible to write, but you will end up with a horrible abomination even by regex standards. It would be far easier to use the $regex operator multiple times, combined with the $and operator.

Also, using a space as a delimiter is going to fail when the word is at the beginning or end of the string, or is followed by a period or comma. Use the word-boundary token (\b) instead.

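import re

keywords = ['word1', 'word2', 'word3']

# One $regex clause per keyword; $and requires every clause to match.
# Raw strings keep \b as a regex word boundary (a plain '\b' is a backspace in Python).
collection.find(
    {'$and': [
        {'documenttextfield': {'$regex': r'\b' + re.escape(keyword) + r'\b'}}
        for keyword in keywords
    ]}
)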
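The difference between the space delimiter and the word-boundary token can be checked locally before touching the database. This is a minimal sketch using Python's re module, on the assumption that \b behaves the same way for plain ASCII text as it does in MongoDB's regex engine. The function names and the sample text are made up for illustration.

```python
import re


def space_pattern(keyword):
    """The original approach: keyword surrounded by literal spaces."""
    return ' ' + re.escape(keyword) + ' '


def boundary_pattern(keyword):
    """The suggested approach: keyword surrounded by word boundaries."""
    return r'\b' + re.escape(keyword) + r'\b'


# Sample text: keywords at the start, before a period, and at the end.
text = 'word1 comes first, then word2. Last is word3'

for kw in ['word1', 'word2', 'word3']:
    with_spaces = bool(re.search(space_pattern(kw), text))
    with_boundaries = bool(re.search(boundary_pattern(kw), text))
    print(kw, with_spaces, with_boundaries)
```

In this sample the space-delimited pattern misses all three keywords, while the word-boundary pattern finds each of them without also matching longer words that merely start with the keyword.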

Keep in mind that this is a really slow query, because it will run these three regular expressions on every single document of the collection. If this is a performance-critical query, seriously consider whether a text index really won't do. Failing that, the last resort would be to extract every keyword someone could search for from the documenttextfield field (which might be every unique word in it) into a new array field documenttextfield_keywords, create a normal index on that field, and query that field with the $all operator (no regular expression required in that case).
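The keyword-array approach could look roughly like the sketch below. The helper names are hypothetical, `collection` is assumed to be a pymongo Collection, and lowercasing plus splitting on \w+ is just one possible way to define a "word", so adjust it to your data.

```python
import re


def extract_keywords(text):
    """Unique lowercase words of `text`, sorted for stable output."""
    return sorted(set(re.findall(r'\w+', text.lower())))


def all_keywords_filter(keywords):
    """Match documents whose keyword array contains every given keyword."""
    return {'documenttextfield_keywords': {'$all': [k.lower() for k in keywords]}}


def backfill_keywords(collection):
    """Hypothetical one-off migration; `collection` is assumed to be a pymongo Collection."""
    for doc in collection.find({}, {'documenttextfield': 1}):
        words = extract_keywords(doc.get('documenttextfield') or '')
        collection.update_one(
            {'_id': doc['_id']},
            {'$set': {'documenttextfield_keywords': words}},
        )
    # A normal index on an array field becomes a multikey index.
    collection.create_index('documenttextfield_keywords')


print(extract_keywords('Word1 and word2, then word1 again.'))
print(all_keywords_filter(['word1', 'word2']))
```

After the backfill, collection.find(all_keywords_filter(keywords)) would replace the regex query. The array has to be kept in sync whenever documenttextfield changes.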
