
Adding A Single Character To Add Keys In Counter

If the type of a Counter object's keys is str, i.e.: I could do this:

>>> vocab_counter = Counter('the lazy fox jumps over the brown dog'.split())
>>> vocab_coun

Solution 1:

A better way is to add the character before creating the Counter object. You can do that with a generator expression inside the Counter call:

In [15]: vocab_counter = Counter(w + u"\uE000" for w in "the lazy fox jumps over the brown dog".split())

In [16]: vocab_counter
Out[16]: Counter({'the\ue000': 2, 'fox\ue000': 1, 'dog\ue000': 1, 'jumps\ue000': 1, 'lazy\ue000': 1, 'over\ue000': 1, 'brown\ue000': 1})
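For reuse outside an interactive session, the same idea can be wrapped in a small function. This is only a sketch: the names `count_with_suffix` and `END_OF_WORD` are made up for illustration and are not part of `collections`.

```python
from collections import Counter

# Same private-use character as in the answer above
END_OF_WORD = "\uE000"


def count_with_suffix(words, suffix=END_OF_WORD):
    """Hypothetical helper: append `suffix` to every word, then count."""
    return Counter(word + suffix for word in words)


vocab_counter = count_with_suffix("the lazy fox jumps over the brown dog".split())
print(vocab_counter["the" + END_OF_WORD])
```

Since the suffix is added before counting, the Counter never holds the bare words, so nothing has to be renamed afterwards.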

If you can't modify the words before creating the Counter, you can subclass Counter so that the special character is added whenever a key is set.

Solution 2:

The shortest way I have used is:

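from collections import Counter

vocab_counter = Counter("the lazy fox jumps over the brown dog".split())
# Iterate over a snapshot of the keys, because the loop body mutates the Counter
for key in list(vocab_counter):
    vocab_counter[key + u"\uE000"] = vocab_counter.pop(key)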
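If mutating the Counter in place is not a requirement, another option is to build a new Counter from the old one's items, which avoids changing a dict while walking over it. A minimal sketch, where the name `suffixed` is only illustrative:

```python
from collections import Counter

vocab_counter = Counter("the lazy fox jumps over the brown dog".split())

# Build a fresh Counter whose keys carry the suffix; counts are copied as-is
suffixed = Counter({key + "\uE000": count for key, count in vocab_counter.items()})

print(suffixed["the\uE000"])
```

The original `vocab_counter` is left untouched, at the cost of briefly holding two copies of the counts in memory.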

Solution 3:

The only other optimised way I can think of is to use a subclass of Counter that appends the character when the key is inserted:

from collections import Counter


class CustomCounter(Counter):
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(u"\uE000"):
            key += u"\uE000"
        super(CustomCounter, self).__setitem__(key, self.get(key, 0) + value)

Demo:

>>> CustomCounter("the lazy fox jumps over the brown dog".split())
CustomCounter({u'the\ue000': 2, u'fox\ue000': 1, u'brown\ue000': 1, u'jumps\ue000': 1, u'dog\ue000': 1, u'over\ue000': 1, u'lazy\ue000': 1})
# With both args and kwargs
>>> CustomCounter("the lazy fox jumps over the brown dog".split(), **{'the': 1, 'fox': 3})
CustomCounter({u'fox\ue000': 4, u'the\ue000': 3, u'brown\ue000': 1, u'jumps\ue000': 1, u'dog\ue000': 1, u'over\ue000': 1, u'lazy\ue000': 1})
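One consequence of this design is worth keeping in mind: because `__setitem__` adds `value` to whatever is already stored under the suffixed key, a plain assignment accumulates instead of overwriting. The sketch below repeats the class in Python 3 syntax so it runs on its own; `SUFFIX` is just a name chosen here for the character.

```python
from collections import Counter

SUFFIX = "\uE000"


class CustomCounter(Counter):
    def __setitem__(self, key, value):
        # Append the marker unless the key already ends with it
        if len(key) > 1 and not key.endswith(SUFFIX):
            key += SUFFIX
        # Note: adds to the stored count rather than replacing it
        super().__setitem__(key, self.get(key, 0) + value)


counter = CustomCounter("the lazy fox".split())
counter["fox"] = 5  # stored under "fox" + SUFFIX, on top of the existing count of 1

print(counter["fox" + SUFFIX])  # 6
print(counter["fox"])           # 0, the bare key is never stored
```

Lookups therefore need the suffixed key, and code that expects `counter[key] = n` to replace the count would have to account for the additive behaviour.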

Solution 4:

You could do it with string manipulations:

text = 'the lazy fox jumps over the brown dog'
Counter((text + ' ').replace(' ', '_abc ').strip().split())
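The `replace` approach relies on words being separated by exactly one space; with a double space or a tab it would yield a stray `_abc` key or miss a word. A possible variant, assuming the standard `re` module is acceptable here, appends the suffix to every run of non-whitespace characters instead:

```python
import re
from collections import Counter

# Deliberately irregular whitespace: a double space and a tab
text = "the  lazy fox\tjumps over the brown dog"

# Append "_abc" to each run of non-whitespace characters
suffixed_text = re.sub(r"(\S+)", r"\1_abc", text)
vocab_counter = Counter(suffixed_text.split())

print(vocab_counter["the_abc"])
```

The `_abc` suffix is the same placeholder used in the answer above; any other marker, such as `\uE000`, could be substituted in the replacement string.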
