
How To Count Number Of Unique Lists Within List?

I've tried using Counter and itertools, but since a list is unhashable, they don't work. My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ] I would like to know that the list [

Solution 1:

>>> from collections import Counter
>>> li = [ [1,2,3], [2,3,4], [1,2,3] ]
>>> Counter(str(e) for e in li)
Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})
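Note that the keys of this Counter are strings, not lists. If the original lists are needed back, one possible sketch is to parse each key with ast.literal_eval. This assumes the sublists contain only Python literals (numbers, strings, nested lists or tuples); the name counts_as_lists is purely illustrative.

```python
import ast
from collections import Counter

li = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]

# Count by string representation; keys are strings such as '[1, 2, 3]'.
counts = Counter(str(e) for e in li)

# Turn each string key back into a list, paired with its count.
# ast.literal_eval only accepts Python literals, so this assumes the
# sublists hold plain numbers, strings, or nested lists/tuples.
counts_as_lists = [(ast.literal_eval(key), n) for key, n in counts.items()]
print(counts_as_lists)
```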

The method that you state also works, as long as there are no nested mutables in each sublist (such as [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]):

>>> Counter(tuple(e) for e in li)
Counter({(1, 2, 3): 2, (2, 3, 4): 1})

If you do have other unhashable types nested in the sublists, use the str or repr method, since that handles all nested sublists as well. Or recursively convert everything to tuples (more work).
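The recursive conversion could look roughly like the sketch below. The helper name to_hashable is hypothetical, and it only handles lists and tuples; other unhashable types such as dicts or sets would need their own branches.

```python
from collections import Counter

def to_hashable(obj):
    """Recursively convert lists (and tuples) into tuples.

    Anything that is not a list or tuple is returned unchanged, so it
    must already be hashable for Counter to accept it.
    """
    if isinstance(obj, (list, tuple)):
        return tuple(to_hashable(x) for x in obj)
    return obj

nested = [[1, 2, 3], [2, 3, 4, [11, 12]], [1, 2, 3]]
nested_counts = Counter(to_hashable(e) for e in nested)
print(nested_counts)
# Counter({(1, 2, 3): 2, (2, 3, 4, (11, 12)): 1})
```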

Solution 2:

I think using the Counter class on tuples, like

Counter(tuple(item) for item in li)

will be optimal in terms of elegance and "Pythonicity": it's probably the shortest solution, it's perfectly clear what you want to achieve and how it's done, and it combines standard tools (and thus avoids reinventing the wheel).

The only performance drawback I can see is that every sublist has to be converted to a tuple (in order to be hashable), which more or less means that all elements of all sublists have to be copied once. Also, the internal hash function on tuples may be suboptimal if you know that the list elements will, for example, always be integers.

In order to improve on performance, you would have to

  • Implement some kind of hash algorithm working directly on lists (more or less reimplementing the hashing of tuples but for lists)
  • Somehow reimplement the Counter class in order to use this hash algorithm and provide some suitable output (this class would probably use a dictionary using the hash values as key and a combination of the "original" list and the count as value)

At least the first step would need to be done in C/C++ in order to match the speed of the internal hash function. If you know the type of the list elements you could probably even improve the performance.
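To make the two steps above concrete, here is a pure-Python sketch of the data structure only. As noted, a Python version like this is not expected to outperform Counter on tuples. The names list_hash and count_lists and the polynomial hash are assumptions made for illustration, not an existing API.

```python
def list_hash(lst):
    """Illustrative polynomial hash computed directly over list elements.

    Assumes every element is itself hashable (e.g. integers).
    """
    h = 17
    for x in lst:
        # Keep the value within 64 bits, as a C implementation would.
        h = (h * 31 + hash(x)) & 0xFFFFFFFFFFFFFFFF
    return h

def count_lists(lists):
    """Count equal lists without converting them to tuples.

    Maps hash value -> list of [original_list, count] entries, so lists
    whose hashes collide are still told apart by an equality check.
    """
    table = {}
    for lst in lists:
        bucket = table.setdefault(list_hash(lst), [])
        for entry in bucket:
            if entry[0] == lst:
                entry[1] += 1
                break
        else:
            # No equal list seen yet under this hash value.
            bucket.append([lst, 1])
    return [(entry[0], entry[1]) for bucket in table.values() for entry in bucket]

result = count_lists([[1, 2, 3], [2, 3, 4], [1, 2, 3]])
print(result)
# [([1, 2, 3], 2), ([2, 3, 4], 1)]
```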

As for the Counter class, I do not know whether its standard implementation is in Python or in C; if the latter is the case, you'll probably also have to reimplement it in C in order to achieve the same (or better) performance.

So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.

Solution 3:

ll = [ [1,2,3], [2,3,4], [1,2,3] ]
print(len(set(map(tuple, ll))))

Also, if you wanted to count the occurrences of a unique* list:

print(ll.count([1,2,3]))

*Unique by value, not by reference.

Solution 4:

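data = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]  # renamed from `list` to avoid shadowing the built-in
repeats = []
unique = 0
for i in data:
    count = 0
    if i not in repeats:  # skip sublists already recorded as repeats
        for i2 in data:
            if i == i2:
                count += 1
        if count > 1:
            repeats.append(i)
        elif count == 1:
            unique += 1

print("Repeated Items")
for r in repeats:
    print(r, end=" ")
print("\nUnique items:", unique)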

This loops through the list to find repeated sublists, skipping any that have already been recorded in repeats, and counts the sublists that occur exactly once.
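The nested loop compares every sublist with every other one, so the work grows quadratically with the number of sublists. For comparison, the same report can probably be produced in a single pass with Counter on tuples, assuming the sublists contain only hashable elements. The variable names below are illustrative.

```python
from collections import Counter

data = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]
counts = Counter(tuple(item) for item in data)

# Sublists that occur more than once, converted back to lists.
repeats = [list(key) for key, n in counts.items() if n > 1]
# Number of sublists that occur exactly once.
unique = sum(1 for n in counts.values() if n == 1)

print("Repeated items:", repeats)
print("Unique items:", unique)
```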
