
Count Of The Number Of Identical Values In Two Arrays For All The Unique Values In An Array

I have two arrays, A and B. A holds multiple values (strings, integers, or floats), and B holds only the values 0 and 1. I need, for each unique value in A, the count of points

Solution 1:

Selective use of np.bincount should do the trick:

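import numpy as np

# Au: sorted unique values of A; Ai: for each element of A, its index into Au
Au, Ai = np.unique(A, return_inverse=True)

out = np.empty((2, Au.size))
# Weighted counts per unique value: row 0 counts B == 0, row 1 counts B == 1
out[0] = np.bincount(Ai, weights=1 - np.array(B), minlength=Au.size)
out[1] = np.bincount(Ai, weights=np.array(B), minlength=Au.size)

# Map (unique value, B value) -> count
outdict = {}

for i in range(Au.size):
    for j in [0, 1]:
        outdict[(Au[i], j)] = out[j, i]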
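
For reference, here is a minimal self-contained sketch of the bincount approach. The sample arrays and the helper name count_by_label are assumptions made for illustration, not part of the original question. It relies on np.unique(..., return_inverse=True), which maps every element of A to the position of its unique value, so np.bincount can tally B per unique value.

```python
import numpy as np

def count_by_label(A, B):
    """Hypothetical helper: return {(value, label): count} for labels 0 and 1."""
    A = np.asarray(A)
    B = np.asarray(B)
    # values: sorted unique entries of A
    # inverse: for each element of A, the index of its value in `values`
    values, inverse = np.unique(A, return_inverse=True)
    inverse = inverse.ravel()
    # Summing B per group counts the ones; summing 1 - B counts the zeros
    ones = np.bincount(inverse, weights=B, minlength=values.size)
    zeros = np.bincount(inverse, weights=1 - B, minlength=values.size)
    result = {}
    for i, v in enumerate(values):
        result[(v.item(), 0)] = int(zeros[i])
        result[(v.item(), 1)] = int(ones[i])
    return result

# Sample data (assumed for illustration)
A = [1, 1, 3, 2, 2, 1, 1, 3, 3]
B = [0, 0, 0, 1, 1, 1, 0, 1, 0]
counts = count_by_label(A, B)
print(counts)
```

Note that if A mixes strings and numbers, np.asarray will convert everything to strings, so the keys of the resulting dict would be strings as well.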

Solution 2:

It's much easier to use pandas to do this kind of groupby operation:

In [11]: import pandas as pd

In [12]: df = pd.DataFrame({"A": A, "B": B})

In [13]: df
Out[13]:
   A  B
0  1  0
1  1  0
2  3  0
3  2  1
4  2  1
5  1  1
6  1  0
7  3  1
8  3  0

Now you can use groupby:

In [14]: gb = df.groupby("A")["B"]

In [15]: gb.count()  # number of As
Out[15]:
A
1    4
2    2
3    3
Name: B, dtype: int64

In [16]: gb.sum()  # number of As where B == 1
Out[16]:
A
1    1
2    2
3    1
Name: B, dtype: int64

In [17]: gb.count() - gb.sum()  # number of As where B == 0
Out[17]:
A
1    3
2    0
3    2
Name: B, dtype: int64

You can also do this more explicitly and more generally (e.g. if it's not just 0 and 1) with an apply:

In [18]: gb.apply(lambda x: (x == 1).sum())
Out[18]:
A
1    1
2    2
3    1
Name: B, dtype: int64
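
If you want the counts for every value of B at once, one possible alternative (not part of the original answer) is pd.crosstab, which builds a table with one row per unique A and one column per unique B. The sketch below assumes sample arrays consistent with the output shown above.

```python
import pandas as pd

# Sample data (assumed for illustration)
A = [1, 1, 3, 2, 2, 1, 1, 3, 3]
B = [0, 0, 0, 1, 1, 1, 0, 1, 0]
df = pd.DataFrame({"A": A, "B": B})

# Rows are the unique values of A, columns are the unique values of B,
# and each cell holds the number of rows with that (A, B) combination
table = pd.crosstab(df["A"], df["B"])
print(table)

# Look up a single count, e.g. how many rows have A == 1 and B == 0
n_a1_b0 = int(table.loc[1, 0])
```

Unlike the count/sum trick, this does not depend on B being restricted to 0 and 1.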
