How To Extract Arrays From An Arranged Numpy Array?
This is a follow-up to the post "How to extract rows from an numpy array based on the content?". I used the following code to split rows based on the content in the column
Solution 1:
Here's an approach that treats the pair of elements in the first two columns of each row as an indexing tuple -
import numpy as np
# Convert to linear index equivalents
lidx = np.ravel_multi_index(arr[:,:2].T,arr[:,:2].max(0)+1)
# Get sorted indices of lidx. Using those get shifting indices.
# Split along sorted input array along axis=0 using those.
sidx = lidx.argsort()
out = np.split(arr[sidx],np.unique(lidx[sidx],return_index=True)[1][1:])
Sample run -
In [34]: arr
Out[34]:
array([[2, 7, 5],
       [3, 4, 6],
       [2, 3, 5],
       [2, 7, 7],
       [4, 4, 7],
       [3, 4, 6],
       [2, 8, 5]])
In [35]: out
Out[35]:
[array([[2, 3, 5]]),
 array([[2, 7, 5],
        [2, 7, 7]]),
 array([[2, 8, 5]]),
 array([[3, 4, 6],
        [3, 4, 6]]),
 array([[4, 4, 7]])]
For details on converting a group of elements into an indexing tuple, please refer to this post.
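The steps above can be put together as a self-contained sketch. The function name `split_by_first_two_columns` is made up for illustration, and the sketch assumes a non-empty 2D array whose first two columns hold non-negative integers (which `np.ravel_multi_index` requires). It also uses a stable sort, which the snippet above does not, so that rows keep their original order inside each group.

```python
import numpy as np


def split_by_first_two_columns(arr):
    """Group rows of `arr` that share the same values in columns 0 and 1.

    Hypothetical helper; assumes non-negative integer keys.
    """
    keys = arr[:, :2]
    # One linear index per (col0, col1) pair.
    dims = tuple(int(d) for d in keys.max(axis=0) + 1)
    lidx = np.ravel_multi_index(tuple(keys.T), dims)
    # Stable sort keeps the original row order inside each group.
    sidx = lidx.argsort(kind="stable")
    # The first occurrence of each distinct key in sorted order marks a cut.
    cuts = np.unique(lidx[sidx], return_index=True)[1][1:]
    return np.split(arr[sidx], cuts)


arr = np.array([[2, 7, 5],
                [3, 4, 6],
                [2, 3, 5],
                [2, 7, 7],
                [4, 4, 7],
                [3, 4, 6],
                [2, 8, 5]])

groups = split_by_first_two_columns(arr)
for g in groups:
    print(g.tolist())
```

With the sample array this should give the same five groups as the sample run.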
Solution 2:
The numpy_indexed package (disclaimer: I am its author) contains functionality to efficiently perform this type of operation:
import numpy_indexed as npi
npi.group_by(a[:, :2]).split(a)
It has decent test coverage, so I'd be surprised if it tripped on your seemingly straightforward test case.
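If installing an extra package is not an option, a rough plain-NumPy stand-in for the same grouping might look like the sketch below. This is not how numpy_indexed is implemented; it swaps in `np.unique` with `axis=0` and `return_inverse=True`, and the name `group_rows_by_key` is made up for illustration. Unlike the linear-index approach, it should also cope with negative key values.

```python
import numpy as np


def group_rows_by_key(a, ncols=2):
    """Split `a` into groups of rows sharing the same first `ncols` columns.

    Hypothetical helper, not part of numpy_indexed.
    """
    keys = a[:, :ncols]
    # `inverse` labels every row with the index of its unique key row.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = np.asarray(inverse).ravel()  # guard against a 2D inverse
    # Stable sort gathers equal labels while keeping original row order.
    order = np.argsort(inverse, kind="stable")
    # Group sizes give the cut points for np.split.
    cuts = np.cumsum(np.bincount(inverse))[:-1]
    return np.split(a[order], cuts)


a = np.array([[2748309, 246211, 1],
              [2748309, 246211, 2],
              [2747481, 246201, 54]])

key_groups = group_rows_by_key(a)
for g in key_groups:
    print(g.tolist())
```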
Solution 3:
If I apply that split line directly to your array, I get your result: an empty array plus the original.
In [136]: np.split(a,np.unique(a[:,1],return_index=True)[1][1:])
Out[136]:
[array([], shape=(0, 3), dtype=int32),
 array([[2748309, 246211, 1],
        [2748309, 246211, 2],
        [2747481, 246201, 54]])]
But if I first sort the array on the 2nd column, as specified in the linked answer, I get the desired answer, with the 2 arrays switched:
In [141]: sorted_a=a[np.argsort(a[:,1])]
In [142]: sorted_a
Out[142]:
array([[2747481, 246201, 54],
       [2748309, 246211, 1],
       [2748309, 246211, 2]])
In [143]: np.split(sorted_a,np.unique(sorted_a[:,1],return_index=True)[1][1:])
Out[143]:
[array([[2747481, 246201, 54]]),
 array([[2748309, 246211, 1],
        [2748309, 246211, 2]])]
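To see why the sort matters: `np.unique(..., return_index=True)` reports first-occurrence indices in the order of the sorted unique values, so on an unsorted column those indices need not be increasing, and `np.split` then cuts in the wrong places. Here is a small sketch that shows the cut points and wraps the sort-then-split steps in a helper. The name `split_by_column` is made up for illustration, and the stable sort is my own choice so that tied rows keep their original order.

```python
import numpy as np


def split_by_column(a, col):
    """Sort `a` by column `col`, then split it where that column changes.

    Hypothetical helper for illustration.
    """
    order = np.argsort(a[:, col], kind="stable")
    sorted_a = a[order]
    # On sorted data the first-occurrence indices are increasing,
    # so they are valid cut points for np.split.
    cuts = np.unique(sorted_a[:, col], return_index=True)[1][1:]
    return np.split(sorted_a, cuts)


a = np.array([[2748309, 246211, 1],
              [2748309, 246211, 2],
              [2747481, 246201, 54]])

# Unsorted: the smaller value 246201 first appears at row 2 and 246211 at
# row 0, so the indices come back as [2, 0] and the only cut left is at 0.
unsorted_cuts = np.unique(a[:, 1], return_index=True)[1][1:]
print(unsorted_cuts.tolist())

groups = split_by_column(a, 1)
for g in groups:
    print(g.tolist())
```

A cut at index 0 is what produces the empty first array followed by the whole original array.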