Numpy: Selecting N Points Every M Points
Solution 1:
To select 10 elements off each block of 30 elements, we can simply reshape into 2D and slice out the first 10 columns from each row -
a.reshape(-1,30)[:,:10]
The benefit is that the output is a view into the input, so it is virtually free and carries no extra memory overhead. A sample run to confirm this -
In [43]: np.random.seed(0)
In [44]: a = np.random.randint(0,9,(1,300))
In [48]: np.shares_memory(a,a.reshape(-1,30)[:,:10])
Out[48]: True
If you need a flattened version, use .ravel() -
a.reshape(-1,30)[:,:10].ravel()
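To make the view-versus-copy distinction concrete, here is a minimal sketch (the names view2d and flat are illustrative, not from the original answer). It assumes a contiguous 1D input, so that reshape can return a view:

```python
import numpy as np

a = np.arange(300)

# 2D slice: a view into a, no data is copied
view2d = a.reshape(-1, 30)[:, :10]
print(np.shares_memory(a, view2d))   # True

# Flattening the non-contiguous slice forces a copy
flat = view2d.ravel()
print(np.shares_memory(a, flat))     # False

# Writing through the view changes the original array,
# but not the copy that was taken earlier
view2d[0, 0] = -1
print(a[0])      # -1
print(flat[0])   # 0
```

Because the view aliases the input, any in-place change to one is visible in the other; take an explicit .copy() if that is not wanted.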
Timings -
In [38]: a = np.random.randint(0,9,(300))
# @sacul's soln
In [39]: %%timeit
    ...: msk = [True] * 10 + [False] * 20
    ...: out = a[np.tile(msk,len(a)//len(msk))]
100000 loops, best of 3: 7.6 µs per loop
# From this post
In [40]: %timeit a.reshape(-1,30)[:,:10].ravel()
1000000 loops, best of 3: 1.07 µs per loop
In [41]: a = np.random.randint(0,9,(3000000))
# @sacul's soln
In [42]: %%timeit
    ...: msk = [True] * 10 + [False] * 20
    ...: out = a[np.tile(msk,len(a)//len(msk))]
100 loops, best of 3: 3.66 ms per loop
# From this post
In [43]: %timeit a.reshape(-1,30)[:,:10].ravel()
100 loops, best of 3: 2.32 ms per loop
# If you are okay with `2D` output, it is virtually free
In [44]: %timeit a.reshape(-1,30)[:,:10]
1000000 loops, best of 3: 519 ns per loop
Generic case with 1D array
A. No. of elements being multiple of block length
For a 1D array a with number of elements being a multiple of n, to select m elements off each block of n elements and get a 1D array output, we would have:
a.reshape(-1,n)[:,:m].ravel()
Note that the ravel() flattening step makes a copy. So, if possible, keep the unflattened 2D version for memory efficiency.
Sample run -
In [59]: m,n = 2,5
In [60]: N = 25
In [61]: a = np.random.randint(0,9,(N))
In [62]: a
Out[62]:
array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1, 5, 8, 4,
3, 0, 3])
# Select 2 elements off each block of 5 elements
In [63]: a.reshape(-1,n)[:,:m].ravel()
Out[63]: array([5, 0, 3, 5, 6, 8, 7, 7, 8, 4])
B. Generic no. of elements
We would leverage np.lib.stride_tricks.as_strided, inspired by this post, to select m elements off each block of n elements -
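import numpy as np
def skipped_view(a, m, n):
    # View a as rows of n elements (the last row may run past the end of a),
    # then keep the first m columns of each row.
    s = a.strides[0]
    strided = np.lib.stride_tricks.as_strided
    shp = ((a.size+n-1)//n,n)
    return strided(a,shape=shp,strides=(n*s,s), writeable=False)[:,:m]
def slice_m_everyn(a, m, n):
    a_slice2D = skipped_view(a,m,n)
    # Number of valid elements contributed by the trailing partial block
    extra = min(m,len(a)-n*(len(a)//n))
    L = m*(len(a)//n) + extra
    # Flatten (this copies) and drop anything read from beyond the input
    return a_slice2D.ravel()[:L]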
Note that skipped_view gets us a view into the input array, and possibly into a memory region not assigned to the input array. After that we flatten and slice to restrict it to the desired output, and that step makes a copy.
Sample run -
In [170]: np.random.seed(0)
...: a = np.random.randint(0,9,(16))
In [171]: a
Out[171]: array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7])
# Select 2 elements off each block of 5 elements
In [172]: slice_m_everyn(a, m=2, n=5)
Out[172]: array([5, 0, 3, 5, 6, 8, 7])
In [173]: np.random.seed(0)
...: a = np.random.randint(0,9,(19))
In [174]: a
Out[174]: array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1])
# Select 2 elements off each block of 5 elements
In [175]: slice_m_everyn(a, m=2, n=5)
Out[175]: array([5, 0, 3, 5, 6, 8, 7, 7])
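If reading past the end of the buffer with as_strided is a concern, one alternative is to handle the complete blocks with a plain reshape and append the leftover partial block separately. The sketch below is a swapped-in technique, not the method from the answer above, and slice_m_everyn_safe is a hypothetical name; it assumes a 1D input:

```python
import numpy as np

def slice_m_everyn_safe(a, m, n):
    """Take the first m elements of every block of n from a 1D array."""
    a = np.asarray(a)
    full = (a.size // n) * n                       # length covered by complete blocks
    head = a[:full].reshape(-1, n)[:, :m].ravel()  # first m of each complete block
    tail = a[full:][:m]                            # partial last block, at most m items
    return np.concatenate([head, tail])

a = np.arange(16)
out = slice_m_everyn_safe(a, m=2, n=5)
print(out)   # [ 0  1  5  6 10 11 15]
```

This never touches memory outside the input array, at the cost of one extra concatenate.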
Solution 2:
You could create a mask and index by the mask, repeated until it reaches the length of your array:
msk = [True] * 10 + [False] * 20
arr[np.tile(msk, len(arr)//len(msk))]
Minimal example:
In an array of 30 values, select 1 element, then skip 2 elements:
>>> arr
array([6, 7, 2, 7, 1, 9, 1, 4, 4, 8, 6, 5, 2, 6, 3, 6, 8, 5, 6, 7, 2, 1, 9,
6, 7, 2, 1, 8, 2, 2])
>>> msk = [True] * 1 + [False] * 2
>>> arr[np.tile(msk, len(arr)//len(msk))]
array([6, 7, 1, 8, 2, 6, 6, 1, 7, 8])
Explanation:
msk is a boolean mask:
>>> msk
[True, False, False]
You can then repeat that mask with np.tile until it is the same length as your original array (i.e. repeat it len(arr)//len(msk) times, the length of your array divided by the length of your mask):
>>> np.tile(msk, len(arr)//len(msk))
array([ True, False, False, True, False, False, True, False, False,
True, False, False, True, False, False, True, False, False,
True, False, False, True, False, False, True, False, False,
True, False, False], dtype=bool)
Then it's a simple matter of indexing by a boolean, which numpy excels at.
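The tiled mask only matches the array when len(arr) is an exact multiple of len(msk); otherwise the mask comes out shorter, and recent NumPy versions reject a boolean index whose length differs from the array's. As a rough sketch of one way around this (take_m_skip_rest is a hypothetical helper, not part of the answer above), the mask can be built from each element's position within its block:

```python
import numpy as np

def take_m_skip_rest(arr, m, n):
    """Keep the first m elements of every block of n, for any array length."""
    arr = np.asarray(arr)
    # Position of each element inside its block of n; keep positions 0..m-1
    mask = (np.arange(arr.size) % n) < m
    return arr[mask]

arr = np.arange(32)                       # 32 is not a multiple of 3
out = take_m_skip_rest(arr, m=1, n=3)
print(out)   # [ 0  3  6  9 12 15 18 21 24 27 30]
```

The mask always has exactly arr.size entries, so a trailing partial block is handled without special cases.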
Solution 3:
IIUC
get = 10
skip = 20
k = [item for z in [np.arange(get) + idx for idx in np.arange(0, x.size, skip+get)] for item in z]
Then just index with k:
x[k]
Example:
x = np.arange(100)
x[k]
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93,
94, 95, 96, 97, 98, 99])
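The same index list can be built without a Python-level loop by broadcasting the block starts against the within-block offsets. This is a sketch under the assumption that x is 1D; the name starts and the final bounds filter are additions for illustration, the filter guarding against a last block that would run past the end of x:

```python
import numpy as np

get, skip = 10, 20
x = np.arange(100)

# Start index of every block: 0, 30, 60, 90
starts = np.arange(0, x.size, get + skip)
# Each start plus offsets 0..get-1, flattened to one index array
k = (starts[:, None] + np.arange(get)).ravel()
# Drop indices that fall beyond the end of x
k = k[k < x.size]

out = x[k]
print(out.size)   # 40
```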