3

I have a large 2D NumPy array, and I want to remove subsets of its rows and hand what remains to a function. I need to do this for many subsets, so ideally I would not create a copy of the array each time. The function doesn't change any values in the array.

mat = np.load(filename)
mat_1 = mat[:i,:]
mat_2 = mat[j:,:]

So far, mat_1 and mat_2 are views. Then I would like to do

mat_s = np.concatenate((mat_1,mat_2))
result = func(mat_s)

but without making a copy. Is this possible?

  • Why don't you just use mat[j:i,:]? Commented May 14, 2018 at 12:25
  • @Kasramvd this would be different from what he is doing. Just think of an array of shape (100, 1) with i=50 and j=20. The resulting np.concatenate creates an overlapping resulting array, whereas your mat[j:i,:] does not. Commented May 14, 2018 at 12:29
  • I should have specified that j is larger than i, so that would just return an empty array. Commented May 14, 2018 at 12:30
  • Aha, so if j > i, then np.concatenate does not create an overlapping array, but concats two separated arrays. Still Kasramvd's solution won't work. filippo has shown a good way to deal with that. Commented May 14, 2018 at 12:32
  • @filippo Yes, that's right, it will actually repeat some rows. Commented May 14, 2018 at 12:49

2 Answers

3

Since a memory view can only be described by a fixed set of strides (one step size per axis), you will have to create a copy in your case, where mat.shape[0] > j > i.

That means a view only works if you want every x-th element of the array:

import numpy as np

mat = np.arange(20)
view = mat[0:20:4]  # every 4th element; same as mat[slice(0, 20, 4)]
view
# array([ 0,  4,  8, 12, 16])

So this only works for views of equally spaced cells. If you want a single view that covers one contiguous slice(0, i) and another contiguous slice(j, mat.shape[0]), it won't work, because the gap between the two blocks cannot be expressed as a constant stride. You'll have to make a copy.
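To check this on a small example, here is a minimal sketch using np.shares_memory. The array contents and the cut points i and j are made-up stand-ins for the data in the question, not values from it.

```python
import numpy as np

# Small stand-in for the loaded array; i and j are hypothetical cut points.
mat = np.arange(20).reshape(10, 2)
i, j = 3, 7

mat_1 = mat[:i, :]
mat_2 = mat[j:, :]
mat_s = np.concatenate((mat_1, mat_2))

# Both slices reuse mat's buffer, but the concatenated result does not.
slices_are_views = np.shares_memory(mat, mat_1) and np.shares_memory(mat, mat_2)
concat_is_copy = not np.shares_memory(mat, mat_s)

# An equally spaced slice stays a view: one stride per axis describes it.
strided = mat[::4, :]
strided_is_view = np.shares_memory(mat, strided)

print(slices_are_views, concat_is_copy, strided_is_view)
```

The strided slice only changes the row stride (strided.strides[0] is four times mat.strides[0]), which is why it needs no new buffer.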


2 Comments

Thank you! I didn't think to mention it, but the order of the elements in the array is not important, so this works nicely.
Great, I'm glad I was able to help!
-1

You can delete the rows that you want to remove and pass the result directly into the function:

mat = np.load(filename)
mat_s = np.delete(mat,list(range(i,j)),axis=0)

You can delete two different subsets by concatenating the lists of row indices:

mat_s = np.delete(mat,list(range(i,j))+list(range(k,l)),axis=0)

The above removes the rows i:j and k:l.
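As a quick check of what np.delete hands back, the sketch below compares it with the slice-and-concatenate approach from the question. The array and the values of i and j are illustrative assumptions only.

```python
import numpy as np

# Small stand-in for the loaded array; i and j are hypothetical cut points.
mat = np.arange(20).reshape(10, 2)
i, j = 3, 7

# Remove rows i:j, once with an index list and once with a slice object.
deleted_list = np.delete(mat, list(range(i, j)), axis=0)
deleted_slice = np.delete(mat, slice(i, j), axis=0)

# Same rows as concatenating the two slices from the question.
expected = np.concatenate((mat[:i, :], mat[j:, :]))
same_rows = np.array_equal(deleted_list, expected) and np.array_equal(deleted_slice, expected)

# The result is a new array that does not share memory with mat.
delete_is_copy = not np.shares_memory(mat, deleted_list)

print(same_rows, delete_is_copy)
```

So np.delete is a convenient way to express the row removal, but it does not avoid the copy.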

1 Comment

This method also returns a copy of the array.
