I have a large NumPy matrix act with dtype=np.float32 and two vectors of the same length, raw_id and raw_label. I want to sort all three objects by the values in raw_id, but I get a MemoryError when running the script. I've isolated the problem to act[sortind,:] in the function below. How can I reduce the memory usage?

The array act is roughly 1400000 x 400, whereas raw_id and raw_label are each 1400000 x 1 with dtype=np.float64. Together with the other variables I have initialised, act only just fits into my 12 GB of memory.
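For reference, a rough size estimate under the shapes and dtypes stated above (4 bytes per float32, 8 bytes per float64). The variable names are only illustrative. Since indexing with an integer array returns a copy in NumPy, act[sortind,:] needs a second buffer as large as act while the original is still alive:

```python
# Shapes taken from the question; itemsizes are the standard sizes
# of float32 (4 bytes) and float64 (8 bytes).
rows, cols = 1_400_000, 400
FLOAT32_BYTES = 4
FLOAT64_BYTES = 8

act_bytes = rows * cols * FLOAT32_BYTES   # the big matrix
vec_bytes = rows * FLOAT64_BYTES          # raw_id or raw_label, each

# Fancy indexing (act[sortind, :]) allocates a full copy of act,
# so both buffers exist at the same time during the sort.
peak_act_bytes = 2 * act_bytes

print(f"act:            {act_bytes / 1e9:.2f} GB")
print(f"each vector:    {vec_bytes / 1e6:.1f} MB")
print(f"act + its copy: {peak_act_bytes / 1e9:.2f} GB")
```

So the two vectors are negligible; the extra copy of act is what matters.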

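import numpy as np

def sort_by_id(act, raw_id, raw_label):
    # indices that put raw_id in ascending order
    sortind = np.argsort(raw_id)
    # act[sortind,:] is where the memory error occurs
    return act[sortind,:], raw_id[sortind], raw_label[sortind]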

# calling function with same variables
act, raw_id, raw_label = sort_by_id(act, raw_id, raw_label)
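One possible way to lower the peak memory, shown here only as a minimal sketch and not as a tested solution for the full-size data: reorder act in place a few columns at a time, so the temporary copy covers only one block of columns instead of the whole matrix. The function name sort_by_id_inplace and the block_cols parameter are hypothetical, and the approach assumes it is acceptable to overwrite act.

```python
import numpy as np

def sort_by_id_inplace(act, raw_id, raw_label, block_cols=16):
    """Reorder the rows of act in place, block_cols columns at a time.

    Hypothetical helper: overwrites act instead of returning a new matrix.
    """
    sortind = np.argsort(raw_id)
    for start in range(0, act.shape[1], block_cols):
        stop = start + block_cols
        # The right-hand side is fully evaluated into a temporary of shape
        # (n_rows, block_cols) before being written back, so the result is
        # correct and the extra memory is limited to one column block.
        act[:, start:stop] = act[sortind, start:stop]
    # The vectors are small, so ordinary fancy-index copies are fine here.
    return act, raw_id[sortind], raw_label[sortind]

# Small demonstration with made-up data
act = np.arange(12, dtype=np.float32).reshape(4, 3)
raw_id = np.array([3.0, 1.0, 2.0, 0.0])
raw_label = np.array([30.0, 10.0, 20.0, 0.0])

expected = act[np.argsort(raw_id)].copy()  # reference result via a full copy
act, raw_id, raw_label = sort_by_id_inplace(act, raw_id, raw_label, block_cols=2)
```

A smaller block_cols lowers the temporary memory further at the cost of more loop iterations. For a C-ordered array the column slices are strided, so this is likely slower than a single fancy-index copy.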