Currently I am accessing multiple slices as follows:

First, I allocate an array that will be reassigned many times:

X = np.zeros( (batch_size, window, 5) )

This is the assignment loop that will be run multiple times (batch_indices has different indices each time but the same shape):

for i, b in enumerate(batch_indices):
    X[i] = Xs[b:b+window]

Is there a more efficient way? I feel like there should be syntax similar to:

X = Xs[ [slice(b,b+window) for b in batch_indices] ]

Xs is 2-dimensional, but the final X should be a 3-dimensional np.array. Think of it as follows: Xs is one long multi-dimensional time series, and X needs to be a numpy array containing many slices of that time series.

  • Have you looked into numpy iterations? Commented Jul 17, 2017 at 20:04
  • Have you tried boolean indexes? Worth a performance check... Commented Jul 17, 2017 at 20:09
  • Did the posted solution work for you? Commented Jul 25, 2017 at 11:58

1 Answer

Approach #1

One vectorized approach is to build all the sliding-window indices with broadcasting and index into Xs with them, like so -

X = Xs[np.asarray(batch_indices)[:,None] + np.arange(window)]
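
To make the broadcasting step concrete, here is a minimal sketch that checks the one-liner against the original loop. The array sizes and the values in batch_indices are made up for illustration, and every start index is assumed to satisfy b + window <= len(Xs).

```python
import numpy as np

# Hypothetical example data: a 2-D time series with 20 steps and 5 features.
Xs = np.arange(100).reshape(20, 5)
window = 4
batch_indices = [0, 3, 10]  # assumed start positions, each <= len(Xs) - window

# A column of start indices plus a row of offsets broadcasts to
# an index array of shape (batch_size, window).
idx = np.asarray(batch_indices)[:, None] + np.arange(window)

# Integer-array indexing along axis 0 yields shape (batch_size, window, 5).
X = Xs[idx]

# Reference result from the loop in the question, for comparison.
X_loop = np.zeros((len(batch_indices), window, 5))
for i, b in enumerate(batch_indices):
    X_loop[i] = Xs[b:b + window]
```

Note that this allocates a new X on every call, plus a temporary index array of shape (batch_size, window). To keep reusing the preallocated array from the question, assigning with X[:] = Xs[idx] should work as well.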

Approach #2

A more memory-efficient approach is to create a sliding-window view with np.lib.stride_tricks.as_strided, which avoids building the sliding-window index array used in the previous approach, and then simply index that view with batch_indices, like so -

X = strided_axis0(Xs,window)[np.asarray(batch_indices)]

The strides-based function strided_axis0 is from here.
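
Since the linked function is not reproduced in this post, the following is only a sketch of what such a helper might look like, assuming a 2-D input. The name strided_axis0_sketch and the example data are hypothetical, and it is not necessarily identical to the linked strided_axis0.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def strided_axis0_sketch(a, window):
    """Hypothetical stand-in: read-only view of all length-`window` slices along axis 0."""
    n_windows = a.shape[0] - window + 1
    if n_windows < 1:
        raise ValueError("window is longer than the first axis of the array")
    s0, s1 = a.strides
    # Reusing the row stride s0 for the first two axes makes consecutive
    # windows overlap in memory without copying any data.
    return as_strided(a, shape=(n_windows, window, a.shape[1]),
                      strides=(s0, s0, s1), writeable=False)


# Hypothetical example data: 20 time steps, 5 features.
Xs = np.arange(100, dtype=float).reshape(20, 5)
window = 4
batch_indices = [0, 3, 10]

windows = strided_axis0_sketch(Xs, window)  # shape (17, 4, 5), shares memory with Xs
X = windows[np.asarray(batch_indices)]      # fancy indexing copies only the chosen windows
```

The view itself costs no extra memory; only the final fancy-indexing step copies data, and it copies just the selected windows. Passing writeable=False guards against accidental writes through the overlapping view.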
