I have a nested for loop:

import numpy as np

ccounter = np.zeros(shape=(120, 200))
lat_idx = np.random.randint(120, size=4800)
lon_idx = np.random.randint(200, size=(4800, 4800))
for j in range(4800):
    for i in range(4800):
        ccounter[lat_idx[i], lon_idx[i, j]] += 1

This is obviously very slow. Is it possible to avoid the for loops and implement it as, e.g., a matrix operation?

  • what do you want to achieve? Commented Nov 22, 2019 at 11:11
  • @warped I want to count how often each combination of lat_idx and lon_idx occurs (the order is important). In the example above I used random values for simplicity. There is probably a more elegant solution. Commented Nov 22, 2019 at 11:15

1 Answer

Here's a vectorized approach with np.bincount -

# Get matrix extents for output (assumes the largest index actually occurs)
k = lon_idx.max() + 1  # 200 for given sample
n = lat_idx.max() + 1  # 120 for given sample

# Get linear index equivalent
lidx = lat_idx[:, None] * k + lon_idx

# Use those indices as bins for binned count. Reshape for final output
out = np.bincount(lidx.ravel(), minlength=n * k).reshape(n, k)
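
As a sanity check, the bincount approach can be compared against the original double loop on a smaller input. The following is only a sketch: the helper name `count_pairs`, the reduced sizes (a 12 x 20 grid with 48 samples), and passing the output shape explicitly instead of deriving it from `.max()` are assumptions made for illustration, not part of the answer above. It also assumes every index lies inside the given shape.

```python
import numpy as np


def count_pairs(lat_idx, lon_idx, shape):
    """Count (lat, lon) index pairs; hypothetical helper for illustration."""
    n_lat, n_lon = shape
    # Flatten each (lat, lon) pair into a single linear index: lat * n_lon + lon.
    # lat_idx[:, None] broadcasts row i of lon_idx against lat_idx[i].
    lidx = lat_idx[:, None] * n_lon + lon_idx
    # One bin per grid cell, then reshape back to the 2D grid.
    return np.bincount(lidx.ravel(), minlength=n_lat * n_lon).reshape(n_lat, n_lon)


# Small random input so the reference loop finishes quickly
rng = np.random.default_rng(0)
lat_small = rng.integers(12, size=48)
lon_small = rng.integers(20, size=(48, 48))

# Reference result from the original nested loop
expected = np.zeros((12, 20), dtype=np.int64)
for j in range(48):
    for i in range(48):
        expected[lat_small[i], lon_small[i, j]] += 1

result = count_pairs(lat_small, lon_small, (12, 20))
assert np.array_equal(result, expected)
```

Passing the shape explicitly keeps the output at a fixed size even if the highest index happens not to occur in the data. Note that `np.bincount` returns integer counts, whereas the `np.zeros` array in the question is floating point.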

To improve the performance a bit more for large arrays, we can leverage numexpr to get lidx -

import numexpr as ne

lidx = ne.evaluate('lat_idx2D*k+lon_idx',{'lat_idx2D':lat_idx[:,None]})
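
If the numexpr step is moved into a function, it may be safer to pass every operand explicitly rather than rely on numexpr looking up names in the calling scope. The sketch below assumes that setup; the function name `linear_index` and the fallback to plain NumPy when numexpr is not installed are illustrative choices, not part of the answer.

```python
import numpy as np

try:
    import numexpr as ne
except ImportError:  # numexpr is optional here; fall back to plain NumPy
    ne = None


def linear_index(lat_idx, lon_idx, k):
    """Return lat * k + lon for every (row, column); hypothetical helper."""
    lat_col = lat_idx[:, None]  # column vector so it broadcasts over lon_idx
    if ne is None:
        return lat_col * k + lon_idx
    # Pass all operands through local_dict so no name lookup outside it is needed
    return ne.evaluate(
        'lat_col * k + lon_idx',
        local_dict={'lat_col': lat_col, 'k': k, 'lon_idx': lon_idx},
    )


# Small usage example with made-up sizes
rng = np.random.default_rng(1)
lat_demo = rng.integers(12, size=48)
lon_demo = rng.integers(20, size=(48, 48))
lidx_demo = linear_index(lat_demo, lon_demo, 20)
counts_demo = np.bincount(lidx_demo.ravel(), minlength=12 * 20).reshape(12, 20)
```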