
In the following example, I initialize a 500x2000x2000 three-dimensional NumPy array named a. At each iteration, a random two-dimensional array r is inserted into array a. This example stands in for a larger piece of code where the r array would be produced by various calculations during each iteration of the for-loop, so one slice along the z dimension of a is computed per iteration.

# ex1_basic.py

import numpy as np
import time

def main():
    tic = time.perf_counter()

    z = 500     # depth
    x = 2000    # rows
    y = 2000    # columns

    a = np.zeros((z, x, y))

    for i in range(z):
        r = np.random.rand(x, y)
        a[i] = r

    toc = time.perf_counter()
    print('elapsed =', round(toc - tic, 2), 'sec')

if __name__ == '__main__':
    main()

The memory usage of this example is profiled with the memory-profiler package. The steps for running the profiler are:

# Run the memory profiler
$ mprof run ex1_basic.py

# Plot the memory profiler results
$ mprof plot
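For line-by-line statistics (not needed for the plots below), memory-profiler also supports decorating the function of interest. This is a minimal sketch based on the package's documented usage; the ex1_lineprof.py filename is hypothetical:

# ex1_lineprof.py -- hypothetical variant of ex1_basic.py that uses
# memory-profiler's line-by-line decorator

import numpy as np
from memory_profiler import profile

@profile
def main():
    z, x, y = 500, 2000, 2000    # depth, rows, columns
    a = np.zeros((z, x, y))
    for i in range(z):
        a[i] = np.random.rand(x, y)

if __name__ == '__main__':
    main()

# Print a per-line memory usage table to the console
$ python ex1_lineprof.py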

The memory usage is plotted below; it increases over time as values are added to the array.

[figure 1: memory usage plot for ex1_basic.py]

I profiled another example where the data type for the array is defined as np.float32. See below for the example code and memory usage plot. This decreases the overall memory use, but the memory still increases with each iteration.

import numpy as np
import time

def main():

    rng = np.random.default_rng()

    tic = time.perf_counter()

    z = 500     # depth
    x = 2000    # rows
    y = 2000    # columns

    a = np.zeros((z, x, y), dtype=np.float32)

    for i in range(z):
        r = rng.standard_normal((x, y), dtype=np.float32)
        a[i] = r

    toc = time.perf_counter()
    print('elapsed =', round(toc - tic, 2), 'sec')

if __name__ == '__main__':
    main()

[figure 2: memory usage plot for the np.float32 example]

Since I initialized array a with np.zeros, I expected the memory usage to remain constant at the size of the block allocated for that array. Instead, the memory usage increases as values are inserted into array a.

So I have two questions related to these examples:

  1. Why does the memory usage increase with time?
  2. How do I create and store array a on disk and only have slice a[i] and the r array in memory at each iteration? Basically, how would I run these examples if the a array did not fit in memory (RAM)?

Update

I ran an example using numpy.memmap but there is no improvement in memory usage. It seems like memmap is still keeping the entire array in memory.

import numpy as np
import time

def main():

    rng = np.random.default_rng()

    tic = time.perf_counter()

    z = 500
    x = 2000
    y = 2000

    a = np.memmap('file.dat', dtype=np.float32, mode='w+', shape=(z, x, y))

    for i in range(z):
        r = rng.standard_normal((x, y), dtype=np.float32)
        a[i] = r

    toc = time.perf_counter()
    print('elapsed =', round(toc - tic, 2), 'sec')

if __name__ == '__main__':
    main()

[figure 3: memory usage plot for the numpy.memmap example]

Comments:

  • This has to do with optimization of sparse, big tensors. As you define values, memory is lazily allocated: stackoverflow.com/questions/27574881/…
  • What happens if you use a = np.ones((z, x, y))?
  • As an aside, what is the point of looping here? Why not a = np.random.rand(x, y, z) or rng.standard_normal((x, y, z), dtype=np.float32)?
  • What is surprising: the peak use, or the gradual increase?
  • @max9111 Can you submit an answer that demonstrates chunk_shape, chunk_cache, and custom object using h5py? You can compare the elapsed time and storage size to the example in the current answer.

1 Answer


Using the h5py package, I can create an hdf5 file that contains a dataset representing the a array. The dset variable plays the role of the a variable discussed in the question. This allows the array to reside on disk, not in memory. The generated hdf5 file is 8 GB on disk, which matches the size of the array of np.float32 values (500 x 2000 x 2000 values x 4 bytes per value = 8e9 bytes). The elapsed time for this approach is similar to the examples discussed in the question, so writing to the hdf5 file appears to have a negligible performance impact.

import numpy as np
import h5py
import time

def main():

    rng = np.random.default_rng()

    tic = time.perf_counter()

    z = 500   # depth
    x = 2000  # rows
    y = 2000  # columns

    # Create the hdf5 file and dataset; the with-block ensures the
    # file is closed once writing is finished.
    with h5py.File('file.hdf5', 'w') as f:
        dset = f.create_dataset('data', shape=(z, x, y), dtype=np.float32)

        for i in range(z):
            r = rng.standard_normal((x, y), dtype=np.float32)
            dset[i, :, :] = r

    toc = time.perf_counter()
    print('elapsed time =', round(toc - tic, 2), 'sec')

    s = np.float32().nbytes * (z * x * y) / 1e9  # bytes to GB, where 1 GB = 1e9 bytes
    print('calculated storage =', s, 'GB')

if __name__ == '__main__':
    main()

Output from running this example on a MacBook Pro with 2.6 GHz 6-Core Intel Core i7 and 32 GB of RAM:

elapsed time = 22.97 sec
calculated storage = 8.0 GB
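To address the second question from the post, slices can later be read back one at a time without loading the whole dataset into memory, since h5py only reads the portion of the dataset that is indexed. A minimal sketch, assuming the file.hdf5 generated above:

import h5py

with h5py.File('file.hdf5', 'r') as f:
    dset = f['data']        # handle to the on-disk dataset; no data is read yet
    a42 = dset[42, :, :]    # reads a single 2000x2000 float32 slice (about 16 MB)
    print(a42.shape, a42.dtype)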

Running the memory profiler for this example gives the plot shown below. The peak memory usage is about 100 MiB, which is drastically lower than in the examples demonstrated in the question.

[hdf5 plot: memory usage for the h5py example]
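A comment on the question asks about chunking and the chunk cache. I have not benchmarked these settings, but h5py's create_dataset accepts a chunks argument and h5py.File accepts a chunk-cache size via rdcc_nbytes; a sketch of a chunked variant of the dataset above, assuming one chunk per z-slice, might look like this:

import numpy as np
import h5py

z, x, y = 500, 2000, 2000
rng = np.random.default_rng()

# chunks=(1, x, y) stores each z-slice as its own chunk, matching the
# one-slice-at-a-time write pattern of the loop; rdcc_nbytes sets the
# per-dataset chunk cache to 32 MB (enough for two 16 MB slices).
with h5py.File('file_chunked.hdf5', 'w', rdcc_nbytes=32 * 1024**2) as f:
    dset = f.create_dataset('data', shape=(z, x, y), dtype=np.float32,
                            chunks=(1, x, y))
    for i in range(z):
        dset[i, :, :] = rng.standard_normal((x, y), dtype=np.float32)

Compression (for example compression='gzip' on create_dataset) can be layered on top of chunking, trading extra CPU time per write for a smaller file.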
