
I came across the following piece of code while studying Numpy:

import numpy as np
import time
import sys

S = range(1000)
print(sys.getsizeof(5) * len(S))   # size of one Python int object times the number of elements

D = np.arange(1000)
print(D.size * D.itemsize)         # number of array elements times bytes per element

The output of this is:

14000
4000

So NumPy saves memory. But I want to know: how does NumPy do it?

Source: https://www.edureka.co/blog/python-numpy-tutorial/

Edit: The linked question only answers half of my question; it doesn't mention anything about what the NumPy module does.

  • @FlyingTeller my question is regarding NumPy. The link you posted only covers half of my question. Commented Jul 9, 2018 at 8:56

2 Answers


In your example, D.size == len(S), so the difference is due to the difference between D.itemsize (8) and sys.getsizeof(5) (28).

D.dtype shows you that NumPy used int64 as the data type, which uses (unsurprisingly) 64 bits == 8 bytes per item. This is really only the raw numerical data, similar to a data type in C (under the hood it pretty much is exactly that).
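
You can check this directly; a minimal sketch (the default integer dtype is platform dependent, e.g. int32 on 64-bit Windows, which would also explain the 4000 in the question):

import numpy as np

D = np.arange(1000)
print(D.dtype)      # int64 on most 64-bit Linux/macOS builds (int32 on Windows)
print(D.itemsize)   # bytes of raw numerical data per element: 8 for int64, 4 for int32
print(D.nbytes)     # itemsize * size: the whole contiguous data buffer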

In contrast, Python stores each item as a full int object, which (as pointed out in the question linked to by FlyingTeller) holds more than just the raw numerical data.
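
A rough sketch of the Python side (numbers are for CPython 3 on a 64-bit build and vary by version and platform):

import sys

print(sys.getsizeof(5))        # 28 bytes: object header, reference count, type pointer,
                               # size field, and the digits of the value itself
print(sys.getsizeof(10**30))   # bigger values need more digit storage, so more bytes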




An ndarray stores its data in a contiguous data buffer.

For an example in my current ipython session:

In [63]: x.shape
Out[63]: (35, 7)
In [64]: x.dtype
Out[64]: dtype('int64')
In [65]: x.size
Out[65]: 245
In [66]: x.itemsize
Out[66]: 8
In [67]: x.nbytes
Out[67]: 1960

The array referenced by x has a small block of memory holding info like shape and strides, plus this data buffer, which takes up 1960 bytes.
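
The session doesn't show how x was built; one hypothetical way to get an array with the same shape and footprint (assuming a 64-bit build whose default integer dtype is int64):

import numpy as np

x = np.arange(245).reshape(35, 7)   # hypothetical stand-in for the array above
print(x.shape, x.dtype)             # (35, 7) int64
print(x.itemsize, x.nbytes)         # 8, 1960  (245 elements * 8 bytes each)
print(x.strides)                    # (56, 8): 56 bytes to the next row, 8 to the next element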

Identifying the memory use of a list, e.g. xl = x.tolist(), is trickier. len(xl) is 35, that is, its data buffer has 35 pointers. But each pointer references a different list of 7 elements, and each of those lists in turn holds pointers to number objects. In my example the numbers are all integers less than 255, so CPython's small-integer cache means repeated values point to the same object. For larger integers and for floats there is a separate Python object for each value. So the memory footprint of a list depends on the degree of nesting as well as the type of the individual elements.
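
A rough way to add that up is to walk the nested lists with sys.getsizeof, counting each distinct object once (a sketch only; deep_sizeof is a hypothetical helper, not a standard function, and real accounting is messier because cached small ints are shared):

import sys
import numpy as np

def deep_sizeof(obj, seen=None):
    # Recursive size of a nested list: container buffers plus each
    # distinct referenced object, counted once.
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, list):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size

x = np.arange(245).reshape(35, 7)   # hypothetical array matching the session above
xl = x.tolist()

print(x.nbytes)          # 1960: just the contiguous data buffer
print(deep_sizeof(xl))   # much larger: outer list + 35 inner lists + the int objects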

An ndarray can also have object dtype, in which case it too contains pointers to objects elsewhere in memory.
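
A quick sketch of that case (itemsize is then the size of a pointer, not of the objects it points to):

import numpy as np

obj_arr = np.array([1, "two", 3.0], dtype=object)
print(obj_arr.itemsize)   # 8 on a 64-bit build: each slot is just a pointer
print(obj_arr.nbytes)     # 24: only the buffer of pointers, not the referenced objects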

And another nuance: the primary pointer buffer of a list is slightly over-allocated, to make append faster.
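
You can see that over-allocation with sys.getsizeof (exact numbers vary by CPython version):

import sys

grown = []
for i in range(10):
    grown.append(i)            # append over-allocates to amortize future growth

literal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(sys.getsizeof(grown))    # typically larger than...
print(sys.getsizeof(literal))  # ...a list literal built with exactly the needed slots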

