Function in cython changes numpy array type

Question

I am working with Cython and numpy, and have a strange issue: a Cython function appears to change the dtype of the elements of a numpy array. Strangely, the dtype only changes when the type of the input array is actually specified.

I am using Cython==0.29.11, numpy==1.15.4, python 3.6, on Ubuntu 18.04.

# cyth.pyx
cimport numpy as np

def test(x):
    print(type(x[0]))

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    print(type(x[0]))

Now cythonising this file and using the functions:

>>> from cyth import test, test_np
>>> import numpy as np
>>> a = np.array([1, 2], dtype=np.uint32)
>>> test(a)
<class 'numpy.uint32'>
>>> test_np(a)
<class 'int'>

So test works as expected, printing the type of the first element in the input array - a uint32. But test_np, which actually ensures that the type of the incoming array is uint32, now shows a regular Python int as the type of the first element.

Even trying to force the element to be of the right type does not work, i.e. using:

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    cdef np.uint32_t el
    el = x[0]
    print(type(el))

still results in

>>> test_np(a)
<class 'int'>

Any help in understanding this discrepancy would be greatly appreciated.

ead · Accepted Answer · 2019-07-02 11:30:51Z

Cython doesn't change the type of the array, but returns an element of a slightly different type.

The data in a numpy array is stored as a contiguous block of 32-bit unsigned integers. Accessing x[0] means creating a Python object (because the Python interpreter cannot handle raw C ints) - numpy has a dedicated wrapper class for every numpy dtype and returns an np.uint32 object.
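As a minimal pure-Python sketch of this (no Cython involved, and assuming NumPy is installed), the array `a` below mirrors the one in the question. It shows that the buffer holds raw 4-byte values and that NumPy's own indexing wraps them in a NumPy scalar:

```python
import numpy as np

a = np.array([1, 2], dtype=np.uint32)

# The buffer holds raw 4-byte unsigned integers, not Python objects.
print(a.itemsize)        # bytes per element
print(len(a.tobytes()))  # total bytes in the buffer

# Indexing through NumPy's own __getitem__ wraps the raw value in a NumPy scalar.
elem = a[0]
print(type(elem))
```

The wrapper object is created at access time; the array itself never stores `np.uint32` objects.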

Cython, on the other hand, simply maps all C integer types (e.g. long, int and so on) onto the Python integer (which makes sense).
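The same kind of mapping can be observed without compiling anything. As a rough analogy only (this uses the standard library's `ctypes` and `array` modules, not Cython's own conversion code), C integers of different widths all surface in Python as a plain `int`:

```python
import array
import ctypes

# C integers of different widths and signedness, held in ctypes wrappers.
c_values = [ctypes.c_uint32(1), ctypes.c_int64(-2), ctypes.c_uint8(3)]

# Reading .value converts the C integer to a Python object: always a plain int.
py_types = [type(v.value) for v in c_values]
print(py_types)

# A typed array of C unsigned ints behaves the same way when indexed.
raw = array.array("I", [1, 2])
print(type(raw[0]))
```

Once the value has been converted, the original C width is no longer visible from the Python side.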

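Now, when numpy is cimported and the argument is typed as np.ndarray[np.uint32_t, ndim=1], x[0] no longer means calling __getitem__() of the numpy array (which would return an np.uint32 object) but reading a C integer (in this case an unsigned 4-byte one) directly from the buffer. That C integer is then converted to a Python integer, because in a def function anything handed to Python code, such as the argument of type() or a return value, must be a Python object.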

This doesn't mean that the array has a different type - the elements are just mapped differently when Cython converts them to Python objects.

If you want to access data as np.uint32-objects, you could call __getitem__ instead of [..] ([..] is translated by Cython as access to raw-C-data):

%%cython
cimport numpy as np

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    print(type(x[0]))                     # int
    print(type(x.__getitem__(0)))         # numpy.uint32

When you use typed memory views rather than ndarray, calling __getitem__ directly will still return a Python integer: __getitem__ of the memory view doesn't call __getitem__ of the underlying ndarray but accesses the data at the C level. To call __getitem__ of the underlying object for a memory view, go through its base attribute:

cimport numpy as np

def test_np(np.uint32_t[:] x):
    print(type(x[0]))                  # int
    print(type(x.base.__getitem__(0))) # numpy.uint32, instead of x.__getitem__(0)
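Python's built-in memoryview shows a similar split and can be tried without compiling anything. This is only an analogy, assuming NumPy is installed: the built-in type exposes the owning object as `.obj`, whereas Cython's typed memory view uses `.base` as in the snippet above.

```python
import numpy as np

a = np.array([1, 2], dtype=np.uint32)
view = memoryview(a)  # built-in memoryview, not a Cython typed memory view

# Indexing the view reads the C-level data and yields a plain Python int.
from_view = view[0]

# Going back to the owning ndarray uses NumPy's __getitem__ again.
from_owner = view.obj.__getitem__(0)

print(type(from_view), type(from_owner))
```

In both cases the view itself only knows about raw C data, so the NumPy scalar type is recovered only by indexing the owning array.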
