I am working with Cython and numpy, and have a strange issue to do with a cython function changing the dtype of the elements of a numpy array. Strangely, the dtype is only changed when the input type of the array is actually specified.

I am using Cython==0.29.11, numpy==1.15.4, python 3.6, on Ubuntu 18.04.

# cyth.pyx
cimport numpy as np

def test(x):
    print(type(x[0]))

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    print(type(x[0]))

Now cythonising this file and using the functions:

>>> from cyth import test, test_np
>>> import numpy as np
>>> a = np.array([1, 2], dtype=np.uint32)
>>> test(a)
<class 'numpy.uint32'>
>>> test_np(a)
<class 'int'>

So test works as expected, printing the type of the first element in the input array - a uint32. But test_np, which actually ensures that the type of the incoming array is uint32, now shows a regular Python int as the type of the first element.

Even trying to force the element to be of the right type does not work, i.e. using:

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    cdef np.uint32_t el
    el = x[0]
    print(type(el))

still results in

>>> test_np(a)
<class 'int'>

Any help in understanding this discrepancy would be greatly appreciated.

1 Answer

Cython doesn't change the dtype of the array; it returns the element as a slightly different Python type.

The data in a numpy array is stored as a contiguous block of 32-bit unsigned integers. Accessing x[0] from Python means creating a Python object (because the Python interpreter cannot handle raw C ints): numpy has a dedicated wrapper class for every numpy dtype and returns an np.uint32 object.
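As a rough illustration of the paragraph above, here is a small pure-Python sketch (no Cython involved) showing the raw storage and the wrapper object numpy builds on indexing. The variable names are only for illustration; `.item()` is numpy's own way of turning a scalar into a built-in Python value, which is close to what you end up with on the Cython side.

```python
import numpy as np

a = np.array([1, 2], dtype=np.uint32)

# The buffer holds raw 4-byte unsigned integers, not Python objects.
raw = a.tobytes()
itemsize = a.itemsize

# Indexing from Python wraps the C value in a numpy scalar object.
elem = a[0]
elem_type = type(elem)

# .item() converts that scalar to a built-in Python int.
as_python = elem.item()

print(itemsize, len(raw))          # 4 8
print(elem_type, type(as_python))  # <class 'numpy.uint32'> <class 'int'>
print(a.dtype)                     # uint32, the array itself is untouched
```

Both objects compare equal (`elem == as_python`); only the Python-level type of the element differs.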

Cython, on the other hand, simply maps all C integer types (e.g. long, int and so on) onto the Python integer, which makes sense.

Now, once the argument is declared as np.ndarray[np.uint32_t, ndim=1], x[0] no longer goes through __getitem__() of the numpy array (which would return an np.uint32 object). Cython instead reads the C integer (in this case an unsigned 4-byte value) directly from the buffer and converts it to a Python integer, because passing a value to a Python function such as type() or print(), or returning it from a def function, requires a Python object.

This doesn't mean that the array has a different type; the element types are just mapped differently when Cython converts them to Python objects.


If you want to access the data as np.uint32 objects, you can call __getitem__ explicitly instead of using [..] (Cython translates [..] on a typed array into direct access to the raw C data):

%%cython
cimport numpy as np

def test_np(np.ndarray[np.uint32_t, ndim=1] x):
    print(type(x[0]))                     # int
    print(type(x.__getitem__(0)))         # numpy.uint32

When you use typed memory views rather than ndarray, calling __getitem__ directly will still return a Python integer: __getitem__ of the memory view doesn't call __getitem__ of the underlying ndarray but accesses the data at the C level. To call __getitem__ of the underlying object for a memory view, go through its base attribute:

cimport numpy as np

def test_np(np.uint32_t[:] x):
    print(type(x[0]))                  # int
    print(type(x.base.__getitem__(0))) # numpy.uint32, instead of x.__getitem__(0)
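If you want to get a feel for this without compiling anything, Python's built-in memoryview behaves in a loosely similar way. Note this is only an analogy: the built-in memoryview is not a Cython typed memory view, and it exposes the owning object as .obj rather than .base. The variable names below are just for illustration.

```python
import numpy as np

a = np.array([1, 2], dtype=np.uint32)

# Built-in memoryview over the array's buffer (NOT a Cython typed memoryview).
view = memoryview(a)

# Reading through the buffer protocol yields a plain Python int.
from_view = view[0]

# Going back to the owning ndarray uses ndarray.__getitem__ again.
from_owner = view.obj[0]

print(type(from_view))   # <class 'int'>
print(type(from_owner))  # <class 'numpy.uint32'>
```

In both cases the value is read from the same memory; the difference is only in which layer builds the Python object.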