
I'm trying to eliminate the bottleneck in my application, which is an elementwise sum of two matrices.

I'm using NumPy and Cython. I have a cdef class with a matrix attribute. Since Cython still doesn't support typed buffer arrays as class attributes, I followed this suggestion and tried to use a pointer to the data attribute of the matrix instead. The thing is, I'm clearly doing something wrong, as the results indicate.

What I tried to do is more or less the following:

cimport numpy as np

ctypedef np.float64_t float_t   # assuming the matrix holds 64-bit floats

cdef class the_class:
    cdef np.ndarray the_matrix
    cdef float_t* the_matrix_p

    def __init__(self):
        self.the_matrix_p = <float_t*> self.the_matrix.data

    cpdef the_function(self):
        other_matrix = self.get_other_matrix()
        self.the_matrix_p += other_matrix.data
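
In plain NumPy terms, what the code above is meant to do is nothing more than an in-place elementwise add; here is a minimal sketch, with other_matrix standing in for whatever get_other_matrix() returns:

import numpy as np

# Sketch of the intended operation in pure NumPy: add other_matrix into
# the_matrix elementwise, in place (shapes and dtypes assumed to match).
the_matrix = np.zeros((1000, 1000))
other_matrix = np.random.rand(1000, 1000)

np.add(the_matrix, other_matrix, out=the_matrix)   # same as: the_matrix += other_matrix
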
  • So, what's the problem? What error are you getting? Commented Jan 25, 2013 at 12:54

2 Answers


I have serious doubts that adding two numpy arrays is a bottleneck you can solve by rewriting things in C. See the following code, which uses scipy.weave:

import numpy as np
from scipy.weave import inline

a = np.random.rand(10000000)
b = np.random.rand(10000000)
c = np.empty((10000000,))

def c_sum(a, b, c):
    length = a.shape[0]
    # Plain C loop computing the elementwise sum c = a + b
    code = '''
           for(int j = 0; j < length; j++)
           {
               c[j] = a[j] + b[j];
           }
           '''
    inline(code, ['a', 'b', 'c', 'length'])
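
A quick sanity check, assuming the arrays defined above (the very first call is also what triggers the C compilation):

c_sum(a, b, c)                  # first call compiles and caches the C code
assert np.allclose(c, a + b)    # the C loop matches the NumPy result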

After running c_sum(a, b, c) once to get the C code compiled, these are the timings I get:

In [12]: %timeit c_sum(a, b, c)
10 loops, best of 3: 33.5 ms per loop

In [16]: %timeit np.add(a, b, out=c)
10 loops, best of 3: 33.6 ms per loop

So, unless the timing difference is simply random noise, you are looking at roughly a 0.3% performance improvement on an operation that takes a few tens of milliseconds for arrays of ten million elements. If this really is your bottleneck, rewriting it in C is hardly going to solve it.


1 Comment

Yeah, I think you are right. After a couple of measurements, I came to the conclusion that my code is as fast as it can be in Python.

Try compiling ATLAS and recompiling numpy against it. This probably won't help with plain addition, but you can get a really nice performance boost with more complicated matrix operations (if you use any, of course).

Check out this simple benchmark. If your results fall too far from those given in the post, your numpy is probably not linked against an optimized BLAS implementation.
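
As a quick check (this is just numpy's standard introspection call, not something from the benchmark post), you can print the build configuration and see which BLAS/LAPACK libraries numpy was compiled against:

import numpy as np

# Lists the BLAS/LAPACK libraries numpy was built against; if no optimized
# implementation (ATLAS, OpenBLAS, MKL, ...) shows up here, numpy is falling
# back to its slower bundled routines.
np.show_config()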

