
Using pybind11, I am trying to pass a NumPy array to C++, copy it into a std::vector, multiply it by 2, and return this std::vector to Python as a NumPy array.

I have achieved the first step, but the third is doing some strange things. To pass the result back I used: py::array ret = py::cast(vect_arr); By strange things I mean that the array returned to Python has neither the correct dimensions nor the correct order.

As an example, I have this array:

[[ 0.78114362  0.06873818  1.00364053  0.93029671]
 [ 1.50885413  0.38219005  0.87508337  2.01322396]
 [ 2.19912915  2.47706644  1.16032292 -0.39204517]]

and the code returns:

array([[ 1.56228724e+000,  3.01770826e+000,  4.39825830e+000,
         5.37804299e+161],
       [ 1.86059342e+000,  4.02644793e+000, -7.84090347e-001,
         1.38298992e-309],
       [ 1.75016674e+000,  2.32064585e+000,  0.00000000e+000,
         1.01370255e-316]])

I have read the documentation but I am having trouble understanding most of it.

Here is an example to try:

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <pybind11/stl.h>
#include <Python.h>
namespace py = pybind11;
py::module nn = py::module::import("iteration");


py::array nump(py::array arr){

    auto arr_obj_prop = arr.request();
    //initialize values
    double *vals = (double*) arr_obj_prop.ptr;

    unsigned int shape_1 = arr_obj_prop.shape[0];
    unsigned int shape_2 = arr_obj_prop.shape[1];


    std::vector<std::vector <double>> vect_arr( shape_1, std::vector<double> (shape_2));

    for(unsigned int i = 0; i < shape_1; i++){
      for(unsigned int j = 0; j < shape_2; j++){
        vect_arr[i][j] = vals[i*shape_1 + j*shape_2] * 2;
      }
    }   

    py::array ret =  py::cast(vect_arr); //py::array(vect_arr.size(), vect_arr.data());
    return ret;

}

PYBIND11_MODULE(iteration_mod, m) {

    m.doc() = "pybind11 module for iterating over generations";

    m.def("nump", &nump,
      "the function which loops over a numpy array");
}

And the Python code:

import numpy as np
import iteration_mod as i_mod

class iteration(object):
    def __init__(self):
        self.iterator = np.random.normal(0,1,(3,4))

    def transform_to_dict(self):
        self.dict = {}
        for i in range(self.iterator.shape[0]):
            self.dict["key_number_{}".format(i)] = self.iterator[i,:]
        return self.dict

    def iterate_iterator(self):
        return i_mod.nump(self.iterator)

    def iterate_dict(self):
        return i_mod.dict(self)

a = iteration()
print(a.iterator)
print(a.iterate_iterator())

All of this is compiled with: c++ -O3 -Wall -fopenmp -shared -std=c++11 -fPIC `python3 -m pybind11 --includes` iteration_mod.cpp -o iteration_mod.so

1 Answer


std::vector<std::vector<double>> does not have the memory layout of a built-in 2D array: each inner vector owns its own separate allocation, so py::array(vect_arr.size(), vect_arr.data()); will not work.
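To make the layout point concrete, here is a minimal sketch (not part of the original code) that copies a nested vector into one contiguous row-major buffer, which is the layout a 2D NumPy array expects. The helper name flatten is hypothetical, and the sketch assumes every row has the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a nested vector into one contiguous
// row-major buffer. Assumes all rows have the length of the first row.
std::vector<double> flatten(const std::vector<std::vector<double>>& nested) {
    const std::size_t rows = nested.size();
    const std::size_t cols = rows ? nested[0].size() : 0;

    std::vector<double> flat;
    flat.reserve(rows * cols);
    for (const auto& row : nested) {
        assert(row.size() == cols);  // ragged input is not supported here
        // Each inner vector is contiguous on its own, so it can be
        // appended as one block.
        flat.insert(flat.end(), row.begin(), row.end());
    }
    return flat;
}
```

The resulting flat.data() points at rows * cols consecutive doubles, which vect_arr.data() does not: that pointer refers to an array of std::vector objects, not to the numbers themselves.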

It looks like py::cast does perform the proper copy conversion and propagates the values from the vector into a new NumPy array, but this line:

vect_arr[i][j] = vals[i*shape_1 + j*shape_2] * 2;

is not right. It should be:

vect_arr[i][j] = vals[i*shape_2 + j] * 2;
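The corrected index follows from row-major (C-contiguous) storage: element (i, j) of a rows x cols array sits at offset i*cols + j. A minimal sketch of the copy loop in plain C++, without pybind11, is below. The function name doubled_rows is hypothetical, and the sketch assumes the input buffer is C-contiguous. On the pybind11 side, one way to get that guarantee is to accept py::array_t<double, py::array::c_style | py::array::forcecast> instead of a bare py::array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: read a row-major buffer of rows*cols doubles
// and return a nested vector holding every value multiplied by 2.
std::vector<std::vector<double>> doubled_rows(const double* vals,
                                              std::size_t rows,
                                              std::size_t cols) {
    assert(vals != nullptr || rows * cols == 0);

    std::vector<std::vector<double>> out(rows, std::vector<double>(cols));
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Row i starts at offset i*cols; j walks along that row.
            out[i][j] = vals[i * cols + j] * 2;
        }
    }
    return out;
}
```

With the original index i*shape_1 + j*shape_2 and a 3 x 4 input, the largest offset read is 2*3 + 3*4 = 18, past the end of the 12-element buffer, which would explain the garbage values in the returned array.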

4 Comments

Thank god you are here! You have answered my question again. Thanks a lot.
Just perhaps one last question: do you know if there is a more concise and faster way to initialize a std::vector? Here I loop over all rows and columns, but I am wondering whether there is a py::array class method or something like that.
That code isn't slow: std::vector is one of the most optimized corners of any C++ compiler, and since the vectors do not leave the function, even changing their allocation is fair game here. On the other hand, there is real pain in the two copies: from the input NumPy array into the vector, and then in the py::cast from the vector into the newly created output array. If you want some speed-up, you should allocate a properly sized py::array for the output, get its pointer, and write into that memory directly. Even there, though, the call to allocate a new array is far more expensive than the loop and the writes.
I think the new version is also incorrect. It should be vect_arr[i][j] = vals[i*shape_2 + j] * 2; since every row has shape_2 columns.
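The suggestion in the comments to allocate the output array up front and write into its memory directly could look roughly like the sketch below. The names scale_buffer, nump_direct, dense_array, the module name iteration_direct, and the WITH_PYBIND11 macro are all hypothetical. The binding is wrapped in that macro only so the plain C++ kernel can be compiled and checked without the Python headers; build the extension with -DWITH_PYBIND11 added to the compile command.

```cpp
#include <cassert>
#include <cstddef>

// Plain C++ kernel: out[k] = in[k] * factor over a contiguous
// buffer of n doubles. No intermediate std::vector is involved.
void scale_buffer(const double* in, double* out, std::size_t n, double factor) {
    assert(n == 0 || (in != nullptr && out != nullptr));
    for (std::size_t k = 0; k < n; ++k) {
        out[k] = in[k] * factor;
    }
}

#ifdef WITH_PYBIND11  // hypothetical macro, see the note above
#include <pybind11/numpy.h>
#include <pybind11/pybind11.h>
namespace py = pybind11;

// c_style | forcecast asks pybind11 to hand over a C-contiguous
// array of doubles, converting the argument if necessary.
using dense_array =
    py::array_t<double, py::array::c_style | py::array::forcecast>;

py::array_t<double> nump_direct(dense_array arr) {
    py::buffer_info in = arr.request();

    // Allocate the output with the same shape as the input.
    py::array_t<double> result(in.shape);
    py::buffer_info out = result.request();

    // Write straight into the output array's memory.
    scale_buffer(static_cast<const double*>(in.ptr),
                 static_cast<double*>(out.ptr),
                 static_cast<std::size_t>(in.size), 2.0);
    return result;
}

PYBIND11_MODULE(iteration_direct, m) {
    m.def("nump_direct", &nump_direct,
          "double every element of a NumPy array without a std::vector copy");
}
#endif
```

Because both buffers are contiguous and have the same shape, the kernel can treat them as flat and needs no row or column arithmetic at all.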
