PyBind11 to Integrate Compiled C++ with Python

Setup from base OS (Ubuntu 19.04 Image)

Utilizing Linode as a VPS provider, I’ve chosen to spin up a new Nanode instance (as done in previous posts).

For this setup, I’ll skip Docker: I haven’t covered its setup yet, and it isn’t necessary for a simple demonstration of the process.


Installations

The Python Setup:

The base Ubuntu 19.04 image provided by Linode comes with Python 3.7.3 installed (as of 2019-08-11). It does not come with pip, and we shouldn’t install it system-wide, as that would pollute the system’s Python packages. Instead, every project should start with a virtual environment. So we ssh into our new image and install venv for Python 3:

:~$ sudo apt install python3-venv

We also need the Python development headers (to get Python.h, which is required when compiling C++ extension modules):

:~$ sudo apt install python3-dev

We then create our environment with this tool that we installed:

:~$ python3 -m venv cpp-test-env

A directory should be created with the new environment. We then need to activate the environment:

:~$ source cpp-test-env/bin/activate
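
To double-check that the activation worked, here is a minimal sketch that can be run with the environment's python. It assumes a venv-style environment (like the one created above), where sys.prefix points at the environment and sys.base_prefix at the system installation. The function name in_virtualenv is just an illustrative helper, not part of any library.

```python
import sys

def in_virtualenv():
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the system Python.
    return sys.prefix != sys.base_prefix

print("virtual environment active:", in_virtualenv())
print("prefix in use:", sys.prefix)
```

If this reports False, pip would be working against the system Python rather than cpp-test-env.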

Running pip freeze should confirm a near-empty environment:

:~$ pip freeze
pkg-resources==0.0.0

Now to acquire the pybind11 package for our contained environment:

:~$ pip install pybind11

The C++ Requirements

We also need a C++ compiler to build our C++-based modules. We’ll use g++ from the GNU Compiler Collection (GCC):

:~$ sudo apt install g++

Sample C++ code to be compiled

Demonstrating Porting a Class to Python:

filename: vector_list.cpp

//filename: vector_list.cpp
#include <vector>
#include <pybind11/pybind11.h> // always include this
#include <pybind11/stl.h>  // include this when using STL containers (like vector)

namespace py = pybind11;

class VList {
    public:
        std::vector<long long> list;
        unsigned int len;
        VList(int s) {
            len = s;
            for(int i=0; i<s; i++) list.push_back(0);
        }
        void append(long long val) {
            len++;
            list.push_back(val);
        }
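        void mod_list() {
            // iterate over the vector itself, so this stays correct even if
            // 'list' is reassigned from Python with a different length
            for(auto &v : list) v = (v*v)-3;
        }
        void mod_index(unsigned int ind, long long val) {
            // at() throws std::out_of_range on a bad index,
            // which pybind11 reports to Python as an IndexError
            list.at(ind) = val;
        }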
};

// below is the binding code! We specify the class and all its methods
// vector_list must match the base name of the compiled output file
// (the part before the extension suffix), as that is the name Python imports

PYBIND11_MODULE(vector_list, m) {  // module name: same as the output file's base name
    py::class_<VList>(m, "VList")
        .def(py::init<int>())
        .def("append", &VList::append)
        .def("mod_list", &VList::mod_list)
        .def("mod_index", &VList::mod_index)
        .def_readwrite("list", &VList::list);  // accessing our vector 'list'
}

We can compile the above file into a CPython extension module by running the following command:

:~$ g++ -O3 -Wall -shared -std=c++11 -fPIC `python3 -m pybind11 --includes` vector_list.cpp -o vector_list`python3-config --extension-suffix`

This outputs the file to the current directory: vector_list.cpython-37m-x86_64-linux-gnu.so
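
If you're wondering where that long file name comes from, the sketch below builds it from Python's own build configuration. It assumes that the EXT_SUFFIX variable exposed by the standard sysconfig module is the same value that python3-config --extension-suffix prints; extension_filename is a hypothetical helper name used only for this illustration.

```python
import sysconfig

def extension_filename(module_name):
    # EXT_SUFFIX is the platform-specific suffix the interpreter expects
    # for compiled extension modules.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return module_name + suffix

print(extension_filename("vector_list"))
```

On the setup described here, I'd expect this to print the same file name as above. The interpreter looks for a file with this suffix when it runs import vector_list, which is why the compile command appends it to the output name.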

We’ve just built a Python library in C++, using a C++ class!

To test our code, we can open the Python interpreter in the directory that contains the .so file:

(cpp-test-env) root@localhost:~# python
Python 3.7.3 (default, Apr  3 2019, 05:39:12)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from vector_list import VList

>>> dir(VList)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'list', 'mod_index', 'mod_list']
>>> x = VList(10)
>>> x
<vector_list.VList object at 0x7f574c3585e0>
>>> x.list
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

>>> x.mod_index(3, 8)
>>> x.list
[0, 0, 0, 8, 0, 0, 0, 0, 0, 0]

>>> x.list[3]
8
>>> x.list[3] = 4
>>> x.list
[0, 0, 0, 8, 0, 0, 0, 0, 0, 0]

AND WE’RE -

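Ok, so it would seem that last assignment didn’t stick. That’s because pybind11/stl.h converts the std::vector into a brand-new Python list every time we read x.list, so x.list[3] = 4 only changes a temporary copy and never touches the vector inside the C++ object. We can’t treat the member like a regular Python list (hence the need for ‘helper’ methods such as mod_index, which we created in our VList class).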
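
To make the copy behaviour in the session above easier to see, here is a pure-Python stand-in. It is only an analogy under my own assumptions: CopyingHolder is a hypothetical class, not anything pybind11 generates, and its property imitates a getter that hands back a fresh copy of the underlying data on every read, with a setter that replaces the data wholesale.

```python
class CopyingHolder:
    """Hypothetical stand-in for VList: reads of 'list' return a copy."""

    def __init__(self, size):
        self._data = [0] * size  # plays the role of the C++ std::vector

    @property
    def list(self):
        return list(self._data)  # a fresh copy on every access

    @list.setter
    def list(self, values):
        self._data = [int(v) for v in values]  # whole-list assignment

    def mod_index(self, ind, val):
        self._data[ind] = val  # helper that edits the real data in place


x = CopyingHolder(10)
x.mod_index(3, 8)
x.list[3] = 4        # edits a temporary copy, which is then discarded
print(x.list[3])     # still 8

tmp = x.list         # read-modify-write: take a copy...
tmp[3] = 4           # ...change it...
x.list = tmp         # ...and assign the whole list back
print(x.list[3])     # now 4
```

The read-modify-write pattern at the end should also work on the real VList, since the 'list' member is bound as read-write, but it copies the entire vector in both directions.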


We can overcome this with OPAQUE TYPES. Declaring a type opaque switches off pybind11’s automatic copy-conversion for it, and binding it explicitly (via pybind11/stl_bind.h) lets Python operate on the C++ STL container itself rather than on a copy. The result behaves much more like a typical Python data structure (for example, elements can be modified by index).

We can actually make most of vector_list.cpp redundant by just allowing Python direct access to C++’s vector class from the STL. Let’s make another .cpp file to achieve this.

We’ll call this file vector_opaque_list.cpp

//filename: vector_opaque_list.cpp
#include <vector>
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <pybind11/stl_bind.h>

namespace py = pybind11;

PYBIND11_MAKE_OPAQUE(std::vector<int>);

PYBIND11_MODULE(vector_opaque_list, m) {
    py::bind_vector< std::vector<int> >(m, "VectorInt");
}

Essentially, this code exposes std::vector<int> to Python as a class named VectorInt (the name we import in Python).

Time to compile:

:~$ g++ -O3 -Wall -shared -std=c++11 -fPIC `python3 -m pybind11 --includes` vector_opaque_list.cpp -o vector_opaque_list`python3-config --extension-suffix`

Output file to current directory: vector_opaque_list.cpython-37m-x86_64-linux-gnu.so

Opening the Python interpreter in the current directory again:

(cpp-test-env) root@localhost:~# python
Python 3.7.3 (default, Apr  3 2019, 05:39:12)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from vector_opaque_list import VectorInt

>>> dir(VectorInt)
['__bool__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__pybind11_module_local_v3__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'insert', 'pop', 'remove']

>>> VI = VectorInt()
>>> VI
VectorInt[]
>>> VI.append(5)
>>> VI.append(6)
>>> VI.append(7)
>>> VI
VectorInt[5, 6, 7]

>>> VI.pop()
7
>>> VI
VectorInt[5, 6]

>>> VI[1] = 4
>>> VI
VectorInt[5, 4]
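
The same steps can be wrapped into a small script. This is just a sketch: exercise is a hypothetical helper of mine, and it only uses operations seen in the session above (append, pop, item assignment) plus len() and iteration, which dir(VectorInt) lists as __len__ and __iter__. As an assumption for convenience, it falls back to the built-in list when the compiled module isn't importable, so the script still runs before anything is built.

```python
def exercise(make_sequence):
    # Build an empty sequence and repeat the steps from the session.
    seq = make_sequence()
    for value in (5, 6, 7):
        seq.append(value)
    last = seq.pop()   # removes and returns the final element
    seq[1] = 4         # in-place item assignment
    return last, len(seq), [v for v in seq]


try:
    # the module compiled above, if it is in the current directory
    from vector_opaque_list import VectorInt as make_sequence
except ImportError:
    make_sequence = list  # assumption: plain list as a stand-in

print(exercise(make_sequence))
```

Getting the same result from both VectorInt and list is a quick way to confirm that the bound vector follows the list-style protocol for these operations.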

We’re able to modify the object using typical Python syntax for modifying a mutable data structure!

We’ve managed to mix C++ code into our Python solutions as importable modules.

I guess the next step is to benchmark some calculations and see what performance benefits we can extract by doing this.