PyBind11 to Integrate Compiled C++ with Python
Setup from base OS (Ubuntu 19.04 Image)
Using Linode as the VPS provider, I’ve spun up a new Nanode instance (as in previous posts).
For this setup I’ll skip Docker: I haven’t covered its setup yet, and it isn’t necessary for simply demonstrating the process.
Installations
The Python Setup:
The base Ubuntu 19.04 image provided by Linode comes with Python 3.7.3 installed at the time of writing (2019-08-11). It does not come with pip, and we shouldn’t install it system-wide, since that risks polluting the system’s Python packages. Instead, every project should start in its own virtual environment. So we ssh into the new image and install venv for Python 3:
:~$ sudo apt install python3-venv
We also need the Python development headers (to acquire Python.h, which is used when compiling the C++ files):
:~$ sudo apt install python3-dev
We then create our environment with the venv module we just installed:
:~$ python3 -m venv cpp-test-env
A directory should be created with the new environment. We then need to activate the environment:
:~$ source cpp-test-env/bin/activate
Running pip freeze should confirm a near-empty environment:
:~$ pip freeze
pkg-resources==0.0.0
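As an extra sanity check before installing anything, a few lines of Python can confirm that the active interpreter really is the one inside the environment. This is a minimal sketch using only the standard library; the helper name in_virtualenv is hypothetical and not part of venv or pip.

```python
import sys

def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the system installation.
    return sys.prefix != sys.base_prefix

if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

Run with the environment activated, the second line should report True; run with the system python3, it should report False.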
Now we install the pybind11 package into our isolated environment:
:~$ pip install pybind11
The C++ Requirements
To compile our C++-based modules we need a C++ compiler; we’ll use g++, the C++ front end of GNU’s GCC:
:~$ sudo apt install g++
Sample C++ code to be compiled
Demonstrating Porting a Class to Python:
filename: vector_list.cpp
//filename: vector_list.cpp
#include <vector>
#include <pybind11/pybind11.h> // always include this
#include <pybind11/stl.h> // include this when using STL containers (like vector)
namespace py = pybind11;
class VList {
public:
    std::vector<long long> list;
    // the vector tracks its own size, so no separate length counter is needed
    VList(int s) {
        for (int i = 0; i < s; i++) list.push_back(0);
    }
    void append(long long val) {
        list.push_back(val);
    }
    void mod_list() {
        for (auto &v : list) v = (v * v) - 3;
    }
    void mod_index(unsigned int ind, long long val) {
        list.at(ind) = val; // at() bounds-checks; pybind11 raises IndexError when out of range
    }
};
// Below is the binding code! We specify the class and all its methods.
// The module name (vector_list) must match the name of the compiled .so file,
// which we'll name after the source file (vector_list.cpp).
PYBIND11_MODULE(vector_list, m) {
    py::class_<VList>(m, "VList")
        .def(py::init<int>())
        .def("append", &VList::append)
        .def("mod_list", &VList::mod_list)
        .def("mod_index", &VList::mod_index)
        .def_readwrite("list", &VList::list); // accessing our vector 'list'
}
We can compile the above file into a CPython extension module by running the following command:
:~$ g++ -O3 -Wall -shared -std=c++11 -fPIC `python3 -m pybind11 --includes` vector_list.cpp -o vector_list`python3-config --extension-suffix`
This outputs the file to the current directory: vector_list.cpython-37m-x86_64-linux-gnu.so
We’ve just built a Python library in C++, using a C++ class!
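The long suffix on vector_list.cpython-37m-x86_64-linux-gnu.so is not arbitrary: it comes from the interpreter, and the same value is available from Python through the standard sysconfig module. The sketch below shows how the expected filename could be derived; extension_filename is a hypothetical helper name, and the exact suffix will differ on other Python versions and platforms.

```python
import sysconfig

def extension_filename(module_name):
    """Build the filename the interpreter expects for a compiled extension module."""
    # EXT_SUFFIX is the platform- and version-specific suffix, for example
    # ".cpython-37m-x86_64-linux-gnu.so" on the setup described in this post.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return module_name + suffix

if __name__ == "__main__":
    print(extension_filename("vector_list"))
```

If the name passed to -o does not end in this suffix (or in plain .so), the import in the next step is likely to fail with ModuleNotFoundError.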
To test our code, we open the Python interpreter in the directory that contains the .so file:
(cpp-test-env) root@localhost:~# python
Python 3.7.3 (default, Apr 3 2019, 05:39:12)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from vector_list import VList
>>> dir(VList)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'list', 'mod_index', 'mod_list']
>>> x = VList(10)
>>> x
<vector_list.VList object at 0x7f574c3585e0>
>>> x.list
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> x.mod_index(3, 8)
>>> x.list
[0, 0, 0, 8, 0, 0, 0, 0, 0, 0]
>>> x.list[3]
8
>>> x.list[3] = 4
>>> x.list
[0, 0, 0, 8, 0, 0, 0, 0, 0, 0]
AND WE’RE -
OK, so that last assignment had no effect. With pybind11/stl.h, every access to x.list converts the C++ vector into a brand-new Python list, so x.list[3] = 4 modifies a temporary copy rather than the vector itself (hence the need for ‘helper’ methods such as mod_index, which we created in our VList class).
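The behaviour is easy to reproduce in pure Python with a property that returns a copy on every read, which is effectively what the stl.h conversion does. The class below is a hypothetical stand-in written for illustration, not pybind11 code:

```python
class CopyOnReadList:
    """Pure-Python stand-in (hypothetical) mimicking VList's 'list' attribute."""

    def __init__(self, size):
        # Plays the role of the C++ std::vector held by the object.
        self._data = [0] * size

    @property
    def list(self):
        # Every read hands back a fresh copy of the underlying data.
        return self._data.copy()

    def mod_index(self, index, value):
        # Helper method that writes to the real underlying data.
        self._data[index] = value

x = CopyOnReadList(10)
x.mod_index(3, 8)
x.list[3] = 4      # modifies a temporary copy only
print(x.list[3])   # prints 8
```

Only mod_index reaches the underlying data; the index assignment lands on a copy that is thrown away immediately.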
We can overcome this with OPAQUE TYPES. Declaring a type opaque switches off the copy-based conversion, and pybind11’s stl_bind.h then generates bindings behind the scenes that let Python interact with the C++ STL container directly, in a more intuitive way, like a typical Python data structure (being able to modify elements by index, for example).
We can actually make most of vector_list.cpp redundant by just allowing Python direct access to C++’s vector class from the STL. Let’s make another .cpp file to achieve this.
We’ll call this file vector_opaque_list.cpp
//filename: vector_opaque_list.cpp
#include <vector>
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <pybind11/stl_bind.h>
namespace py = pybind11;
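// Declare the type opaque (at the top level, before the bindings) so pybind11
// binds the vector itself rather than converting it to a copied Python list
PYBIND11_MAKE_OPAQUE(std::vector<int>);
PYBIND11_MODULE(vector_opaque_list, m) {
    py::bind_vector< std::vector<int> >(m, "VectorInt");
}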
Essentially this is just code that provides an interface to a std::vector<int> object and gives it the name VectorInt (the object name to be imported in Python).
Time to compile:
:~$ g++ -O3 -Wall -shared -std=c++11 -fPIC `python3 -m pybind11 --includes` vector_opaque_list.cpp -o vector_opaque_list`python3-config --extension-suffix`
This outputs the file to the current directory: vector_opaque_list.cpython-37m-x86_64-linux-gnu.so
Opening the Python interpreter in the current directory now:
(cpp-test-env) root@localhost:~# python
Python 3.7.3 (default, Apr 3 2019, 05:39:12)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from vector_opaque_list import VectorInt
>>> dir(VectorInt)
['__bool__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__pybind11_module_local_v3__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'insert', 'pop', 'remove']
>>> VI = VectorInt()
>>> VI
VectorInt[]
>>> VI.append(5)
>>> VI.append(6)
>>> VI.append(7)
>>> VI
VectorInt[5, 6, 7]
>>> VI.pop()
7
>>> VI
VectorInt[5, 6]
>>> VI[1] = 4
>>> VI
VectorInt[5, 4]
We’re able to modify the object using typical Python syntax for modifying a mutable data structure!
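Because VectorInt exposes __len__, __getitem__ and __setitem__ (all visible in the dir() output above), a function written against those methods should not need to care whether it receives a Python list or the bound vector. Here is a small sketch mirroring the earlier mod_list logic; square_minus_three is a name made up for this example, and it is exercised on a plain list so the snippet runs without the compiled module:

```python
def square_minus_three(seq):
    """Replace each element x of a mutable sequence with x*x - 3, in place."""
    for i in range(len(seq)):
        seq[i] = seq[i] * seq[i] - 3

values = [5, 4]
square_minus_three(values)
print(values)  # prints [22, 13]
```

Passing a VectorInt instead of a list is expected to behave the same way, though that is an assumption that has not been verified against this build.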
We’ve managed to mix C++ code into our Python solutions as importable modules.
I guess the next step is to benchmark some calculations and see what performance benefits we can extract by doing this.