Developing a Python module to make production happy

Hello! I represent the development team at the non-profit CyberDuckNinja. We create and support a whole family of products that make it easier to develop backend applications and machine learning services.



Today I would like to talk about integrating Python into C++.







It all started with a call from a friend at two in the morning, who complained: "We have production under load..." In the conversation, it turned out that the production code used ipyparallel (a Python package that enables parallel and distributed computation) to compute models and fetch results online. We decided to dig into the architecture of ipyparallel and profile it under load.



It immediately became clear that all the modules of this package were designed well, but most of the time was spent on networking, JSON parsing, and other intermediate work.

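A toy sketch of the kind of measurement that led us to this conclusion: profiling a handler whose useful computation is dwarfed by JSON parsing. The handler and payload below are made up for illustration, not taken from the ipyparallel codebase:

```python
import cProfile
import io
import json
import pstats

# Hypothetical message handler: parse a JSON payload, then do the real work.
payload = json.dumps({"values": list(range(1_000))})

def handle(message: str) -> int:
    data = json.loads(message)   # intermediate work: parsing
    return sum(data["values"])   # the actual computation

profiler = cProfile.Profile()
profiler.enable()
for _ in range(1_000):
    handle(payload)
profiler.disable()

# Print the five most expensive calls; JSON decoding dominates the profile.
stats_out = io.StringIO()
pstats.Stats(profiler, stream=stats_out).sort_stats("cumulative").print_stats(5)
print(stats_out.getvalue())
```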
Upon a detailed study of ipyparallel, it turned out that the entire library consists of two interacting modules:



  • ipcontroller, which is responsible for control and task scheduling,
  • engine, which executes the code.


A nice discovery was that these modules interact through pyzmq. Thanks to the engine's good architecture, we managed to replace its networking implementation with our own solution built on cppzmq. This replacement opens up endless room for development: the counterpart can be written in the C++ part of the application.



This theoretically made the engine pools even faster, but it still did not solve the problem of integrating our libraries into the Python code. If integrating your library takes too much effort, the solution will not be adopted and will stay on the shelf. The question remained: how do we natively integrate our work into the current engine codebase?



We needed reasonable criteria to choose an approach: ease of development, API declarations only on the C++ side, no additional wrappers on the Python side, and native use of the full power of the libraries. To avoid getting lost in the native (and not so native) ways of pulling C++ code into Python, we did a little research. At the beginning of 2019, four popular ways to extend Python could be found on the Internet:



  1. Ctypes
  2. CFFI
  3. Cython
  4. CPython API


We have considered all the integration options.



1. Ctypes



Ctypes is a foreign function interface that allows you to load dynamic libraries exporting a C interface. With it you can call C libraries from Python, for example libev or libpq.



For example, suppose there is a library written in C++ with the following interface:



class Foo;  // the C++ class hidden behind the C interface

extern "C"
{
    Foo* Foo_new();
    void Foo_bar(Foo* foo);
}


We write a wrapper for it:



import ctypes

lib = ctypes.cdll.LoadLibrary('./libfoo.so')

class Foo:
    def __init__(self) -> None:
        super().__init__()

        lib.Foo_new.argtypes = []
        lib.Foo_new.restype = ctypes.c_void_p
        lib.Foo_bar.argtypes = [ctypes.c_void_p]
        lib.Foo_bar.restype = None

        self.obj = lib.Foo_new()

    def bar(self) -> None:
        lib.Foo_bar(self.obj)

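The same declaration pattern works against any C library already present on the system. A minimal, self-contained sketch using libc's abs, so it runs without building libfoo.so:

```python
import ctypes
import ctypes.util

# Locate and load the C runtime. On Linux, ctypes.CDLL(None) would also
# expose libc symbols through the main program.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the signature, exactly as we did for Foo_new/Foo_bar above.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

print(libc.abs(-42))  # -> 42
```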

We draw conclusions:



  1. No way to interact with the interpreter API. Ctypes is a way of calling C libraries from the Python side, but it does not give C/C++ code a way to call into Python.
  2. C-style interface exports. Ctypes can interact with libraries exposing this kind of ABI, but any other language has to export its variables, functions, and methods through a C wrapper.
  3. The need to write wrappers. They have to be written both on the C++ side, for ABI compatibility with C, and on the Python side, to reduce boilerplate.


Ctypes doesn't suit us, so we try the next method: CFFI.



2. CFFI



CFFI is similar to Ctypes but has some additional features. Let's demonstrate with the same library:



import cffi

ffi = cffi.FFI()

ffi.cdef("""
    typedef struct Foo Foo;

    Foo* Foo_new();
    void Foo_bar(Foo* foo);
""")

lib = ffi.dlopen("./libfoo.so")

class Foo:
    def __init__(self) -> None:
        super().__init__()

        self.obj = lib.Foo_new()

    def bar(self) -> None:
        lib.Foo_bar(self.obj)

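As with Ctypes above, the same ABI-level approach can be tried against an ordinary system library. A minimal sketch using libc's abs, assuming the cffi package is installed:

```python
import ctypes.util

import cffi

ffi = cffi.FFI()

# Declare the C signature, just like the Foo interface above.
ffi.cdef("int abs(int x);")

# Load the C runtime; on POSIX systems ffi.dlopen(None) would also work.
libc = ffi.dlopen(ctypes.util.find_library("c"))

print(libc.abs(-42))  # -> 42
```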

We draw conclusions:



CFFI has much the same disadvantages, except that the wrappers become a little fatter, since you need to hand the library a definition of its interface. CFFI is also not suitable, so let's move on to the next method: Cython.



3. Cython



Cython is a superset of Python that allows you to write extensions in a mixture of C/C++ and Python and load the result as a dynamic library. This time we have a library written in C++ with the following interface:



#ifndef RECTANGLE_H
#define RECTANGLE_H

namespace shapes {
    class Rectangle {
        public:
            int x0, y0, x1, y1;
            Rectangle();
            Rectangle(int x0, int y0, int x1, int y1);
            ~Rectangle();
            int getArea();
            void getSize(int* width, int* height);
            void move(int dx, int dy);
    };
}

#endif


Then we declare this interface in the Cython language:



cdef extern from "Rectangle.cpp":
    pass

# Declare the class with cdef
cdef extern from "Rectangle.h" namespace "shapes":
    cdef cppclass Rectangle:
        Rectangle() except +
        Rectangle(int, int, int, int) except +
        int x0, y0, x1, y1
        int getArea()
        void getSize(int* width, int* height)
        void move(int, int)


And we write a wrapper for it:



# distutils: language = c++

from Rectangle cimport Rectangle

cdef class PyRectangle:
    cdef Rectangle c_rect

    def __cinit__(self, int x0, int y0, int x1, int y1):
        self.c_rect = Rectangle(x0, y0, x1, y1)

    def get_area(self):
        return self.c_rect.getArea()

    def get_size(self):
        cdef int width, height
        self.c_rect.getSize(&width, &height)
        return width, height

    def move(self, dx, dy):
        self.c_rect.move(dx, dy)

    # Attribute access
    @property
    def x0(self):
        return self.c_rect.x0

    @x0.setter
    def x0(self, x0):
        self.c_rect.x0 = x0

    # Attribute access
    @property
    def x1(self):
        return self.c_rect.x1

    @x1.setter
    def x1(self, x1):
        self.c_rect.x1 = x1

    # Attribute access
    @property
    def y0(self):
        return self.c_rect.y0

    @y0.setter
    def y0(self, y0):
        self.c_rect.y0 = y0

    # Attribute access
    @property
    def y1(self):
        return self.c_rect.y1

    @y1.setter
    def y1(self, y1):
        self.c_rect.y1 = y1


Now we can use this class from regular Python code:



import rect
x0, y0, x1, y1 = 1, 2, 3, 4
rect_obj = rect.PyRectangle(x0, y0, x1, y1)
print(dir(rect_obj))


We draw conclusions:



  1. When using Cython, you still have to write wrapper code for the C++ classes, but you no longer need to export a C-style interface.
  2. You still cannot interact with the interpreter.


The last method remains: the CPython API. Let's try it.



4. CPython API



The CPython API is an API that lets you develop modules for the Python interpreter in C and C++. Your best bet is pybind11, a high-level C++ library that makes working with the CPython API convenient. With it you can easily export functions and classes and convert data between Python-managed memory and native C++ memory.



So, let's take the code from the previous example and write a wrapper for it:



#include <pybind11/pybind11.h>

namespace py = pybind11;

PYBIND11_MODULE(rect, m) {
    py::class_<Rectangle>(m, "PyRectangle")
        .def(py::init<>())
        .def(py::init<int, int, int, int>())
        .def("getArea", &Rectangle::getArea)
        .def("getSize", [](Rectangle &rect) -> std::tuple<int, int> {
            int width, height;

            rect.getSize(&width, &height);

            return std::make_tuple(width, height);
        })
        .def("move", &Rectangle::move)
        .def_readwrite("x0", &Rectangle::x0)
        .def_readwrite("x1", &Rectangle::x1)
        .def_readwrite("y0", &Rectangle::y0)
        .def_readwrite("y1", &Rectangle::y1);
}


We wrote the wrapper; now it needs to be compiled into a binary library. We need two things: a build system and a package manager. For these purposes let's take CMake and Conan, respectively.



To make the Conan-based build work, you need to install Conan itself in any suitable way:



pip3 install conan cmake


and register additional repositories:



conan remote add bincrafters https://api.bintray.com/conan/bincrafters/public-conan
conan remote add cyberduckninja https://api.bintray.com/conan/cyberduckninja/conan


Let's describe the project's dependency on the pybind11 library in the conanfile.txt file:



[requires]
pybind11/2.3.0@conan/stable

[generators]
cmake


Let's add the CMake file. Note the included integration with Conan: when CMake runs, the conan install command is executed, which installs the dependencies and generates CMake variables with information about them:



cmake_minimum_required(VERSION 3.17)

project(rectangle)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED YES)
set(CMAKE_CXX_EXTENSIONS OFF)

if (NOT EXISTS "${CMAKE_BINARY_DIR}/conan.cmake")
    message(STATUS "Downloading conan.cmake from https://github.com/conan-io/cmake-conan")
    file(DOWNLOAD "https://raw.githubusercontent.com/conan-io/cmake-conan/v0.15/conan.cmake" "${CMAKE_BINARY_DIR}/conan.cmake")
endif ()

set(CONAN_SYSTEM_INCLUDES "On")

include(${CMAKE_BINARY_DIR}/conan.cmake)

conan_cmake_run(
    CONANFILE conanfile.txt
    BASIC_SETUP
    BUILD missing
    NO_OUTPUT_DIRS
)

find_package(Python3 COMPONENTS Interpreter Development)
include_directories(${Python3_INCLUDE_DIRS})
find_package(pybind11 REQUIRED)

pybind11_add_module(${PROJECT_NAME} main.cpp)

target_link_libraries(
    ${PROJECT_NAME}
    PRIVATE
    ${CONAN_LIBS}
)


All preparations are complete; let's build:



cmake . -DCMAKE_BUILD_TYPE=Release 
cmake --build . --parallel 2


We draw conclusions:



  1. We got a compiled binary library that the Python interpreter can load using its standard mechanisms.
  2. Exporting code to Python has become much simpler compared to the methods above, and the wrapper code has become more compact and is written in the same language as the library.


One of the CPython/pybind11 features is the ability to load, fetch, or execute a Python function from the C++ runtime, and vice versa.



Let's take a look at a simple example:



#include <pybind11/embed.h> // everything needed to embed the interpreter

namespace py = pybind11;

int main() {
    py::scoped_interpreter guard{}; // start the Python VM
    py::print("Hello, World!"); // print Hello, World! via the interpreter
}


By combining the ability to embed a Python interpreter in a C++ application with the Python module mechanism of the engine, we came up with an interesting approach in which the ipyparallel engine code does not notice the substitution of components. For our applications we chose an architecture in which the life and event cycles start in C++ code, and only then is the Python interpreter started within the same process.



To see how this works, let's look at our approach in action:



#include <pybind11/embed.h>

#include "pyrectangle.hpp" // the C++ rectangle module

namespace py = pybind11;
using namespace py::literals;

// Script that registers the rectangle module so it can be imported
constexpr static char init_script[] = R"__(
    import sys

    sys.modules['rect'] = rect
)__";

// Script that loads a user script, which in turn can import rectangle
constexpr static char load_script[] = R"__(
    import sys, os
    from importlib import import_module

    sys.path.insert(0, os.path.dirname(path))
    module_name, _ = os.path.splitext(path)
    import_module(os.path.basename(module_name))
)__";

int main() {
    py::scoped_interpreter guard; // start the interpreter

    py::module pyrectangle("rect");
    add_pyrectangle(pyrectangle); // fill the module with bindings

    py::exec(init_script, py::globals(), py::dict("rect"_a = pyrectangle)); // make the module importable from Python
    py::exec(load_script, py::globals(), py::dict("path"_a = "main.py")); // run main.py

    return 0;
}

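The load_script logic is plain Python and can be tried on its own. A small sketch (the helper name load_by_path is ours) that writes a throwaway main.py and imports it by path:

```python
import os
import sys
import tempfile
from importlib import import_module

# The same steps load_script performs: put the file's directory on
# sys.path, derive the module name, and import it.
def load_by_path(path):
    sys.path.insert(0, os.path.dirname(path))
    module_name, _ = os.path.splitext(path)
    return import_module(os.path.basename(module_name))

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "main.py")
    with open(script, "w") as f:
        f.write("ANSWER = 42\n")

    module = load_by_path(script)
    print(module.ANSWER)  # -> 42
```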

In the example above, the pyrectangle module is forwarded into the Python interpreter and made available for import under the name rect. Let's demonstrate that nothing has changed for the user code:



from pprint import pprint

from rect import PyRectangle

r = PyRectangle(0, 3, 5, 8)

pprint(r)

assert r.getArea() == 25

width, height = r.getSize()

assert width == 5 and height == 5

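What init_script does can also be shown in pure Python: any pre-built module object registered in sys.modules becomes importable under that name. The rect stand-in below is fabricated for the demo, not the real binding:

```python
import sys
import types

# Build a module object by hand, the way the C++ side builds pyrectangle.
rect = types.ModuleType("rect")
rect.area = lambda width, height: width * height

# Registering it in sys.modules is all that `import rect` needs.
sys.modules["rect"] = rect

import rect as imported_rect
print(imported_rect.area(5, 5))  # -> 25
```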

This approach offers high flexibility and many points of customization, as well as the ability to manage Python memory legitimately. But there are downsides: the cost of a mistake is much higher than with the other options, and you need to be aware of this risk.



Thus, ctypes and CFFI do not suit us because of the need to export C-style library interfaces, the need to write wrappers on the Python side, and, ultimately, the need to fall back to the CPython API if embedding is required. Cython is free of the export flaw but retains all the others. Only pybind11 supports embedding while requiring wrappers on the C++ side alone, and it has extensive capabilities for manipulating data structures and calling Python functions and methods. As a result, we settled on pybind11 as a high-level C++ wrapper over the CPython API.



By combining an embedded Python interpreter inside a C++ application with the module mechanism for fast data forwarding, and reusing the ipyparallel engine codebase, we got rocketjoe_engine. It is identical in mechanics to the original and works faster by cutting the costs of network interaction, JSON processing, and other intermediate work. This now lets my friend handle the load in production, for which I received the first star on the GitHub project.




