OpenStack Neutron PTG Review June 2020

PTG (Project Team Gathering) is an event where development teams meet to discuss current tasks, statuses and plans. PTG spun off from the mainstream OpenStack summit a few years ago.

For the first time, the PTG was held online, through Zoom and Jitsi Meet. However, with both video and audio available in the meetings, the change was barely noticeable, especially compared with the now familiar IRC team meetings.

The three-hour Neutron sessions ran from Tuesday to Friday. The main meeting minutes are published on OpenStack Etherpad and on the OpenStack mailing list. The agenda was built from proposals by the Neutron developers, and the meeting schedule was prepared by the chair, Neutron PTL (Project Team Lead) Slawek Kaplonski.

In this article I will talk about three topics that I think deserve attention and require a little explanation.

OVN

There was a lot of talk about OVN at this PTG, which is not surprising since most of the core team members represent Red Hat, the main contributor to OVN.

What is OVN?

Open source L2 / L3 network virtualization for Open vSwitch (OVS):
- Logical switches
- Logical IPv4 and IPv6 routers
- L2 / L3 / L4 ACLs (Security Groups)
- Multiple tunnel overlays (Geneve, STT, and VXLAN)
- Logical load balancers
- TOR-based logical-physical L2 gateways
- Software-based logical-physical L2 / L3 gateways
Works on the same platforms as OVS:
- Linux
- Containers
- DPDK
Integration with:
- OpenStack Neutron
- Docker swarm
- Kubernetes

OVN architecture

“OVN in 75 words.

The Open Virtual Network is maintained under the OVS project and developed by the original OVS team. It is an attempt to redesign the ML2 / OVS control plane based on years of experience. It is intended for use with OpenStack and Kubernetes. OVN is built on a new architecture that has abandoned the concept of Python agents interacting with the Neutron API service via RabbitMQ in favor of C daemons communicating via OpenFlow and OVSDB.” - Slawek Kaplonski, Neutron PTL.

Initially, the Neutron OVN driver was developed as a separate project in the Neutron Stadium, networking-ovn, and in the Ussuri release it was merged into the main Neutron repository.

Thus, this solution eliminates the main pain point of ML2 / OVS, RabbitMQ, which is an undoubted plus, and in general “OVN's design goal is to have a production-quality implementation that can operate at a significant scale”. However, does OVN support all the functionality available with ML2 / OVS? Not entirely, and this became one of the topics of discussion at the PTG. As a result, several gaps were highlighted (a complete list is available on the project page). First of all, the developers noted missing or incomplete support for routed networks, some QoS features, BGP and Availability Zones. Although the OVN team is ready to tackle all of the above, during the meeting they admitted that this had not previously been a priority for them, since internal interests were more important. In addition, the development of ML2 / OVS does not pause, of course, which means new gaps may appear.

However, in my opinion, the main problem with OVN is that it is not yet widely used and has not been tested on large installations. In addition, there are some questions about High Availability:

- One of the main components, ovn-northd, currently only supports active / passive HA mode; active / active is only planned for now
- Another central component, ovsdb-server, also only supports active / passive mode

It is possible that the last point is actually outdated, since OVSDB clustering (based on the Raft algorithm) has been supported since OVS 2.9, but it is not clear whether this has been tested in combination with OVN and OpenStack. For example, the associated ticket in openstack-ansible has not yet been closed.

Also of concern is that OVN uses Geneve tunnels instead of VXLAN, which affects MTU settings (Geneve headers are larger than VXLAN headers) and support for hardware-accelerated tunnel processing.
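To make the MTU point concrete, here is a small arithmetic sketch. It assumes an IPv4 underlay with a 1500-byte physical MTU and assumes that OVN adds 8 bytes of Geneve options on top of the 8-byte base header; the constant and function names are illustrative and are not taken from Neutron.

```python
# Per-packet encapsulation overhead in bytes, assuming an IPv4 underlay.
OUTER_IPV4 = 20
OUTER_UDP = 8
INNER_ETHERNET = 14
VXLAN_HEADER = 8
GENEVE_BASE_HEADER = 8
GENEVE_OPTIONS_OVN = 8  # assumption: one option TLV carrying OVN metadata


def tenant_mtu(physical_mtu: int, tunnel_header: int) -> int:
    """Largest inner IP packet that still fits into one underlay packet."""
    overhead = OUTER_IPV4 + OUTER_UDP + tunnel_header + INNER_ETHERNET
    return physical_mtu - overhead


vxlan_mtu = tenant_mtu(1500, VXLAN_HEADER)
geneve_mtu = tenant_mtu(1500, GENEVE_BASE_HEADER + GENEVE_OPTIONS_OVN)
print(vxlan_mtu, geneve_mtu)  # 1450 1442
```

Under these assumptions, a network migrated from VXLAN to Geneve loses 8 bytes of usable MTU, so instances that keep the old MTU may start fragmenting or dropping full-size packets.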

Be that as it may, the project is rapidly gaining momentum, and it seems that in a couple of releases OVN should become the main Neutron backend. Moreover, during the PTG, the core team developers agreed to make OVN the default backend for DevStack.

Where these changes will lead:

- […] OpenStack Neutron CI […]
- […] ML2/OVS […]
- […] Neutron CI […] ML2/Linuxbridge […] ML2/OVS […]
- […] core […] OVN […]

Regarding the last point, Neutron PTL posted the following message: “The Neutron team believes that OVN and the Neutron OVN driver are built on a modern architecture that provides a better foundation for a simpler, more efficient solution. We are seeing increased engagement in kubernetes-ovn, leading to an expansion of the core OVN community, and we would like OpenStack to take advantage of this investment in OVN from Kubernetes as well.

At the moment, the Neutron OVN driver has gaps in the supported functionality compared to ML2 / OVS. However, our team is trying to close these gaps, and we believe that this driver will be the future for Neutron, and therefore we want to make it the default Neutron ML2 backend for DevStack.”

So far, the reaction to this news is rather positive, although there are still doubts about the transition from VXLAN to Geneve tunnels, how to migrate from ML2 / OVS to ML2 / OVN, as well as performance and supported functionality.

Application of the new EngineFacade

EngineFacade is a framework on top of SQLAlchemy that unifies the database access logic used across OpenStack projects. Several releases ago it was refactored, which produced the so-called “new EngineFacade”. The next step was to adopt this framework in the OpenStack projects themselves.

In my opinion, this topic was included in the PTG agenda because work on it has been dragging on for several releases and has not yet been completed. The reasons are the large number of changes required, some non-trivial problems in the adoption process and, as it seems to me, a lack of motivation and therefore of people. Indeed, why change something that already works and doesn't even produce a bunch of bugs? A fairly detailed answer to this question is given in Mike Bayer's specification. Here I will try to give a brief summary of the arguments for the new EngineFacade so that you do not have to read that long text:

The old EngineFacade provides low-level APIs instead of high-level APIs tailored to a specific use case, so this is essentially a factory, not a facade. As a result:
- […] EngineFacade […] OpenStack […]
- […]
EngineFacade […] / […] / […]: reader, writer, […].
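The reader / writer idea mentioned above can be illustrated with a self-contained sketch. This is not the oslo.db API: the `TransactionContext` class and its methods are hypothetical, and the example uses the standard-library `sqlite3` module instead of SQLAlchemy. It only shows the general pattern in which calling code declares whether it reads or writes, and the facade decides when to commit or roll back.

```python
import contextlib
import sqlite3


class TransactionContext:
    """Hypothetical facade: callers declare intent, the facade owns the transaction."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._depth = 0  # nesting level of currently open scopes

    @contextlib.contextmanager
    def _scope(self, commit):
        self._depth += 1
        try:
            yield self._conn
        except Exception:
            # Only the outermost scope rolls back the whole transaction.
            if self._depth == 1:
                self._conn.rollback()
            raise
        else:
            # Only the outermost writer scope commits.
            if self._depth == 1 and commit:
                self._conn.commit()
        finally:
            self._depth -= 1

    def writer(self):
        return self._scope(commit=True)

    def reader(self):
        return self._scope(commit=False)


db = TransactionContext()

with db.writer() as conn:
    conn.execute("CREATE TABLE ports (name TEXT)")
    conn.execute("INSERT INTO ports VALUES ('port-1')")

try:
    with db.writer() as conn:
        conn.execute("INSERT INTO ports VALUES ('port-2')")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass  # the facade has already rolled back the second insert

with db.reader() as conn:
    names = [row[0] for row in conn.execute("SELECT name FROM ports")]
print(names)  # ['port-1']
```

The point of the pattern is that functions no longer create sessions or call commit themselves, so nested calls join the enclosing transaction instead of accidentally opening their own.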

Sounds simple and logical, so what is the problem with adopting EngineFacade then? To be honest, I didn't go into the details very much, but it seems that the main cause is that in some complex scenarios the old EngineFacade was misused in Neutron and it still worked (!), while the new EngineFacade tries to do everything right and, as a result, breaks these working scenarios (in my opinion, a fairly typical problem when working with legacy code). Obviously, in this case the logic of these scenarios must be corrected first.

In fact, there is not so much left to edit - just one patch, and the core team agreed to jointly solve this problem. Of course, anyone interested can help with the analysis and review!

Neutron-lib

Several topics were devoted to neutron-lib. To begin with, let me explain what it is for those who are not closely involved in Neutron development. First, Neutron is not a single project: it consists of several repositories covering different areas of OpenStack networking under the general name Neutron Stadium, and “neutron” is just one of them, albeit the major one. The rest are so-called advanced services (for example, neutron-lbaas, -fwaas, -vpnaas, -dynamic-routing, etc.) and third-party / vendor plugins (for example networking-midonet, -odl, -ovn). The Stadium is meant to include projects that the Neutron PTL and the core team help develop and are directly involved in on a daily basis. To make this possible, they ensure that common principles and rules are followed throughout the Stadium in all aspects of development: structure, code style, testing, documentation, etc. To be honest, today this is not entirely true, and the main burden still falls on the shoulders of the individual project maintainers.

Before neutron-lib was created, all networking- projects imported all common code - constants, interfaces (abstract base classes), helper functions, and more - from the neutron main repository. Any changes to such code in neutron could break dependent projects. Then, in the Ocata release, the neutron-lib initiative was launched to solve this problem: all common code should now be stored in a separate repository and should be versioned. More specifically, the goals were formulated as follows:

- Remove the dependency of subprojects on Neutron (i.e. remove direct imports from neutron in subprojects)
- Pay down technical debt in Neutron by refactoring the code or redesigning suboptimal patterns as they are moved into the corresponding parts of neutron-lib

In fact, neutron-lib looks like a win-win option: both the main Neutron repository and the third-party projects should benefit as a result. What went wrong?

Lack of support

No open-source project can exist without the support of contributors and maintainers - people who are ready to invest their time in working on a project. neutron-lib lacked such volunteers, and as a result the original idea stopped working, namely that all common code lives there and can be imported from there instead of from neutron. The main neutron-lib maintainer (boden) left the project some time ago. During the PTG, a proposal was made to abandon the idea of moving all common code to neutron-lib, or even to move the neutron-lib code back into neutron. This proposal was rejected for two reasons:

- neutron-lib is still widely used
- neutron-lib carries some value in that it highlights standard interfaces that cannot be changed so as not to break multiple projects at once

Following the discussion, neutron-lib remains unchanged, but the neutron code relocation and deprecation policy needs to be updated.

Of course, new shared code should still go into neutron-lib where possible. And that brings us to the second problem.

Testing problem

Another issue relates to testing during development. If a neutron patch introduces new shared code or changes existing shared code, that part should, according to the rules, be submitted to neutron-lib. This makes the neutron part of the patch dependent on the neutron-lib change. However, neutron patches are currently tested against the released version of neutron-lib, to verify that they work with the latest release. As a result, such patches cannot pass CI until a neutron-lib release containing the change is published.

Switching to testing all neutron patches against neutron-lib from the master branch also has some disadvantages. For example, there is no guarantee that neutron master will work with the latest neutron-lib release, which is what end users are using.

Here are the ways to address this issue (thanks to Bence Romsics for the excellent summary):

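[…] neutron-lib […] neutron […].
[…]:
- If a new function “foo” is needed, add it to neutron-lib. In neutron, add a temporary copy “_foo” with a TODO note to remove it once the neutron-lib release is out.
- After the neutron-lib release, remove _foo in neutron and replace “import _foo” with “from neutron_lib import foo”.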
In addition, you can run separate CI checks with both neutron-lib master and the latest neutron-lib release. But only one of them can be voting. Simply doubling the number of jobs would put a huge additional load on the OpenStack CI infrastructure.
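The temporary “_foo” workaround can be sketched as follows. This is a fallback-import variant of the idea rather than the exact procedure discussed at the PTG, and every name in it is hypothetical: `neutron_lib_sketch` stands in for a library release that does not yet ship `foo`, and the body of `_foo` is a placeholder.

```python
try:
    # Hypothetical module: the import succeeds only once a library
    # release that actually ships foo() is installed.
    from neutron_lib_sketch import foo
except ImportError:
    # TODO: drop this local copy once the library release with foo() is out.
    def _foo(value):
        """Temporary local copy of the shared helper (placeholder logic)."""
        return value.strip().lower()

    foo = _foo

print(foo("  ACTIVE "))
```

With this variant, callers use `foo` from day one, and the later cleanup only deletes the `except` branch instead of rewriting every import.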

During the discussion at PTG, three proposals were made:

- Use neutron-lib master for “Check CI” and the neutron-lib release version for “Gate CI”. However, it will look strange if a neutron patch passes “Check CI” and then fails in “Gate CI”
- Don't change anything: it's best to run tests against the neutron-lib release version. For example, this is how it is now done for OSC (OpenStackClient)
- Run tests with neutron-lib master and add a periodic job for tests with the neutron-lib release

Final decision: create a new non-voting job in “Check CI” that uses neutron-lib from the master branch. Basically, everything remains as it is, but it will be possible to check that a feature that includes changes in both neutron and neutron-lib passes CI before it is merged into the master branch.
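The difference between voting and non-voting jobs can be shown with a tiny model. This is only an illustration of the rule, not how the OpenStack CI is implemented, and the job names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobResult:
    name: str
    passed: bool
    voting: bool


def verdict(results):
    """A change is accepted when every voting job has passed."""
    return all(r.passed for r in results if r.voting)


results = [
    # Hypothetical job names.
    JobResult("tests-with-released-neutron-lib", passed=True, voting=True),
    JobResult("tests-with-neutron-lib-master", passed=False, voting=False),
]
print(verdict(results))  # True
```

The non-voting job still reports its result on the review, so reviewers can see whether a change works against neutron-lib master, but a failure there does not block the patch.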

I hope this article was useful and helped you better understand where Neutron is heading and why.

Thank you for your attention!
