Huawei ADN: The Industry's First Level 3 Self-Managed Network

What is an autonomously managed network, and how does it differ from SDN? Huawei worked with the consulting firm IDC to define criteria for evaluating a network infrastructure by its ability to support its own operation without the help of an administrator.







What do customers want from their data center network infrastructure? It must, of course, be efficient, reliable and easy to maintain. Ideally, the network would set itself up and maintain itself. Modern SDN controllers can do more and more, but how do we assess their level of automation? How should this autonomy be classified?



To answer these questions, we turned to the consulting company IDC and asked it to conduct a study that would show how to characterize the management autonomy of a particular network and how to evaluate the effectiveness of such an implementation. Our colleagues at IDC took up the proposal and came to some interesting conclusions.







It is worth starting with the context, namely the wave of total digitalization sweeping the world. It requires the modernization of both infrastructure and workflows, and the driving force behind this transformation is cloud computing.



At the same time, you shouldn't think of the cloud as just a place to run workloads. It is also a particular approach to work that implies a high level of automation. According to IDC analysts, we are entering an "era of multiplied innovation." Companies are investing in technologies such as artificial intelligence, the Internet of Things, blockchain, and natural interfaces. But the ultimate goal is the autonomy of systems and infrastructures. It is in this context that the prospects for data center networks should be assessed.







The diagram shows the process of network automation, which is divided into several sequential stages. It starts with the command line interface and scripting. The next step is the introduction of network fabrics to improve speed and performance. Then comes the turn of SDN controllers and virtualization tools. At this stage, tools for orchestrating and automating data center networks are also introduced.



The move to intent-based networking adds a new dimension. But the ultimate goal of this progress is a fully autonomous network controlled by artificial intelligence. All market participants are working on this problem in one way or another.



What is network autonomy, and how can it be evaluated? IDC has proposed a six-level model that makes it possible to assign a specific solution to a particular level of autonomy.



  • Level 0. The network is managed only through manual processes throughout its entire life cycle. The network is not automated.
  • Level 1. Network management is still largely manual throughout the network life cycle.
  • Level 2. Partial automation appears in some scenarios, combined with standard analysis and policy management tools.
  • Level 3. "Conditional automation". The system can already issue recommendations and instructions, which the operator accepts or rejects.
  • Level 4. . . .
  • Level 5. . , .








What are the main challenges facing a data center innovator? According to IDC's data, compiled from interviews with IT experts, first place goes to aligning network automation with compute and storage automation, and second to flexibility, that is, the ability of the network to support mixed workloads and environments.



In third place is the problem of automating a network infrastructure that, as is usually the case, is assembled from the products of various vendors. This requires a management tool that can pull together the entire zoo of solutions and make it work at the required level of autonomy. At the same time, 90% of those surveyed agree that achieving network autonomy is a goal of their organizations.



IDC research shows that autonomous network management is a hot trend, in which up to half of all companies developing their IT infrastructure are involved in one way or another.







Let's take a financial sector company as an example of digital transformation. Over the past year, offline sales have declined dramatically, and financial institutions have been among the first to respond to this.



Companies quickly moved much of their activity into apps and organized digital sales there. This made it possible to compensate for the drop in the offline channel in a short time and to preserve revenue. Automation also made it possible to minimize errors made by company employees and to speed up a large share of business processes.







At the same time, innovations in customer service have made the IT infrastructure more complex and increased the frequency of changes made to it. Up to 50% of the complex problems currently registered in data centers are caused, to one degree or another, by limited network resources and by the limited resources of the administration team.



Most of the time, employees are busy with routine operations, while the load associated with introducing new services keeps growing. New services require testing, checks for interference with other services, and so on. Any rollout carries the risk of breaking what is already working. As a result, the staff is overwhelmed.



Perhaps this explains the following figure: up to 40% of complex data center problems are caused by human error. Any change in the network, such as launching new applications or deploying services, requires a lot of attention and numerous checks, for which there is not always enough working time. The result can be a serious data center outage.



How much time is spent on solving a given problem? Our data suggests that, on average, it takes almost 80 minutes just to detect a fault. And these faults are not always associated with physical devices. They can also arise at the protocol level, in service availability, and so on.



As a result, network support works day and night but still becomes the target of numerous complaints. Many of those complaints would have no cause if the data center network had some degree of autonomy.







Let's go back to the classification of autonomy levels proposed by IDC. Here is a list of the capabilities that the network should demonstrate at each of these levels. The Huawei Autonomous Driving Network solution meets all the requirements of the third level. It can maintain its own operation in fully automatic mode, including starting and stopping processes, configuring equipment, and so on. In addition, our ADN fully meets the awareness criterion: it receives real-time information about the state of devices, processes, applications and services.



In semi-automatic mode, ADN can analyze what is happening on the network, identify the causes of events and suggest how to eliminate them. By 2023, we plan to add a feedback feature to ADN's capabilities.



The management system will learn to cope with network problems using practices that have proven effective in other similar infrastructures, including those owned by other companies.



According to our roadmap, by 2028 we will have a system that fully meets the fifth level of autonomy.







What will be the effect of introducing autonomous network management? Let's start with network design. With Huawei Autonomous Driving Network, the customer does not need to create the architecture or design manually, or to configure the devices by hand. The system only asks how many devices and how many links of a given bandwidth should be used. It then automatically assembles the network infrastructure and offers it as a turnkey solution. The customer immediately receives a fully operational data center fabric.
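The design step described above can be illustrated with a minimal sketch. It is not Huawei's implementation: the leaf-spine layout, the `build_fabric` function and the `Link` type are all assumptions made purely for illustration. The only thing taken from the text is the idea that the operator supplies device counts and a link bandwidth and gets a wired topology back.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Link:
    """One leaf-to-spine connection (hypothetical model, not an ADN type)."""
    leaf: str
    spine: str
    gbps: int


def build_fabric(spines: int, leaves: int, link_gbps: int) -> list[Link]:
    """Connect every leaf to every spine, assuming a leaf-spine design."""
    if spines < 1 or leaves < 1 or link_gbps <= 0:
        raise ValueError("device counts and bandwidth must be positive")
    spine_names = [f"spine{i + 1}" for i in range(spines)]
    leaf_names = [f"leaf{i + 1}" for i in range(leaves)]
    return [Link(leaf, spine, link_gbps)
            for leaf, spine in product(leaf_names, spine_names)]


# The operator states only the intent: 2 spines, 4 leaves, 100 Gbit/s links.
fabric = build_fabric(spines=2, leaves=4, link_gbps=100)
print(len(fabric), "links,", sum(link.gbps for link in fabric), "Gbit/s in total")
```

A real system would also have to generate addressing, routing and device configuration; the sketch stops at the wiring plan.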



But obtaining the network infrastructure is not enough. It must keep virtual machines, applications and other processes running, each of which has its own bandwidth requirements for particular channels. An autonomous network can analyze the load and recommend the optimal organization of traffic flows.



During operation, ADN constantly monitors how traffic passes through the network and, among other things, detects how different services affect one another. This makes it possible to improve network quality in real time by eliminating bottlenecks as they emerge.



Optimization is carried out continuously. If the system detects a deterioration in service, it immediately informs the operator, who only needs to approve a prepared decision. If, for example, ADN notices that an optical module is degrading, it will count the number of processes affected by the problem and offer to switch to the backup channel.
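The optical module example can be sketched in a few lines. This is a hypothetical illustration, not ADN's API: the flow names, link names and both functions are invented here. It only mirrors the behavior described above, where the system counts what is affected, prepares a proposal, and leaves the decision to the operator.

```python
def recommend_failover(flows, degraded_link, backup_link):
    """Build a recommendation for the operator; nothing is applied here."""
    affected = sorted(name for name, link in flows.items()
                      if link == degraded_link)
    proposal = (f"move {len(affected)} flow(s) to {backup_link}"
                if affected else "no action needed")
    return {
        "degraded_link": degraded_link,
        "affected_count": len(affected),
        "affected_flows": affected,
        "proposal": proposal,
    }


def apply_if_approved(flows, recommendation, backup_link, approved):
    """Return a new flow-to-link mapping, changed only if the operator approves."""
    if not approved:
        return dict(flows)
    moved = set(recommendation["affected_flows"])
    return {name: (backup_link if name in moved else link)
            for name, link in flows.items()}


# Hypothetical inventory: which link each service currently uses.
flows = {"billing": "link-a", "crm": "link-a", "backup-job": "link-b"}
rec = recommend_failover(flows, degraded_link="link-a", backup_link="link-c")
print(rec["proposal"])
new_flows = apply_if_approved(flows, rec, "link-c", approved=True)
```

Keeping the recommendation and its application in separate functions reflects the "conditional automation" of Level 3: the system proposes, the operator accepts or rejects.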



All of these capabilities allow ADN to play an extremely important role: it saves the time technical staff spend supporting the network and frees them up for higher-level tasks.







The strength of Huawei Autonomous Driving Network is that it is not just software that can be installed and maintained. The system implements a three-tier model whose base tier resides in the processors of the switching and routing devices themselves. These hardware and software elements collect and analyze data and forward flows and frames. A switch equipped with such a processor transmits information in real time to the software platform, which in our case is iMaster NCE.



It is the architecture of our ADN that sets it apart from comparable products. Integration with hardware elements allows uniquely in-depth analysis, which makes it possible to automate network design, device installation and similar processes. You can, for example, create a "virtual twin" of an application and verify the service against the existing infrastructure. The result is a detailed report that includes a list of potential problem areas.



It remains to note that ADN is a service-oriented solution that makes extensive use of cloud technologies. As mentioned above, at the fifth level of autonomy the network must be able to use fault-handling algorithms built on the experience of other customers and industry experts. It is from the cloud that ADN will soon learn to obtain solutions for network problems identified by their signatures.



The approaches used to create ADN bring us back to our 1-3-5 principle: any problem in the network must be identified within one minute, localized within three minutes, and fixed within five minutes.
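As a small worked example, the 1-3-5 principle can be expressed as a check on an incident timeline. The sketch assumes that each target applies to its own stage rather than to cumulative time from the start of the incident; the text does not say which reading is intended. The names `TARGETS` and `check_1_3_5` and the sample incident are hypothetical.

```python
from datetime import timedelta

# Per-stage targets from the 1-3-5 principle (assumed to be per stage).
TARGETS = {
    "detect": timedelta(minutes=1),
    "locate": timedelta(minutes=3),
    "fix": timedelta(minutes=5),
}


def check_1_3_5(durations):
    """Map each stage to True if its duration is within the target."""
    return {stage: durations[stage] <= limit
            for stage, limit in TARGETS.items()}


# Invented incident: detection was fast, localization overran its target.
incident = {
    "detect": timedelta(seconds=45),
    "locate": timedelta(minutes=4),
    "fix": timedelta(minutes=5),
}
result = check_1_3_5(incident)
print(result)
```

For comparison, the average detection time of almost 80 minutes cited earlier would fail the first check by a wide margin.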







To summarize: ADN is, of course, the successor to SDN solutions. SDN was a necessary stage in the development of the technology, but it had some drawbacks. First, software-defined networks required manual initial configuration of devices. Second, identifying errors also fell on the shoulders of network support specialists. Third, with SDN there was no question of automatically applying recovery scripts obtained from a cloud-based knowledge base. With its ADN solution, Huawei aims to free customers from these tasks so that they can focus on what really needs attention.


