More than 10 years ago, Microsoft made the Azure platform available to a wide audience. Since then, many companies have looked to cloud infrastructure to solve their day-to-day IT problems, and some of them have asked us for advice on deploying systems in the cloud. Today, businesses are considering moving even resource-intensive systems such as SAP HANA to the cloud. We have already delivered several such projects and want to share the observations that can make an SAP deployment in the MS Azure cloud more successful. Not all of them will be news, but we wanted to show how familiar approaches apply on a cloud platform. Here are the main lessons we learned.
# 1: Consistently Optimize IT Architecture Using the Cloud
While migrating a production system, we ran into high resource consumption in testing and development. That made us ask why the Test and Dev environments need so many resources, and how to optimize the infrastructure consumed by the production system as well.
The recommended approach to building an SAP system starts with a capacity assessment using the SAP Quick Sizer calculator. SAP methodologies are still based on standard approaches that do not take the specifics of cloud technologies into account. We received the requirements from the Customer, entered the initial data into the SAP estimators, and obtained a preliminary landscape configuration. With conventional infrastructure, we could have stopped there and moved on to purchasing equipment, but in our case we could take advantage of the cloud. In the cloud, resources can be added at any time, so we dropped the excess capacity reserve built into the estimate and deployed lower-performance machines. This reduced costs, and as the load grows we can always scale up the SAP virtual machines in minutes.
For production SAP HANA workloads, Microsoft provides support only on certified VM sizes, which in our projects meant M-series VMs. Using test resources with the same support level as the production environment seemed redundant to us at the initial development stage.
At the same time, E-series machines have characteristics similar to the M-series but cost significantly less, so we replaced the test machines with E-series VMs. The downside of this replacement is that responsibility for operating the system in test environments shifts from the provider to the integrator, which means the integrator must have SAP expertise.
# 2: How to save on resource consumption
MS Azure lets you save significantly by reserving resources with a prepayment for one or three years.
At the initial stage, however, the customer often cannot accurately estimate the launch date of the production system, because development and testing depend on many variables on the side of the business or the contractor's developers.
For example, at the start of one project we planned to deploy all environments simultaneously, based on the Customer's plans at that time. As is often the case, development required longer coordination with the business, which delayed the production launch by several months.
In this example, reserving prepaid resources up front would have cost the Customer a significant amount of money for capacity that sat unused. Resources should of course be reserved, but it is more efficient to do so at a later stage of the project, when the bulk of the production system has stabilized and resource consumption has become predictable. A three-year reservation of computing resources can often save about 70% compared to the Pay-As-You-Go payment method.
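The trade-off described above can be put into numbers with a minimal sketch. The hourly rate and the four-month delay below are hypothetical illustration values, not figures from our projects; only the roughly 70% discount comes from the text. The sketch assumes the simplest billing model, where a reservation is charged at the discounted rate for every hour of the term whether or not the VM runs.

```python
HOURS_PER_YEAR = 8760


def payg_cost(hourly_rate: float, hours_used: float) -> float:
    """Pay-As-You-Go: you pay only for the hours the VM actually runs."""
    return hourly_rate * hours_used


def reserved_cost(hourly_rate: float, discount: float, term_hours: float) -> float:
    """Reservation: the discounted rate is charged for the whole term, used or not."""
    return hourly_rate * (1.0 - discount) * term_hours


def break_even_utilization(discount: float) -> float:
    """Share of the term the VM must run for the reservation to cost the same as PAYG."""
    return 1.0 - discount


def idle_reservation_cost(hourly_rate: float, discount: float, idle_hours: float) -> float:
    """Money spent on reserved capacity during hours when nothing was deployed."""
    return hourly_rate * (1.0 - discount) * idle_hours


if __name__ == "__main__":
    rate = 10.0            # hypothetical PAYG price per hour
    discount = 0.70        # approximate 3-year reservation discount from the text
    delay_hours = 4 * 730  # hypothetical launch delay of about four months

    term_hours = 3 * HOURS_PER_YEAR
    print("Reserved, full term:", round(reserved_cost(rate, discount, term_hours)))
    print("PAYG, full term:    ", round(payg_cost(rate, term_hours)))
    print("Break-even usage:   ", round(break_even_utilization(discount), 2))
    print("Paid while idle:    ", round(idle_reservation_cost(rate, discount, delay_hours)))
```

Under this model a reservation pays off only if the machine is actually needed for more than about 30% of the term, and every month of launch delay is paid for in full at the reserved rate.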
# 3: How to choose an Azure region
Azure offers a wide range of hosting regions. One of the main criteria for choosing a region is the distance between your infrastructure and your end users, which affects system response times, the operation of integrations, and the user experience.
The second, no less important criterion is the list of services available in a particular region.
Service availability depends on the region. Services are very often offered as a preview before the official release, and some regions introduce new technologies sooner, so you can try out a newly released service there first. Not every region offers the full range of virtual machine series.
In our practice, access speed comparisons often favor the West Europe region. This is especially noticeable for companies whose servers and clients are located in the European part of Russia. In each specific case, and especially if your data centers and customers are located in the Far East, it makes sense to measure latency from your data center (or from the geographic region where your customers are located) in order to choose the best Azure region.
Services such as Azure Latency Test let you measure latency from your data center network to each Azure region in advance.
[Figure: Azure Latency Test results measured from our Moscow office]
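If you would rather measure from a server in your own data center than from a browser, a rough comparison can be scripted. The sketch below is not how Azure Latency Test works internally; it simply times TCP handshakes to one endpoint per region and sorts the regions by median latency. The host names are placeholders and must be replaced with endpoints you actually host in each candidate region (for example, a test VM or a storage account).

```python
import socket
import statistics
import time
from typing import Dict, List, Tuple


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Time a single TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def median_latency_ms(host: str, port: int = 443, samples: int = 5) -> float:
    """Median of several handshakes, which smooths out one-off spikes."""
    return statistics.median(tcp_connect_ms(host, port) for _ in range(samples))


def rank_regions(
    endpoints: Dict[str, Tuple[str, int]], samples: int = 5
) -> List[Tuple[str, float]]:
    """Return (region, median latency in ms) pairs, fastest region first."""
    results = []
    for region, (host, port) in endpoints.items():
        try:
            results.append((region, median_latency_ms(host, port, samples)))
        except OSError:
            # Unreachable or unresolvable endpoints are skipped, not ranked.
            continue
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder host names: substitute endpoints deployed in each region.
    candidates = {
        "West Europe": ("westeurope-test.example.invalid", 443),
        "North Europe": ("northeurope-test.example.invalid", 443),
    }
    for region, latency in rank_regions(candidates):
        print(f"{region}: {latency:.1f} ms")
```

A TCP handshake is only an approximation of application latency, so treat the ranking as a first filter and confirm it with real SAP client traffic.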
# 4: How to Apply On-Premises Methods to Cloud Installations
In every migration, we ask ourselves how to reuse in the cloud the traditional solutions that have proven themselves on classic infrastructure. In particular, when preparing a solution we rely on the vendor's recommendations to arrive at a technically sound design. SAP HANA projects are no exception and are also reviewed against recommendations and best practices.
On one project, a Windows-based jump server was deployed during the first stage of the solution. To optimize costs at the initial development stage, an NFS server for the non-production environments was deployed on the same machine. It covered the developers' needs at the time and saved a significant amount of resources.
As time went on, the environments and their resource requirements grew, and the NFS server still coped with all its tasks. Gradually, however, we approached the limits of the initial VM. VM resources in MS Azure can be increased in minutes, but by then the fault tolerance requirements for the server had also grown, which made us consider a larger-scale reconfiguration.
For the implementation we used Linux servers, the DRBD service, and the Availability Set functionality. This solved data replication between the nodes of the NFS cluster and kept the service available if one of the two cluster nodes failed.
By the way, a couple of months after we implemented the cluster solution, the Azure NetApp Files service was added to Azure. It lets you take advantage of NetApp arrays while paying according to the Pay-As-You-Go model.
# 5: How to Automate the VM Schedule
With any cloud infrastructure, it always makes sense to analyze which additional mechanisms can be used to maximize cost savings.
In our case, the systems are tested during business hours. In a conventional infrastructure, idle servers mainly translate into higher energy bills, whereas in the cloud idle servers cost money for rented capacity. We evaluated the load graphs of the testing and development servers and noticed that the overwhelming majority of developers and testers use the systems on weekdays from 08:00 to 20:00.
Where the load on non-production systems is predictable and cyclical, we try to use scripts to automate switching the VMs on and off. Azure offers several tools for this: Auto-Shutdown, Automation Accounts, and Cloud Shell. Not all of them suited us. Auto-Shutdown was ruled out because it can only shut a VM down, not start it. Cloud Shell was not used either, because it would require us to prepare additional scripts, develop schedules, and store all of this somewhere safe with redundancy, which would have reduced the savings to a minimum.
We chose the more flexible mechanism. Automation Accounts offers a ready-made, working solution in the form of runbooks that turn virtual machines on and off on a schedule.
We prepared schedules for the respective resources, loaded the runbooks, and linked the schedules to the resources.
As a result, we further reduced the total cost of ownership. These tools were not part of the original plan, but they were put in place as early as the pilot operation stage.
Changing a schedule takes a few minutes. First a new schedule is created, then it is associated with the required resource. If a large number of changes is needed, Cloud Shell can be used to automate this process.
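The weekday 08:00 to 20:00 window described above can be expressed as a small decision function, together with the share of running hours such a schedule removes. This is only an illustration of the schedule logic, not the runbook code from Automation Accounts; the function names are our own, and the calculation ignores holidays and time zones.

```python
from datetime import datetime, time as dtime

WORK_START = dtime(8, 0)   # schedule start from the text
WORK_END = dtime(20, 0)    # schedule end from the text
WORK_DAYS_PER_WEEK = 5
HOURS_PER_WEEK = 7 * 24


def should_be_running(now: datetime) -> bool:
    """True on weekdays between 08:00 (inclusive) and 20:00 (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and WORK_START <= now.time() < WORK_END


def weekly_running_hours() -> int:
    """Hours per week a VM stays on under the schedule."""
    return WORK_DAYS_PER_WEEK * (WORK_END.hour - WORK_START.hour)


def running_hours_saved_fraction() -> float:
    """Share of weekly hours during which the VM is switched off."""
    return 1.0 - weekly_running_hours() / HOURS_PER_WEEK


if __name__ == "__main__":
    print("Running now?", should_be_running(datetime.now()))
    print("Hours on per week:", weekly_running_hours())
    print("Share of hours saved:", round(running_hours_saved_fraction(), 2))
```

With this window a VM runs 60 of the 168 hours in a week, so roughly 64% of compute hours disappear. The saving applies to compute charges only and only when the VM is deallocated rather than merely shut down from inside the operating system; disks continue to be billed.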
# 6: Monitoring resource consumption
In conventional data centers, monitoring of system health and resource consumption unfortunately tends to fade into the background. In Azure, you should not let that happen. The history of resource consumption directly determines which cost optimizations are available to you, and if resources were reserved early, it can signal that the architecture needs to be improved.
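As a minimal sketch of how consumption history can drive such decisions, the function below classifies a VM by the 95th percentile of its CPU utilization samples. The 20% and 80% thresholds are assumptions chosen for illustration, not recommendations from Microsoft or SAP, and for SAP HANA memory is usually at least as important as CPU.

```python
import math
from typing import Sequence


def percentile_95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of the samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def sizing_hint(cpu_percent: Sequence[float], low: float = 20.0, high: float = 80.0) -> str:
    """Suggest a direction for resizing based on sustained CPU load.

    The low/high thresholds are hypothetical and should be tuned per workload.
    """
    peak = percentile_95(cpu_percent)
    if peak < low:
        return "downsize candidate"
    if peak > high:
        return "upsize candidate"
    return "keep"


if __name__ == "__main__":
    # Made-up utilization samples, in percent, for illustration only.
    history = {
        "dev-app": [5, 10, 12, 8, 15],
        "test-db": [30, 40, 50, 60, 70],
    }
    for vm_name, samples in history.items():
        print(vm_name, "->", sizing_hint(samples))
```

Using a high percentile rather than the average keeps short but regular peaks, such as nightly batch jobs, from being hidden by long idle periods.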
# 7: Safety net in case of force majeure
Many companies hesitate to host their IT systems on cloud resources for fear of the access blocking that regulators have imposed in the past. Like any other risk, this one can be accounted for when designing a system. In our projects we usually set up a weekly export of system backups from the cloud to the Customer's data center, so that the system can be restored whatever happens. In addition, we use a multi-cloud strategy so that access restrictions do not leave the Customer without resources. In this scenario, alternative cloud providers serve as DR sites for the main cloud, which allows the system to be restored even in the event of large-scale blocking.
# 7+: Personal data processing
Over the course of our work, we have developed an approach to storing and processing personal data in systems that include foreign cloud resources. Note that this approach must be implemented in line with the regulators' requirements. The topic is quite extensive, and we will try to cover it in future articles if we see interest in the comments.
Outcome
In this article, we reviewed several practical lessons on hosting SAP in the MS Azure cloud. The topic is extremely broad, and we were able to touch on only some of the possible optimizations when migrating to the cloud.
What nuances have you encountered? We would be grateful if you shared your experience in the comments.