How we sped up a cluster for heavily loaded Microsoft SQL databases and reached the coveted 200,000 IOPS

Over the past year, we focused on the performance of large, heavily loaded databases in our cloud. At first glance, we had only 2 options: inexpensive storage systems with slow disks, or very expensive storage systems with fast ones.





We wanted to speed up heavily loaded Microsoft SQL databases and at the same time offer our clients the service at a favorable price. As a result of the tests, we assembled the "Cluster for loaded Microsoft SQL databases in the cloud" solution. Today we will take a look inside and add some technical background and specific numbers.





This post does not claim to be a deep dive and does not cover every technical nuance; it only presents the results of our testing. I will show which hardware, software, and network configuration we ran the database performance tests on, how we tested, and what results we got.





Problem statement: how to test database performance

How we chose the hardware for the cluster. At the start, we looked for servers with the following characteristics:

  • 1U form factor rather than 2U.
  • 10 U.2 NVMe drive bays.
  • Support for Intel Optane DC Persistent Memory.
  • Presence on the Microsoft hardware compatibility list (HCL).









We chose the Supermicro 1029U-TN10RT: a 1U server with 2 Intel Xeon Scalable processors.

Configuration:

- Platform – Supermicro Ultra 1U SYS-1029U-TN10RT.
- CPU – 2 x Intel Xeon Gold 6246 (3.3GHz, 12C).
- Storage – 10 x Intel DC P4510 1TB NVMe SSD, 1DWPD.
- DRAM – 12 x 64GB DDR4-2666.
- Persistent Memory – 2 x 128GB DDR4-2666 Intel Optane DC PMMs.
- Network – 2 x 25GbE Mellanox ConnectX-4 Lx.

2.5-inch NVMe drive bays: 10 x U.2.





Storage: Windows Server 2019 Storage Spaces Direct rather than hardware RAID.

Resiliency: 3-way mirroring, which keeps 3 copies of the data.

Fault domain: StorageRack.
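To make the cost of 3-way mirroring concrete, here is a minimal sketch of the capacity arithmetic. It assumes that usable capacity is simply raw capacity divided by the number of copies, and it ignores reserve capacity and metadata overhead. The node count of 3 is an assumption used for illustration; the drive count and size come from the configuration list above. The function name is hypothetical.

```python
def usable_capacity_tb(nodes, drives_per_node, drive_tb, copies=3):
    """Rough usable capacity of a mirrored pool, in TB.

    Assumes every byte is stored `copies` times and ignores
    reserve capacity and metadata overhead.
    """
    raw_tb = nodes * drives_per_node * drive_tb
    return raw_tb / copies


# Assumed example: 3 nodes (an assumption), 10 x 1 TB NVMe drives per node.
print(usable_capacity_tb(nodes=3, drives_per_node=10, drive_tb=1.0))
```

Under these assumptions, only about a third of the raw NVMe capacity is available for volumes.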





Network: RDMA, using Mellanox ConnectX-4 Lx adapters with RoCEv2 (RDMA over Converged Ethernet).





Thanks to RoCE, transport processing is offloaded from the CPU to the network adapter.

[Figure: how RoCE works. Source: Mellanox.]

Testing: we used Microsoft's VMFleet and FIO.





The VMFleet run used 150 VMs with 40 GB disks, 50 VMs per volume. CPU oversubscription was 4:1 and CPU load was 60%. There were 3 volumes of 3 TB each.
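The pattern labels used in the results (for example "t1, o32, b16k") appear to follow the parameter letters of DiskSpd, the load generator that VMFleet runs inside each VM: threads, outstanding I/Os, and block size. The sketch below shows how such a label could be turned into a DiskSpd command line. The duration, the caching and latency flags, the target file name, and the function names are assumptions for illustration, not the settings used in these tests.

```python
import re


def parse_pattern(label):
    """Split a label such as 't1, o32, b16k' into threads, queue depth, block size."""
    names = {"t": "threads", "o": "outstanding"}
    parsed = {}
    for token in label.replace(" ", "").split(","):
        match = re.fullmatch(r"([tob])(\d+)([kKmM]?)", token)
        if match is None:
            raise ValueError(f"unrecognized pattern token: {token!r}")
        key, number, unit = match.groups()
        if key == "b":
            parsed["block"] = number + unit.upper()  # e.g. "16K"
        elif unit:
            raise ValueError(f"unit suffix only makes sense for block size: {token!r}")
        else:
            parsed[names[key]] = int(number)
    return parsed


def diskspd_args(label, write_pct, random_io=True, duration_s=60, target="testfile.dat"):
    """Build a DiskSpd argument list (hypothetical helper, assumed defaults)."""
    p = parse_pattern(label)
    args = ["diskspd", f"-t{p['threads']}", f"-o{p['outstanding']}", f"-b{p['block']}"]
    if random_io:
        args.append("-r")  # random I/O; DiskSpd is sequential without it
    # -w: write percentage, -d: duration in seconds,
    # -Sh: disable software and hardware caching, -L: collect latency statistics
    args += [f"-w{write_pct}", f"-d{duration_s}", "-Sh", "-L", target]
    return args


# 90% random read / 10% random write with 16k blocks
print(" ".join(diskspd_args("t1, o32, b16k", write_pct=10)))
```

Passing random_io=False leaves out the -r flag, which would correspond to the sequential read pattern with 2 MB blocks.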





CPU Oversubscription 4:1

Pattern: t1, o32, b16k

| Metrics | 100% Random Read | 90% Random Read / 10% Random Write | 70% Random Read / 30% Random Write |
|---|---|---|---|
| IOPS per Volume | 475000 | 275000 | 169000 |
| Latency per Volume | 0.2 ms | 0.2 ms / 0.4 ms | 0.2 ms / 0.4 ms |
| BW (MB/s) per Volume | 7750 | 4500 | 2750 |
| IOPS per VM | 9500 | 5500 | 3380 |
| BW (MB/s) per VM | 155 | 90 | 55 |
| IOPS per GB | 237 | 137 | 84 |





Pattern: t1, o32, b4k

| Metrics | 100% Random Read | 90% Random Read / 10% Random Write | 70% Random Read / 30% Random Write |
|---|---|---|---|
| IOPS per Volume | 509000 | 282000 | 190000 |
| Latency per Volume | 0.12 ms | 0.12 ms / 0.33 ms | 0.13 ms / 0.36 ms |
| BW (MB/s) per Volume | 2000 | 1150 | 780 |
| IOPS per VM | 10180 | 5640 | 3800 |
| BW (MB/s) per VM | 40 | 23 | 15 |
| IOPS per GB | 254 | 112 | 76 |





Pattern: t1, o32, b2m

| Metrics | 100% Sequential Read |
|---|---|
| BW (MB/s) per Volume | 19000 |
| BW (MB/s) per VM | 380 |
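The per-VM and per-GB rows in the tables above can mostly be derived from the per-volume IOPS. The sketch below shows that arithmetic under two assumptions inferred from the numbers rather than stated outright: 50 VMs per volume and a 40 GB disk per VM. Bandwidth is estimated as IOPS times block size in decimal megabytes. The published values look rounded, so small differences are expected, and not every row matches this simple model exactly.

```python
def derived_metrics(iops_per_volume, block_size_bytes,
                    vms_per_volume=50, disk_gb_per_vm=40):
    """Derive bandwidth, per-VM and per-GB figures from per-volume IOPS.

    vms_per_volume and disk_gb_per_vm are assumed values, not
    confirmed test settings.
    """
    iops_per_vm = iops_per_volume / vms_per_volume
    return {
        "bw_mb_s_per_volume": iops_per_volume * block_size_bytes / 1e6,
        "iops_per_vm": iops_per_vm,
        "bw_mb_s_per_vm": iops_per_vm * block_size_bytes / 1e6,
        "iops_per_gb": iops_per_vm / disk_gb_per_vm,
    }


# 100% random read with 16k blocks: 475000 IOPS per volume
for name, value in derived_metrics(475_000, 16 * 1024).items():
    print(f"{name}: {value:.1f}")
```

For the 16k random read column this gives roughly 7782 MB/s per volume, 9500 IOPS per VM, about 156 MB/s per VM, and 237.5 IOPS per GB, which is close to the values in the table.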





. , , . 2:1 ( 25 ), CPU . : 100% 4 4 16 . .





We can see that the read latency (Read Lat) is quite low.

FIO , .





DBaaS Microsoft SQL . 4 200 000 IOPS 1 100% 4k.





Windows Server 2019 Storage Spaces Direct. !







