Optimization: configuring the Nginx web server to improve the RPS performance of an HTTP API



Before scaling your infrastructure out or up, the first step is to make sure that resources are being used correctly and that the application configuration itself does not bottleneck performance. The main goal of an engineering team is to ensure the continuous, uninterrupted operation of every system it designs and deploys, with minimal resources.



We ran into exactly this issue: our deployed system was used daily by a million users who connected in bursts from time to time. This meant that deploying more servers or scaling the existing ones would not be the best solution in this situation.



This article is about tuning Nginx to improve performance, that is, to increase the RPS (requests per second) of an HTTP API. It describes the optimizations we applied to our deployed system so it could process tens of thousands of requests per second without wasting a huge amount of resources.



Action plan: run the HTTP API (written in Python using Flask) proxied behind Nginx; high throughput is required, and the API content changes once a day.
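
For context, the application behind the proxy can be as small as the sketch below. This is a hypothetical stand-in (the real API served different content); the module name api and the app object match the api:app reference passed to Gunicorn later.

# api.py: minimal Flask stand-in for the real API
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/resource")
def resource():
    # the real handler returned content that changes once a day
    return jsonify({"status": "ok"})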



optimization, noun:

the process of achieving the best result; the most efficient use of a situation or resource.


We used Supervisor to start the WSGI server with the following configuration:



  • Gunicorn with Meinheld workers
  • Number of workers: number of CPUs * 2 + 1
  • Bind to a Unix domain socket instead of an IP address; this slightly increases speed.


The supervisor command looks like this:



gunicorn api:app --workers=5 --worker-class=meinheld.gmeinheld.MeinheldWorker --bind=unix:api.sock
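
Under Supervisor, the corresponding program section might look like this sketch (the program name and working directory are assumptions):

[program:api]
; assumed program name and working directory
command=gunicorn api:app --workers=5 --worker-class=meinheld.gmeinheld.MeinheldWorker --bind=unix:api.sock
directory=/srv/api
autostart=true
autorestart=true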


We tried optimizing the Nginx configuration and checked what worked best for us.



To evaluate the performance of the API, we used wrk (20 threads, 200 open connections, 20-second duration) with the following command:



wrk -t20 -c200 -d20s http://api.endpoint/resource


Default configuration



We first performed load testing of the API without any changes and got the following statistics:



Running 20s test @ http://api.endpoint/resource
  20 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   192.48ms  274.78ms   1.97s    87.18%
    Req/Sec    85.57     29.20   202.00     72.83%
  33329 requests in 20.03s, 29.59MB read
  Socket errors: connect 0, read 0, write 0, timeout 85
Requests/sec:   1663.71
Transfer/sec:      1.48MB


Updating the default configuration



Let's update the default Nginx config, i.e. nginx.conf, located at /etc/nginx/nginx.conf:



worker_processes auto;
# should equal the number of CPU cores; you can use `grep processor /proc/cpuinfo | wc -l` to find it.
# `auto` detects this automatically.

events {
    worker_connections 1024;
    # default is 768; find the optimum value for your server with `ulimit -n`
}

http {
    access_log off;
    # to boost I/O on HDD we can disable access logs;
    # this stops Nginx from logging every request to `access.log`.

    keepalive_timeout 15;
    # default is 65;
    # the server closes idle keep-alive connections after this time (in seconds)

    gzip on;
    # must be enabled for the gzip_* directives below to take effect
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 2;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_min_length 256;
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    # compression reduces the amount of data sent over the network
}
nginx.conf (/etc/nginx/nginx.conf)



After making the changes, run the configuration check:



sudo nginx -t


If the check succeeds, restart Nginx to apply the changes:



sudo service nginx restart
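
Alternatively, a graceful reload re-reads the configuration without dropping in-flight connections:

sudo nginx -s reload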


With this configuration, we performed load testing of the API and got the following result:



Running 20s test @ http://api.endpoint/resource
  20 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   145.80ms  237.97ms   1.95s    89.51%
    Req/Sec   107.99     41.34   202.00     66.09%
  42898 requests in 20.03s, 39.03MB read
  Socket errors: connect 0, read 0, write 0, timeout 46
  Non-2xx or 3xx responses: 2
Requests/sec:   2141.48
Transfer/sec:      1.95MB


These changes reduced the number of timeouts and increased the RPS (requests per second), but not by much.



Adding Nginx Cache



Since, in our case, the endpoint's content is updated only once a day, this creates a suitable environment for caching API responses.



But adding a cache raises the problem of invalidating it, which is one of the two classic difficulties:

There are only two hard things in computer science: cache invalidation and naming things. - Phil Karlton



We chose a simple solution: clear the cache directory with a cron job after the content is updated on the backend system.
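
A minimal sketch of such a cron job, assuming the daily content update finishes by midnight (the schedule and the exact purge command are placeholders):

# crontab entry: purge the Nginx cache at 00:05 every day,
# right after the daily content update
5 0 * * * rm -rf /data/nginx/cache/*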



From here on, Nginx will do all the hard work, but we need to make sure it is fully ready for it!



To add caching to Nginx, you need to add several directives to the Nginx configuration file.



Before that, we need to create a directory to store the cache data:



sudo mkdir -p /data/nginx/cache


Nginx configuration changes:



proxy_cache_path /data/nginx/cache keys_zone=my_zone:10m inactive=1d;
server {
    ...
    location /api-endpoint/ {
        proxy_cache my_zone;
        proxy_cache_key "$host$request_uri$http_authorization";
        proxy_cache_valid 404 302 1m;
        proxy_cache_valid 200 1d;
        add_header X-Cache-Status $upstream_cache_status;
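        # a proxy_pass to the Gunicorn Unix socket is assumed here, e.g.:
        # proxy_pass http://unix:/path/to/api.sock;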
    }
    ...
}


Caching Proxied Requests (Nginx Configuration)
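
To verify that caching works, inspect the X-Cache-Status header added above; the first request should be a miss and a repeated request a hit:

curl -I http://api.endpoint/resource
# first request:  X-Cache-Status: MISS
# repeat request: X-Cache-Status: HIT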



After this configuration change, we load tested the API and got the following result:



Running 20s test @ http://api.endpoint/resource
  20 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.88ms    5.44ms  88.91ms   81.36%
    Req/Sec     1.59k   500.04     2.95k    62.50%
  634405 requests in 20.06s, 589.86MB read
Requests/sec:  31624.93
Transfer/sec:     29.40MB


Thus, we got an almost 19x increase in performance by adding caching.

Note from a Timeweb expert:



It is important to remember that caching requests that write to the database will return a cached response without actually performing the write.
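
By default, Nginx caches only GET and HEAD requests passing through proxy_cache, which limits this risk. A sketch that states this explicitly in the location block (not part of our original configuration):

# cache only safe, idempotent methods; POST/PUT/DELETE always reach the backend
# (GET HEAD is already the default for this directive)
proxy_cache_methods GET HEAD;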

Nginx cache in RAM (Random Access Memory)



Let's take it one step further! Currently, our cache data is stored on disk. What if we store this data in RAM instead? In our case, the response data is limited in size and not large.



So, first you need to create a directory where the RAM cache will be mounted:



sudo mkdir -p /data/nginx/ramcache


To mount the created directory in RAM using tmpfs, use the command:



sudo mount -t tmpfs -o size=256M tmpfs /data/nginx/ramcache


This mounts /data/nginx/ramcache in RAM with a size limit of 256 MB.
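
You can confirm the mount with df (the exact output varies by system):

df -h /data/nginx/ramcache
# Filesystem      Size  Used Avail Use% Mounted on
# tmpfs           256M     0  256M   0% /data/nginx/ramcache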



If you ever want to disable the RAM cache, simply unmount the directory:



sudo umount /data/nginx/ramcache


To automatically re-mount the cache directory in RAM after a reboot, we need to update the /etc/fstab file. Add the following line to it:



tmpfs /data/nginx/ramcache tmpfs defaults,size=256M 0 0


Note: we must also update the proxy_cache_path directive to point to the ramcache directory (/data/nginx/ramcache).
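
With the same parameters as before, the updated directive looks like this:

proxy_cache_path /data/nginx/ramcache keys_zone=my_zone:10m inactive=1d;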



After updating the configuration, we again performed API load testing and received the following result:



Running 20s test @ http://api.endpoint/resource
  20 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.57ms    5.69ms 277.76ms   92.94%
    Req/Sec     1.98k   403.94     4.55k    71.77%
  789306 requests in 20.04s, 733.89MB read
Requests/sec:  39387.13
Transfer/sec:     36.62MB


Storing the cache in RAM resulted in a significant improvement: almost 24 times the baseline RPS.



Buffered Access Log



We still keep an access log for the proxied application, but Nginx can first collect log entries in a buffer and write them to disk only:



  • if the next log line does not fit into the buffer, or
  • if the buffered data is older than the time specified by the flush parameter.


This reduces the number of disk writes, which would otherwise happen on every request. To enable it, we just need to add the buffer and flush parameters with appropriate values to the access_log directive:



location / {
    ...
    access_log /var/log/nginx/fast_api.log combined buffer=256k flush=10s;
    error_log /var/log/nginx/fast_api.err.log;
}


Buffering the log before it is written to disk



With this configuration, the access logs are first buffered and saved to disk only when the buffer reaches 256 KB or the buffered data becomes older than 10 seconds.



Note: combined here refers to Nginx's predefined combined log format.
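
For reference, the predefined combined format is equivalent to the following log_format definition:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';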



After repeated load testing, we got the following result:



Running 20s test @ http://api.endpoint/resource
  20 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.21ms    3.19ms  84.83ms   83.84%
    Req/Sec     2.53k   379.87     6.02k    77.05%
  1009771 requests in 20.03s, 849.31MB read
Requests/sec:  50413.44
Transfer/sec:     42.40MB


This configuration significantly increased the number of requests per second, about 30 times compared to the initial stage.



Conclusion



In this article, we walked through optimizing the Nginx configuration to improve RPS performance. The RPS increased from 1663 to ~50413 (about 30 times), which provides high throughput. Adjusting the default settings alone can significantly improve system performance.



Let's end the article with a quote:

Make it work, make it right, make it fast. - Kent Beck
