Checking the Internet speed with the Requests library in multiprocessing

Greetings, dear Habr readers!



Today I'll talk about how the idea of measuring connection speed grew into a script that downloads an image file from a server and uploads it back, timing each function and calculating the transfer speed.



I'll start with the libraries used:



import os
from multiprocessing import Pool
import time
import pandas as pd
import requests


Next, we need a list of servers; I chose a list of dictionaries for this:



server_list = [
    {
        'server_id': 3682,
        'download': 'http://moscow.speedtest.rt.ru:8080/speedtest/random7000x7000.jpg',
        'upload': 'http://moscow.speedtest.rt.ru:8080/speedtest/upload.php'
    }
]


Let's write the first function:



def download(id, path):
    start = time.time()
    # Prefix the file name with the server id so that simultaneous
    # downloads from several servers do not clash on disk
    file_name = str(id) + str(path.split('/')[-1])
    try:
        r = requests.get(path, stream=True, timeout=5)
    except requests.RequestException:
        return 0
    # File size in bytes, as reported by the server
    size = int(r.headers.get('Content-Length', 0))
    with open(file_name, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)

    end = time.time()
    duration = end - start
    # bytes -> bits (*8) -> Mbit (/1024/1024), divided by seconds = Mbit/s
    sp = (((size * 8) / 1024) / 1024) / duration

    return sp


Now more about what is happening.



The function records a start and an end time (in seconds), from which we later get the elapsed duration. We prepend the server id to the image name when building the file name (so there are no conflicts when downloading from multiple sources at once). Next we make a GET request, read the file size (in bytes) from the Content-Length header, and save the body to disk. Converting bytes to bits and dividing by the elapsed time gives the speed in Mbit/s.
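To see the conversion in isolation, here it is factored out as a small helper with hypothetical numbers (not part of the original script):

```python
def speed_mbit_s(size_bytes, duration_s):
    # bytes -> bits (*8), bits -> Mbit (/1024/1024), then per second
    return (((size_bytes * 8) / 1024) / 1024) / duration_s

# A hypothetical 10 MiB file transferred in 4 seconds:
print(speed_mbit_s(10 * 1024 * 1024, 4))  # 20.0 Mbit/s
```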



The next function is to upload a file to the server:



def upload(id, path):
    start = time.time()
    file_name = str(id) + 'random7000x7000.jpg'
    # Read the previously downloaded file into memory for the POST body
    with open(file_name, 'rb') as f:
        files = {'Upload': (file_name, f.read())}
    try:
        requests.post(path, files=files)
    except requests.RequestException:
        return 0
    size = os.path.getsize(file_name)
    end = time.time()
    duration = end - start
    # bytes -> bits (*8) -> Mbit (/1024/1024), divided by seconds = Mbit/s
    sp = (((size * 8) / 1024) / 1024) / duration

    return sp


The principle is the same, except that we take the file from the local folder and send it with a POST request.
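Note that f.read() loads the whole file into memory before sending. If memory is a concern, requests also accepts a generator as the request body and streams it with chunked transfer encoding (a sketch, not the author's code; the server would have to accept a raw request body instead of multipart form data):

```python
def file_chunks(path, chunk_size=1024 * 1024):
    # Yield the file piece by piece so the full payload never sits in memory
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Usage sketch (assumes the endpoint accepts a raw request body):
# requests.post(upload_url, data=file_chunks(file_name), timeout=5)
```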



Our next task is to get data from the two previous functions. Let's write one more function:



def test_f(server):
    speed_download = download(server['server_id'], server['download'])
    speed_upload = upload(server['server_id'], server['upload'])
    return server['server_id'], speed_download, speed_upload
    


All that's left is to add multiprocessing with a pool and the parallel map function:



def main():
    pool = Pool()
    data = pool.map(test_f, server_list)

    df = pd.DataFrame(data, columns=['Server', 'Download', 'Upload'])
    print(df)

    pool.close()
    pool.join()




if __name__ == '__main__':
    main()


The script is ready to use. For convenient output I used the pandas library. You could also write the output to a database and collect statistics for analysis.
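As a sketch of the database idea, the DataFrame could be appended to a SQLite table (the file name speedtest.db and the table name results are my own placeholders):

```python
import sqlite3

import pandas as pd

def save_results(df, db_path='speedtest.db'):
    # Append the measurements; the table is created on first use
    with sqlite3.connect(db_path) as conn:
        df.to_sql('results', conn, if_exists='append', index=False)
```

Each run then adds one row per server, and the accumulated history can be read back with pd.read_sql for analysis.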



Thank you for your attention!



UPD: Fixed the exception handling, reworked the multiprocessing (replaced the loop with a parallel map), and added a timeout for the GET request.


