Searching for vulnerabilities in Python code using the open source Bandit tool





Almost every developer has heard that code should be clean. It is just as important to write and use secure code.



Python developers usually install modules and third-party packages so that they can rely on ready-made, proven solutions instead of reinventing the wheel. The problem is that these packages are not always thoroughly tested for vulnerabilities.



Attackers exploit these vulnerabilities, and their purposes are well known. So we must at least be able to detect intrusions into our code. Better yet, we should fix vulnerabilities in advance. To do that, you first need to find them in the code yourself using special tools.



In this tutorial, we will look at how vulnerabilities can appear even in very simple code and how to use the Bandit utility to find them.



The most common vulnerabilities in Python code 



You've probably heard of large websites being hacked and their user data stolen. Perhaps you have personally encountered an attack of some kind. Through vulnerabilities in our code, attackers can gain access to operating system commands or data. Some functions or Python packages may seem safe when you use them locally. However, once the product is deployed on a server, they open the door to attackers.



Modern frameworks and other development tools with built-in protection cope more or less well with the most common attacks. But clearly not with all of them, and not always.



Command injection



Command injection is a type of attack whose purpose is to execute arbitrary commands in the server's OS. It becomes possible, for example, when a process is started with the functions of the subprocess module and values stored in program variables are used as arguments.



In this example, we use the subprocess module to perform an nslookup and get the domain information:



# nslookup.py

import subprocess

domain = input("Enter the Domain: ")

output = subprocess.check_output(f"nslookup {domain}", shell=True, encoding='UTF-8')

print(output)
      
      





What could go wrong here?



The end user is expected to enter a domain, and the script should return the output of the nslookup command. But if you enter the domain name followed by a semicolon and the ls command, both commands will be run:



$ python3 nslookup.py

Enter the Domain: stackabuse.com ; ls

Server:         218.248.112.65

Address:        218.248.112.65#53

Non-authoritative answer:

Name:   stackabuse.com

Address: 172.67.136.166

Name:   stackabuse.com

Address: 104.21.62.141

Name:   stackabuse.com

Address: 2606:4700:3034::ac43:88a6

Name:   stackabuse.com

Address: 2606:4700:3036::6815:3e8d

config.yml

nslookup.py
      
      





Using this vulnerability, you can execute commands at the OS level (after all, we have shell=True).



Imagine what would happen if an attacker ran, for example, cat on /etc/passwd, which lists the user accounts that exist on the system. So using the subprocess module with shell=True on unchecked input can be very risky.
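One way to close this hole can be sketched as follows, assuming the script only needs to accept plain host names: pass the arguments as a list so that no shell is involved, and reject input that does not look like a domain. The names is_valid_domain and lookup and the regular expression are illustrative assumptions, not part of the original script.

```python
# nslookup_safe.py (hypothetical safer variant of nslookup.py)
import re
import subprocess

# Simplified host name pattern: dot-separated labels of letters, digits and
# hyphens that do not start or end with a hyphen. This is an assumption for
# the sketch, not a complete RFC-compliant check.
DOMAIN_RE = re.compile(
    r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)(\.(?!-)[A-Za-z0-9-]{1,63}(?<!-))*"
)


def is_valid_domain(domain: str) -> bool:
    return len(domain) <= 253 and DOMAIN_RE.fullmatch(domain) is not None


def lookup(domain: str) -> str:
    if not is_valid_domain(domain):
        raise ValueError(f"not a valid domain name: {domain!r}")
    # A list of arguments and no shell=True: the input reaches nslookup as a
    # single argument and is never interpreted by a shell.
    return subprocess.check_output(["nslookup", domain], encoding="UTF-8")
```

With this variant, lookup("stackabuse.com ; ls") raises ValueError before any process is started, while lookup("stackabuse.com") runs nslookup as before.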



SQL injection



SQL injection is an attack in which user input is used to construct a SQL statement containing malicious queries. Thanks to the widespread use of ORMs, the number of such attacks has decreased significantly. But if your codebase still has pieces written in raw SQL, you need to know how those SQL queries are constructed. How safe are the arguments you validate and pass to the query?



Let's consider an example:



from django.db import connection

def find_user(username):

    with connection.cursor() as cur:

        # Vulnerable: the user input is interpolated directly into the SQL string
        cur.execute("""select username from USERS where name = '%s'""" % username)

        output = cur.fetchone()

    return output
      
      





Everything is simple here: we pass, for example, the string "Foobar" as an argument. The string is inserted into the SQL query, resulting in:



select username from USERS where name = 'Foobar'
      
      





As with command injection, if someone passes the ";" character, they will be able to execute several statements. For example, let's pass the string "'; DROP TABLE USERS; --" and get:



select username from USERS where name = ''; DROP TABLE USERS; --'
      
      





This query will drop the entire USERS table. Oops!



Notice the double hyphen at the end of the query. It starts a comment, which neutralizes the trailing "'" character. As a result, the select statement runs with the empty string "''" instead of the username, and then the DROP statement, which is no longer part of the string, is executed.



select username from USERS where name = '';

DROP TABLE USERS;
      
      





SQL query arguments can create a lot of problems if left unchecked. This is where security analysis tools can help a lot. They allow you to find vulnerabilities in the code that developers inadvertently introduced.
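A common fix is to let the database driver bind the value instead of building the string by hand. The sketch below uses the standard sqlite3 module and an in-memory table so that it runs on its own; the table contents are made up for illustration. Django's cursor.execute() accepts parameters in the same way, as a second argument, with %s as the placeholder.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    # The "?" placeholder makes the driver bind username as data,
    # so quotes and semicolons inside it are not parsed as SQL.
    cur = conn.execute("select username from USERS where name = ?", (username,))
    return cur.fetchone()


# Throwaway in-memory database with one made-up user.
conn = sqlite3.connect(":memory:")
conn.execute("create table USERS (username text, name text)")
conn.execute("insert into USERS values ('foobar', 'Foobar')")

print(find_user(conn, "Foobar"))                    # ('foobar',)
print(find_user(conn, "'; DROP TABLE USERS; --"))   # None
print(conn.execute("select count(*) from USERS").fetchone())  # (1,)
```

The malicious string is simply looked up as a name that does not exist, and the USERS table stays intact.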



The assert statement



Do not use the assert statement to protect parts of the codebase that users should not be able to access. A simple example:



def foo(request, user):

    assert user.is_admin, "user does not have access"

    # restricted code goes here
      
      





By default, __debug__ is set to True. In production, however, Python may be run with optimizations enabled (the -O option), which sets __debug__ to False. In that case assert statements are not executed at all, so the check disappears and an attacker reaches the restricted code.
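The effect is easy to reproduce. The sketch below starts a child interpreter twice, once normally and once with the -O option, and runs the same two-line program in each. The snippet and the helper name run_snippet are made up for the demonstration.

```python
import subprocess
import sys

SNIPPET = "assert False, 'user does not have access'\nprint('restricted code reached')"


def run_snippet(*interpreter_flags: str) -> subprocess.CompletedProcess:
    # sys.executable is the interpreter that is running this script.
    return subprocess.run(
        [sys.executable, *interpreter_flags, "-c", SNIPPET],
        capture_output=True,
        encoding="UTF-8",
    )


normal = run_snippet()
optimized = run_snippet("-O")

# Without -O the assert raises AssertionError and print is never reached.
print(normal.returncode, repr(normal.stdout))
# With -O the assert is compiled away, so print runs.
print(optimized.returncode, repr(optimized.stdout))
```

The first run should end with a non-zero exit code and no output, while the second one gets past the check and prints the message.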



Use the assert statement only to communicate implementation details and assumptions to other developers.
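For access control, an ordinary condition that raises an exception is the safer choice, because it does not depend on __debug__. A minimal sketch, in which the User class is a hypothetical stand-in for the real user object:

```python
from dataclasses import dataclass


@dataclass
class User:  # hypothetical stand-in for the real user object
    name: str
    is_admin: bool = False


def foo(request, user):
    # An ordinary if statement is not removed by -O.
    if not user.is_admin:
        raise PermissionError("user does not have access")
    # restricted code goes here
    return "restricted data"
```

Calling foo(None, User("guest")) raises PermissionError regardless of how the interpreter was started.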



Bandit



Bandit is an open source tool written in Python. It helps analyze Python code and find the most common vulnerabilities in it. I talked about some of them in the previous section. Bandit can easily be installed with the pip package manager, either locally or, for example, on a remote virtual machine.



It is installed with a single command:



$ pip install bandit
      
      





Bandit has found applications in two main areas:



  1. DevSecOps: as one of the Continuous Integration (CI) processes.
  2. Development: as part of the local developer toolkit, used to test code for vulnerabilities before committing.


How to use Bandit



Bandit can be easily integrated as part of CI tests, and vulnerability checks can be performed before sending code to production. For example, DevSecOps engineers can launch Bandit whenever a pull request or code commit occurs. 



The results of checking the code for vulnerabilities can be exported to CSV, JSON, and so on.
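As an illustration of the CI use case, here is a sketch of a small gate script that runs Bandit with JSON output and fails the build when findings of a blocking severity are present. The function names, the code/ path and the choice of HIGH as the blocking level are assumptions. The sketch also relies on the JSON report containing a results list whose entries have an issue_severity field, and on the report being written to standard output when no -o flag is given.

```python
import json
import subprocess
import sys
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    # Each entry of "results" describes one finding.
    return Counter(issue["issue_severity"] for issue in report.get("results", []))


def run_bandit(path: str) -> dict:
    # Bandit exits with a non-zero code when it reports issues,
    # so check=True is deliberately not used here.
    proc = subprocess.run(
        ["bandit", "-r", path, "-f", "json"],
        capture_output=True,
        encoding="UTF-8",
    )
    return json.loads(proc.stdout)


def gate(report: dict, blocking=("HIGH",)) -> int:
    # Return 1 (fail the build) if any finding has a blocking severity.
    counts = count_by_severity(report)
    print(dict(counts))
    return 1 if any(counts[level] for level in blocking) else 0


def main() -> None:
    # Assumed CI entry point: scan the code/ directory and set the exit code.
    sys.exit(gate(run_bandit("code/")))
```

In a CI job, calling main() would be the last step of the script. The blocking tuple can be extended to ("MEDIUM", "HIGH") for a stricter policy.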



Many companies restrict or ban the use of certain modules because they are associated with known vulnerabilities. Bandit can show which modules may be used and which are blacklisted: the test configuration for the corresponding vulnerabilities is stored in a special file. It can be created using the config generator (bandit-config-generator):



$ bandit-config-generator -o config.yml
      
      







The generated config.yml file contains configuration blocks for all tests and blacklists. These blocks can be deleted or edited. To specify the test identifiers that should be included in or excluded from the check, use the -t and -s flags:



  • -t TESTS, --tests TESTS, where TESTS is a comma-separated list of test identifiers to be included.



  • -s SKIPS, --skip SKIPS, where SKIPS is a comma-separated list of test identifiers to be excluded.


The easiest way is to run Bandit with the default settings.



$ bandit -r code/ -f csv -o out.csv

[main]  INFO    profile include tests: None

[main]  INFO    profile exclude tests: None

[main]  INFO    cli include tests: None

[main]  INFO    cli exclude tests: None

[main]  INFO    running on Python 3.8.5

434 [0.. 50.. 100.. 150.. 200.. 250.. 300.. 350.. 400.. ]

[csv]   INFO    CSV output written to file: out.csv

      
      





In the command above, the -r flag is followed by the project directory, the -f flag by the output format, and the -o flag by the file to which the results of the check should be written. Bandit checks all Python code inside the project directory and writes the result in CSV format.



After checking, we will receive a lot of information:



[Screenshot: table with the check results, continued in a second screenshot]
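A report like this is easier to review once it is aggregated. The sketch below counts findings per test and severity. It assumes that the CSV report has test_id and issue_severity columns, and the sample rows are made up so that the example runs without a real report.

```python
import csv
import io
from collections import Counter


def summarize(csv_file) -> Counter:
    # Count findings per (test_id, issue_severity) pair.
    reader = csv.DictReader(csv_file)
    return Counter((row["test_id"], row["issue_severity"]) for row in reader)


# Made-up sample with only the columns used above; a real report has more.
sample = io.StringIO(
    "filename,test_id,issue_severity\n"
    "code/nslookup.py,B404,LOW\n"
    "code/nslookup.py,B602,HIGH\n"
)

for (test_id, severity), count in sorted(summarize(sample).items()):
    print(test_id, severity, count)
```

For the real file, the same function can be called as summarize(f) inside a block that starts with open("out.csv", newline="") as f.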



As mentioned in the previous section, importing the subprocess module and passing shell=True to subprocess.check_output pose a serious security risk. If this module and argument cannot be avoided, you can make Bandit skip the corresponding tests by adding the IDs B602 (subprocess_popen_with_shell_equals_true) and B404 (import_subprocess) to the SKIPS list when generating the config file:



$ bandit-config-generator -s B602,B404 -o config.yml
      
      





If we rerun Bandit using the new configuration file, the output will be an empty CSV file. This means that the remaining tests reported no issues:



$ bandit -c code/config.yml -r code/ -f csv -o out2.csv

[main]  INFO    profile include tests: None

[main]  INFO    profile exclude tests: B404,B602

[main]  INFO    cli include tests: None

[main]  INFO    cli exclude tests: None

[main]  INFO    using config: code/config.yml

[main]  INFO    running on Python 3.8.5

434 [0.. 50.. 100.. 150.. 200.. 250.. 300.. 350.. 400.. ]

[csv]   INFO    CSV output written to file: out2.csv
      
      







In team development, each project should have its own configuration file. Developers need to be able to edit it at any time, including locally.



What is more important to you?



This was a short tutorial on the basics of working with Bandit. If your projects use modules you have doubts about, you can check them for vulnerabilities right now. We sometimes do not have time to polish even our own code, and end up sacrificing not only elegant solutions but also security. What are your priorities?





