10 amazingly useful basic Python functions




Anyone who works with Python knows that much of the language's strength comes from its vast ecosystem. You could even argue that Python would not stand out at all were it not for the wonderful packages that extend its built-in functionality.



NumPy is a good example. Core Python can handle matrices, but NumPy improves things many times over. At the same time, the language and its standard library have some great features of their own. By using them, you can reduce the number of dependencies, save time, and simplify development. Let's see what they are.



By the way, Alexey Nekrasov, head of the Python department at MTS and program director of the Python department at Skillbox, added his advice on some of these features. To make it clear where the translation ends and the commentary begins, his remarks are labeled as Alexey's comments.



# 1: lambda



I once wrote an entire article on why lambda makes Python an optimal programming language for statistical computing. Thanks to this feature, mathematical operations can be applied to almost any type of data by evaluating short expressions rather than writing full functions.



It brings a functional style of syntax and methodology to a language that is otherwise built around classes, and such definitions can be introduced at the global level.



All this saves time while writing the program, saves resources, and makes the code more concise. Moreover, lambda can be passed to methods such as apply() (as in pandas) to quickly apply an expression to every subset of your data. For data scientists, and not only for them, this is extremely useful.



The syntax is as follows. We start with the name the lambda expression will be bound to, then write the lambda keyword followed by the variable we would like to supply as a positional argument. After the colon comes the operation that uses this argument:



mean = lambda x: sum(x) / len(x)
      
      





Now we can call it, just like any other function in Python:



x = [5, 10, 15, 20]
print(mean(x))
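
Where lambda tends to fit best is as a short, throwaway argument to another function. The following is a minimal sketch using only built-ins; the `readings` data and all variable names are made up for illustration.

```python
# Hypothetical sample data: (sensor name, measured values)
readings = [("a", [5, 10, 15, 20]), ("b", [1, 2, 3]), ("c", [40, 60])]

# lambda as a sort key: order the sensors by the mean of their values
by_mean = sorted(readings, key=lambda item: sum(item[1]) / len(item[1]))
names_by_mean = [name for name, _ in by_mean]

# lambda with map(): compute the mean of every series
means = list(map(lambda item: sum(item[1]) / len(item[1]), readings))

print(names_by_mean)
print(means)
```

In both calls the lambda is never bound to a name, so it stays a small inline expression.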
      
      





Alexey's comment:



Be careful with lambda so as not to impair code readability. Here are a couple of tips:

From PEP 8: always use a def statement instead of an assignment statement that binds a lambda expression directly to an identifier:



Correct:



def f(x): return 2 * x
      
      





Wrong:



f = lambda x: 2 * x
      
      





If a lambda expression is longer than about 40 characters, you have most likely packed too much logic into one line and made it unreadable. It is better to move that logic into a separate function.


# 2: shutil



The shutil module is one of the most underrated tools in the Python arsenal. It is part of the standard library and can be imported like any other module:



import shutil
      
      





What does shutil do? In essence, it is Python's high-level interface to your operating system's filesystem. Such operations are often done with the os module, but don't forget about shutil. You have probably had to move a file from one directory to another in a script and done a lot of tedious work for it, right?



shutil solves these classic file-handling problems at a high level. This saves time and speeds up work with files. Here are some examples of the high-level calls shutil provides.



import shutil
shutil.copyfile('mydatabase.db', 'archive.db')
shutil.move('/src/High.py', '/packages/High')
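
The paths above depend on your machine. As a self-contained sketch, the same two calls can be tried inside a temporary directory; the file and directory names here are only placeholders.

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so nothing real is touched
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mydatabase.db")
    with open(src, "w") as f:
        f.write("demo data")

    # copyfile copies the file contents to a new path
    backup = os.path.join(tmp, "archive.db")
    shutil.copyfile(src, backup)

    # When the destination is an existing directory, move puts the file inside it
    packages = os.path.join(tmp, "packages")
    os.makedirs(packages)
    shutil.move(backup, packages)

    files_at_top = sorted(os.listdir(tmp))
    files_in_packages = os.listdir(packages)

print(files_at_top)
print(files_in_packages)
```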
      
      





# 3: glob



glob may not be as impressive as shutil, and it is nowhere near lambda in usefulness, but in some cases it is irreplaceable. The module searches directories using wildcard patterns, so it can be used to gather information about the files on your PC and their extensions. It is imported without any fuss:



import glob
      
      





I'm not sure whether this module offers much more, but glob() is what you need for file lookups. The search uses Unix shell-style syntax, i.e. wildcards such as * and path separators such as /.



glob.glob('*.ipynb')
      
      





This call returns a list of all file names matching the given pattern. The function can be used both for data aggregation and simply for working with files.
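
As a small sketch of how the patterns behave, the example below creates a few empty files in a temporary directory (all names are invented) and compares a plain `*` pattern with a recursive `**` pattern.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files, one of them in a subdirectory
    os.makedirs(os.path.join(tmp, "old"))
    for name in ("a.ipynb", "b.ipynb", "notes.txt", os.path.join("old", "c.ipynb")):
        open(os.path.join(tmp, name), "w").close()

    # '*' matches names within a single directory
    top_level = glob.glob(os.path.join(tmp, "*.ipynb"))

    # '**' with recursive=True also descends into subdirectories
    everywhere = glob.glob(os.path.join(tmp, "**", "*.ipynb"), recursive=True)

    top_names = sorted(os.path.basename(p) for p in top_level)
    all_names = sorted(os.path.basename(p) for p in everywhere)

print(top_names)
print(all_names)
```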



# 4: argparse



This module provides a robust and thorough way to parse command-line arguments. Many development tools rely on this concept, and you can work with all of them from the Unix command line. A great example is Gunicorn, a Python tool that processes the command-line arguments passed to it. To start working with the module, import it:



import argparse
      
      





Then we create a new object, the argument parser:



parser = argparse.ArgumentParser(prog='top',
                                 description='Show top lines from the file')
      
      





Now we add arguments to the parser. In this case, we create an option that sets the number of lines to output for each file:



parser.add_argument('filenames', nargs='+')  # files to read; needed for the call shown further on
parser.add_argument('-l', '--lines', type=int, default=10)
      
      





Two keyword arguments are used here: type sets the data type the value is converted to, and default supplies a value when the script is called without this argument. We can now get the arguments by calling the parse_args() method on our new parser:



args = parser.parse_args()
      
      





We can now run this Python file and easily pass the required options from Bash:



python top.py --lines=5 examplefile.txt
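
The command above passes a file name as well as --lines, so the parser needs a positional argument to accept it. The sketch below puts the pieces together; the `filenames` argument and the `build_parser` helper are assumptions added for illustration. Passing a list to parse_args() simulates the command line, which makes the parser easy to try without a shell.

```python
import argparse


def build_parser():
    """Hypothetical helper that assembles the parser for the 'top' script."""
    parser = argparse.ArgumentParser(
        prog="top", description="Show top lines from each of the files"
    )
    # Positional argument (an assumption): one or more file names
    parser.add_argument("filenames", nargs="+")
    parser.add_argument("-l", "--lines", type=int, default=10)
    return parser


# Equivalent to: python top.py --lines=5 examplefile.txt
args = build_parser().parse_args(["--lines=5", "examplefile.txt"])
print(args.lines, args.filenames)

# Without --lines the default value is used
defaults = build_parser().parse_args(["examplefile.txt"])
print(defaults.lines)
```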
      
      





Needless to say, this comes in handy. I have used this module a lot when working with crontab, which can run scripts at specific times. It is just as useful for scripts started by process supervisors, which run commands as workers without user intervention.



# 5: re



Another highly underrated module. The re module parses strings using regular expressions and provides more options for working with strings in Python. How many times have you built your logic around string methods such as str.split()? Stop putting up with it! For anything beyond trivial splitting, regular expressions are often simpler and easier to use.



import re
      
      





The re module, unlike some of the others on this list, provides not one but many extremely useful functions. They are especially relevant for working with large amounts of data, which is important for data scientists. Two examples to start with are the sub() and findall() functions.



import re

# Find every word that starts with "f"
print(re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest'))
# ['foot', 'fell', 'fastest']

# Collapse a doubled word into a single one
print(re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat'))
# cat in the hat
      
      





Alexey's comment:



A few tips on working with regex:



  • Use re.compile. re.compile() turns a regex into a reusable pattern object.
  • A pattern created with re.compile can be reused wherever that regex is needed.
  • Use re.VERBOSE. Passing the re.VERBOSE flag to re.compile() (or to a matching function) lets you spread a regex over several lines and comment it.


Compare a compact pattern:



pattern = r'^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$'
re.search(pattern, 'MDLV')



with the same pattern written in verbose form:



pattern = r'''
    ^                   # beginning of string
    M{0,3}              # thousands - 0 to 3 Ms
    (CM|CD|D?C{0,3})    # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 Cs),
                        #            or 500-800 (D, followed by 0 to 3 Cs)
    (XC|XL|L?X{0,3})    # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 Xs),
                        #        or 50-80 (L, followed by 0 to 3 Xs)
    (IX|IV|V?I{0,3})    # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 Is),
                        #        or 5-8 (V, followed by 0 to 3 Is)
    $                   # end of string
    '''
re.search(pattern, 'M', re.VERBOSE)



  • Use Python raw strings for regex.
  • Use named capture groups, (?P<name>...), for all capture groups if there is more than one (it is better to use them even when there is only one).

regex101.com is a great site for debugging and checking regex.
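
These tips can be combined in one short sketch: a compiled, verbose, raw-string pattern with named groups. The date format and the sample sentence are invented for illustration.

```python
import re

# Compile once and reuse; re.VERBOSE allows whitespace and comments inside
DATE = re.compile(
    r"""
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
    """,
    re.VERBOSE,
)

match = DATE.search("released on 2021-03-14, patched later")
parts = match.groupdict()

print(match.group("year"), match.group("month"), match.group("day"))
print(parts)
```

Named groups let the calling code ask for `year` or `day` directly instead of counting parentheses.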



# 6: math



This is not the greatest module in history, but it is often useful. The math module gives you access to everything from sin and cos to logarithms. All of this is extremely important when working with algorithms.



import math
      
      





The module can certainly save time by making math operations available without dependencies. In this example I'll demonstrate the log() function, but if you dig deeper into the module, a whole world opens up.



import math
print(math.log(1024, 2))  # logarithm of 1024 to base 2
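
To give a taste of what else is in the module, here is a brief sketch with a few other commonly used functions; the input values are arbitrary.

```python
import math

root = math.sqrt(16)          # square root
diagonal = math.hypot(3, 4)   # Euclidean length, sqrt(3**2 + 4**2)
bits = math.log2(1024)        # dedicated base-2 logarithm
decades = math.log10(1000)    # dedicated base-10 logarithm

# sin(pi) is not exactly 0 in floating point, so compare with a tolerance
sin_pi_is_zero = math.isclose(math.sin(math.pi), 0.0, abs_tol=1e-9)

print(root, diagonal, bits, decades, sin_pi_is_zero)
```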
      
      





# 7: statistics



Another module that is extremely useful for statistical calculations. It gives access to basic statistics: not as deep as SciPy, but often enough for data analysis. The module is commonly aliased as st, sometimes stc or sts. But careful: not scs, which is an alias for scipy.stats.



import statistics as st
      
      





This module provides many useful features that are worth a look. The great thing about this package is that it doesn't have any dependencies. Let's take a look at some basic general-purpose statistical operations:



import statistics as st

data = [2.5, 3.25, 5.5, 11.25, 11.75]  # sample values for illustration
st.mean(data)
st.median(data)
st.variance(data)
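
One detail worth knowing is that the module separates sample statistics from population statistics. A short sketch on a made-up data set:

```python
import statistics as st

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

most_common = st.mode(data)          # most frequent value
pop_variance = st.pvariance(data)    # population variance (divides by n)
pop_stdev = st.pstdev(data)          # population standard deviation
sample_variance = st.variance(data)  # sample variance (divides by n - 1)

print(most_common, pop_variance, pop_stdev, sample_variance)
```

The sample variance comes out larger than the population variance here because it divides the same sum of squared deviations by n - 1 instead of n.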
      
      





# 8: urllib



While many of the other modules on this list are not well known, urllib is an exception. Let's import it!



import urllib
      
      





A third-party library such as Requests can be used instead, as it offers more features. But for most basic tasks the standard library is enough, which spares you a dependency. If you need additional features, it is worth looking at something else, but for a plain HTTP request urllib will do the job.



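from urllib.request import urlopen

# 'http://example_url/' is a placeholder; substitute a real URL
with urlopen('http://example_url/') as response:
    data = response.read()  # read the body as bytes before the connection closes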
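
Making a request needs a network connection, but the package also includes urllib.parse for building and taking apart URLs, which can be tried offline. A minimal sketch; the host and query parameters are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Build a query string from a dict; spaces are encoded automatically
query = urlencode({"q": "python stdlib", "page": 2})
url = "http://example.com/search?" + query
print(url)

# Split the URL back into its components
parts = urlparse(url)
params = parse_qs(parts.query)

print(parts.scheme, parts.netloc, parts.path)
print(params)
```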
      
      





The urllib module is something I highly recommend learning more about.



# 9: datetime



Another great example of a tool that is common in scientific computing is the date and time type. Data very often carries timestamps. Sometimes they even serve as a predictive feature used to train a model. This module is often imported with the dt alias:



import datetime as dt
      
      





We can now create date and time objects and work with them through familiar attributes such as year, month, and day. This is incredibly useful for reformatting, analyzing, and working with specific parts of the dates in your data. Let's take a look at some of the main features of this package:



import datetime as dt
now = dt.date.today()
print(now.year)
print(now.month)
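
Beyond reading attributes, the module can parse timestamps, do date arithmetic, and format the result. A short sketch; the timestamp and format strings are illustrative only.

```python
import datetime as dt

# Parse a timestamp string into a datetime object
stamp = dt.datetime.strptime("2021-03-14 09:30:00", "%Y-%m-%d %H:%M:%S")
print(stamp.year, stamp.month, stamp.day)

# Date arithmetic with timedelta
week_later = stamp + dt.timedelta(days=7)
print(week_later.date())

# The difference between two moments is a timedelta as well
elapsed = week_later - stamp
print(elapsed.days)

# Format back into text
formatted = week_later.strftime("%d.%m.%Y %H:%M")
print(formatted)
```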
      
      





# 10: zlib



The last entry on this list is the zlib module. It is a general-purpose data compression solution for Python. The module is extremely useful when working with packages.



import zlib
      
      





The most important functions here are compress() and decompress().



import zlib

# zlib works on bytes, so the text is written as a bytes literal
h = b" Hello, it is me, your friend Emmett!"
print(len(h))
t = zlib.compress(h)
print(len(t))
z = zlib.decompress(t)
print(len(z))
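
Very short inputs may not shrink at all, since the compressed stream carries some overhead. The effect is easier to see on longer, repetitive data, as in this sketch (the repeated sentence is just sample input).

```python
import zlib

# Repetitive data compresses well
text = b"Hello, it is me, your friend Emmett! " * 50

packed = zlib.compress(text, 9)    # 9 is the highest compression level
restored = zlib.decompress(packed)

print(len(text), len(packed))
print(restored == text)
```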
      
      





In conclusion, programming in Python sometimes seems difficult because of the large number of dependencies, and the standard library lets you get rid of part of that problem. In addition, standard Python tools can save time, reduce the amount of code, and make it more readable.


