Python Basics III

Functions

For small tasks, chaining commands one after another in a script can be a viable solution. Often, however, the problem at hand consists of multiple steps and/or needs to be repeated multiple times, for example for different directories. The code then gets messy pretty fast, which, according to rule one of The Zen of Python, is something we want to avoid.

To achieve this we can use functions, which:

  • encapsulate small snippets of code
  • can be reused any number of times
  • make our code more modular, less repetitive and easier to maintain
def happy_birthday():
    print("Happy birthday Tom!")
    
happy_birthday()
Happy birthday Tom!

This does not save much code yet, because if we wanted to congratulate Sally we would need to write another, very similar function. To make the function more generic we can use parameters.

def happy_birthday(name):
    print("Happy birthday {}!".format(name))
    
happy_birthday(name="Tom")
Happy birthday Tom!

Functions can have any number of parameters. Parameters can also have a default value assigned to them, which makes them optional in the function call. Additionally, a return value can be specified.

def x_power_y(x, y=4):
    return x**y

print(x_power_y(4, 6))
print(x_power_y(4))
4096
256

Documentation

This is still a simple example where it is fairly clear what is done. In general, however, it is important to document functions (and all other code structures, for that matter). Normal inline comments, which you already know, can be used for this, but there are also special “comments” called docstrings. A docstring is inserted as the first statement in a function and is a standardized way of documenting it.

def x_power_y(x, y=4):
    """
    Calculate x to the power of y.
    
    Parameters
    ----------
    x : number
        The base
    y : number, optional
        The exponent, defaults to 4
        
    Returns
    -------
    number
    
    """
    return x**y

The first line should contain a short one line description. Following a blank line a more detailed description of what the function is doing can be given (omitted in this example).

This example uses the so-called numpydoc style, which has the advantage of being very readable. Another popular style is the Google style. Which one to use is up to you; just make sure you stick with your choice.
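For comparison, here is a sketch of how the same function might be documented in the Google style. The section names (Args, Returns) follow the common Google convention; the exact wording of the descriptions is only an illustration.

```python
def x_power_y(x, y=4):
    """Calculate x to the power of y.

    Args:
        x (number): The base.
        y (number, optional): The exponent. Defaults to 4.

    Returns:
        number: x raised to the power of y.
    """
    return x**y


# The docstring is stored on the function and is what help() displays
print(x_power_y.__doc__)
```

Both styles carry the same information, so the choice is mostly a matter of taste and of what your tools expect.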

Documenting can be a lot of work but the advantages pay off quickly:

  • you can remember what a certain piece of code does when you want to use it again sometime in the future
  • other people who use your code understand what it does, which input is needed and what is returned

So it is a good habit to document code right when it is written, even though it can be tempting to do it later.

Built in functions

Python has many built in functions which solve common problems and are usually faster than our own implementations would be. Often the purpose of these functions is clear at first sight. For example min(), max() or sum() are pretty self-explanatory, but there are also some more complex functions.

Task

  1. Go to the official Python documentation or use a search engine of your choice and have a look at the following built in functions:
    • set
    • enumerate
    • map

    Find out what they do and try them out.

  2. Rewrite the following loop using the map function.
    numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
    for i, num in enumerate(numbers):
         numbers[i] = numbers[i]**2 + 5
    

Solution

  1. Built in functions
    • set returns a collection of distinct elements (e.g. removing duplicates from a list)
    • enumerate returns an iterator of tuples, each holding a count and the corresponding value of an iterable such as a list. For example:
      fruits = ["Banana", "Strawberry", "Peach"]
      list(enumerate(fruits))
      

      returns:

      [(0, 'Banana'), (1, 'Strawberry'), (2, 'Peach')]
      
    • map applies a function to every element of an iterable (for example a list)
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

def quadratic(x):
    return x**2 + 5
    
# map returns an iterator, so list() is needed to collect the results
numbers = list(map(quadratic, numbers))
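One detail is easy to miss when trying out map: in Python 3 it does not compute anything until the result is consumed. The following sketch uses a shorter list than the task and also shows a list comprehension, which is a common alternative that is not required by the task.

```python
numbers = [1, 2, 3, 4, 5]

def quadratic(x):
    return x**2 + 5

# map() is lazy: it returns an iterator, not a list
lazy_result = map(quadratic, numbers)

# list() consumes the iterator and collects the values
via_map = list(lazy_result)

# A list comprehension produces the same values
via_comprehension = [quadratic(n) for n in numbers]

print(via_map)            # [6, 9, 14, 21, 30]
print(via_comprehension)  # [6, 9, 14, 21, 30]
```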

Variable scope - global vs. local

Let’s say we define the following:

a = "Foo"

def very_usefull_function():
    print(a)

What will happen if we call very_usefull_function()?

very_usefull_function()
Foo

Now consider the following change to the function and think about what will happen when we first call the function and then print out a again afterwards.

a = "Foo"

def very_usefull_function():
    a = "Bar"
    print(a)
very_usefull_function()
print(a)
Bar
Foo

This shows the difference between global and local variables.

While we can read global variables within functions, assigning to a name inside a function does not change the global variable by default. Instead, the assignment creates a local variable with the same name, which takes precedence over the global variable and exists only in the scope of this function. In general it is possible to change the value of a global variable from inside a function. However, accessing a global variable from inside a function is considered bad practice, and changing it even more so, which is why we ignore that possibility altogether.
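A common way to avoid relying on global variables is to pass values in as parameters and hand results back with return. The names counter and increment below are made up for this sketch.

```python
counter = 0  # global variable

def increment(value):
    # 'value' is a local name; the global 'counter' is not touched here
    value = value + 1
    return value

print(increment(counter))  # 1
print(counter)             # 0, the global is unchanged

# To update the global, assign the return value explicitly
counter = increment(counter)
print(counter)             # 1
```

This keeps it visible at the call site where a variable changes, instead of hiding the change inside the function.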

Constants

In other programming languages it is often possible to define special variables that can not be changed and can therefore be used for constants. In Python this is not possible, so a widely used convention is to mark constants with all upper case variable names.

SPEED_OF_LIGHT = 1080000000  # km/h (rounded)

Python modules

When you write larger scripts, it makes sense to split your code into multiple files so that you do not lose track of all your functions. You may also want to use a handy function that you’ve written in several programs without copying its definition into each program.

To support this, Python has a way to put definitions in one file and use them in another script. Such a file is called a module. Definitions from one module can be imported into other modules or into the main module.

A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended.

The import system

Let’s say we defined the following function (to print the Fibonacci numbers) in the file “fibo.py”:

def fibonacci(n):
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()

Now we want to use this function in another script. To import the definitions from a module into a script, you can use the import statement:

import fibo # Import the fibo-module (fibo.py)

fibo.fibonacci(90) # Use the fibonacci function from the fibo module
0 1 1 2 3 5 8 13 21 34 55 89 

You can also assign an alias to the imported module via:

import fibo as fb

fb.fibonacci(90)
0 1 1 2 3 5 8 13 21 34 55 89 

This is common practice, as module names are sometimes long and programmers like short names in their code.

Sometimes you will see something like

from fibo import *

which would import all functions (and any other public names) from the fibo module. This is considered bad practice because it is not concrete and specific (it violates rule 2 of The Zen of Python ;-) ) and should be avoided.
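The explicit alternative to the star import is to name exactly what you need. The sketch below uses the math module from the standard library instead of fibo so that it runs without any extra file.

```python
import math                # names are accessed via the module: math.sqrt
from math import pi, sqrt  # only the listed names are imported

print(math.sqrt(16))  # 4.0
print(sqrt(16))       # 4.0
print(pi)
```

With either form a reader can tell where sqrt comes from, which is not the case after a star import.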

There are many modules with useful functions out there which can be installed and then used in the same way as our own modules shown above. We will get to know some of the most common ones in later chapters of the course.

Task

  1. Write a function add5(x) that adds 5 to the given number x and returns the result. Save this function in the file “mathe.py”.
  2. Start an interactive python shell in the same directory where you saved the “mathe.py” file.
  3. Import the function add5(x), call it with some random arguments and print the results.
  4. Try out “import this”.

Solution

  1. def add5(x):
     return x + 5
    
  2. Start the shell

  3. from mathe import add5
    res = add5(10)
    print(res)
    
  4. import this
    

Working with files

Sometimes we need to access files on disk in our scripts. This can be done using the open(filename, access_mode) function. The access mode is one of:

  • r: read only
  • w: write only (an existing file is overwritten)
  • a: appending to a file
  • r+: special mode to handle reading and writing
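The difference between the w and a modes can be sketched as follows. The example works in a temporary directory so that no real files are touched, and the file name modes_demo.txt is made up for illustration. The with statement used here makes sure the file is closed again after each step.

```python
import tempfile
from pathlib import Path

# Work in a temporary directory that is removed afterwards
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes_demo.txt"  # hypothetical file name

    with open(path, "w") as file:  # "w" creates the file or empties it
        file.write("first\n")

    with open(path, "w") as file:  # opening with "w" again discards "first"
        file.write("second\n")

    with open(path, "a") as file:  # "a" keeps the content and adds to the end
        file.write("third\n")

    with open(path, "r") as file:
        content = file.read()

print(content)
```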

Reading and writing text files

The simplest files we can access are text files, and there are some convenient methods which help us write to them:

  • write(): writes a string to the file (no line ending is added automatically)
  • writelines(): given a list of strings, writes each item to the file (again without adding line endings)

Using the with syntax we don’t have to worry about closing the file connection once we are done.

#Writing a shopping list to a text file
with open("shopping_list.txt", "w") as file:
    file.write("Milk\n")
    file.write("Cereal\n")
    file.write("Pizza\n")

Similarly there are methods to read from a file:

  • read(n): if n is given reads up to n characters from the current position, otherwise reads the rest of the file
  • readline(n): reads the next line from the file; if n is given reads at most n characters of that line
  • readlines(): reads the entire file line by line and returns the lines as a list
#Reading our list
with open("shopping_list.txt", "r") as file:
    lines = file.readlines()
    
print(lines)
['Milk\n', 'Cereal\n', 'Pizza\n']
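How read and readline move through a file can be sketched like this. The example recreates the shopping list in a temporary directory (an assumption made so that the sketch is self-contained) and then reads it piece by piece.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "shopping_list.txt"
    with open(path, "w") as file:
        file.writelines(["Milk\n", "Cereal\n", "Pizza\n"])

    with open(path, "r") as file:
        first_two = file.read(2)        # the first two characters
        rest_of_line = file.readline()  # the rest of the current line
        next_line = file.readline()     # the complete next line
        remaining = file.read()         # everything that is left

print(repr(first_two))     # 'Mi'
print(repr(rest_of_line))  # 'lk\n'
print(repr(next_line))     # 'Cereal\n'
print(repr(remaining))     # 'Pizza\n'
```

Each call continues where the previous one stopped, so the order of the calls matters.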

Task

Notice the “\n” at the end of each line.

  1. Find out what the “\n” is all about (Hint: Try the code without it)
  2. Find out how to get rid of the “\n” when reading the file again.

File paths and directories

Beyond writing to text files, other interactions with the filesystem are often needed, such as creating or deleting files and directories, or simply building system-compatible paths. All of this and more can be achieved with the Path class of the pathlib module, which has been part of the standard library since Python 3.4.

Create paths

Instead of creating paths by concatenating strings with the + operator, this can conveniently be done like this:

from pathlib import Path # first import the Path class from the module

# create the path object with an initial path. If no path is provided the current directory is used
basedir = Path("user/data")
print(basedir)
# create path to a sub directory with the "/" operator
# instead of a string the second argument could also be a Path object
subdir = basedir / "subdirectory"
print(subdir)
user/data
user/data/subdirectory

Check existence of / create / delete directories

# check if path exists
dir_exists = subdir.exists()
print("Directory exists: {}".format(dir_exists))
# create directory. If parents is True intermediate directories are created if they don't exist.
# exist_ok = True suppresses the error if the directory already exists.
subdir.mkdir(parents = True, exist_ok = True)
print("Directory exists after creation: {}".format(subdir.exists()))
# delete directory. Directory must be empty.
subdir.rmdir()
print("Directory exists after deletion: {}".format(subdir.exists()))
Directory exists: False
Directory exists after creation: True
Directory exists after deletion: False

List files and directories

img_dir = Path("images")

print("List files and directories")
flist = list(img_dir.glob("*"))
print(flist)

print("Recursive png file list:")
flist = list(img_dir.rglob("*.png"))
print(flist)
List files and directories
[PosixPath('images/S01E03'), PosixPath('images/S01E04'), PosixPath('images/S01E02')]
Recursive png file list:
[PosixPath('images/S01E03/string_explanation.png'), PosixPath('images/S01E04/beispiel_output.png'), PosixPath('images/S01E04/csv_example.png'), PosixPath('images/S01E02/python_interpreter.png')]

Other useful things

# is it a file or directory?
img_dir.is_file()
False
img_dir.is_dir()
True
# file/directory name
flist[0].name
'string_explanation.png'
# file suffix
flist[0].suffix
'.png'
# parent directory
flist[0].parent
PosixPath('images/S01E03')

Info

  • always use “/” in path strings. pathlib will convert those to “\” on Windows
  • be careful when deleting: you will not be asked whether you really want to delete
  • in older scripts you might encounter os.path functions because pathlib has only been available since Python 3.4 (a comparison is given in the documentation linked above)
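To help recognise the older style, here is a small side-by-side sketch. The path user/data/report.txt is made up for illustration and does not need to exist, because both variants only manipulate the path itself.

```python
import os.path
from pathlib import Path

# Older style: os.path functions operate on plain strings
old_style = os.path.join("user", "data", "report.txt")
print(os.path.basename(old_style))     # report.txt
print(os.path.splitext(old_style)[1])  # .txt

# pathlib style: the same information via a Path object
new_style = Path("user") / "data" / "report.txt"
print(new_style.name)    # report.txt
print(new_style.suffix)  # .txt
```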

Classes and objects

In the examples on the previous slides we used functions with a weird “dot”-notation. In addition, it looks like sometimes you need parentheses when calling a function and sometimes you don’t.

This has to do with object-oriented programming, a paradigm that is tightly coupled with Python.

While we won’t go into the details, it is important to understand the concept and terminology. It will help you grasp what’s going on in the above examples and makes it easier to understand explanations and solutions you might find online.

The basic idea of object-oriented programming is to group data and the functions which can interact with that data into an object.

This programming style can make it easier to:

  • structure big and complex software
  • write reusable code
  • maintain code

Two important concepts are:

  • classes which are like a blueprint
  • objects which are instances, the actual realisation, of the corresponding class

(adapted after https://www.slideshare.net/ArslanArshad9/object-oriented-programming-with-python slide 3)

Connecting the dots

So what has that to do with the “dot”-notation in the examples?
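One way to read the dot notation: the name before the dot is an object, and what follows the dot is either a piece of data stored on it (an attribute, no parentheses) or a function that belongs to it (a method, called with parentheses). This is why flist[0].name was written without parentheses while subdir.exists() needed them. The Dog class below is a made-up minimal sketch of this idea and not part of the course material.

```python
class Dog:
    """Blueprint (class) for dog objects."""

    def __init__(self, name):
        # Data stored on the object is called an attribute
        self.name = name

    def bark(self):
        # A function that belongs to the object is called a method
        return "{} says woof!".format(self.name)


# Create an object (instance) from the class
rex = Dog("Rex")

print(rex.name)    # attribute access: no parentheses
print(rex.bark())  # method call: parentheses needed
```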

Errors and Exceptions

When an error occurs in a Python script, it will stop and generate an error message. This happens, e.g., when you try to read a non-existing or corrupt file:

with open("non_existing_file.txt") as file:
    content = file.read()

    ---------------------------------------------------------------------------

    FileNotFoundError                         Traceback (most recent call last)

    <ipython-input-15-b91d3e0c2f63> in <module>
    ----> 1 with open("non_existing_file.txt") as file:
          2     content = file.read()


    FileNotFoundError: [Errno 2] No such file or directory: 'non_existing_file.txt'


These errors (or “exceptions”) can be handled using a try-except block:

try:
    ...  # code which possibly gives an error
except:
    ...  # code to execute in case of error

If an exception occurs and the except block has run, the script does not stop but continues to run.

Note:

While the length of the try block is not limited, execution of the block stops as soon as an error is encountered (i.e. any code in the block after the line with the error is skipped)!

Some best practice rules to keep in mind when writing try-except blocks are:

  • always handle specific errors => don’t use a bare except or catch Exception, because this hides all errors, including unexpected ones (most of the time you know what might break, so catch it specifically)
  • give the user the stack trace in addition to custom text (for example with the built in traceback module), because it contains information about where the exception occurred
  • if different errors might come up you can have multiple except blocks to handle them differently

Applied to the example above, the refactored code might look like this:

import traceback

filename = "non_existing_file.txt"
try:
    with open(filename, "r") as file:
        content = file.read()
except FileNotFoundError as e:
    print(e) # Print the error
    print("Make sure \"{}\" exists".format(filename)) # Custom message to user
    traceback.print_exc() # give the user the full stacktrace
    
print("Script continues after catching exception instead of exiting!")
[Errno 2] No such file or directory: 'non_existing_file.txt'
Make sure "non_existing_file.txt" exists
Script continues after catching exception instead of exiting!
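The last rule from the list, using several except blocks, can be sketched as follows. The function read_number and the file names are made up for this example, and the files are created in a temporary directory so that the sketch is self-contained.

```python
import tempfile
from pathlib import Path


def read_number(filename):
    """Read the first line of a file and convert it to an integer."""
    try:
        with open(filename, "r") as file:
            return int(file.readline())
    except FileNotFoundError as e:
        # The file is missing
        print("Make sure \"{}\" exists: {}".format(filename, e))
    except ValueError as e:
        # The file exists but its first line is not a whole number
        print("Could not convert the first line to a number: {}".format(e))
    return None


with tempfile.TemporaryDirectory() as tmp:
    missing = Path(tmp) / "missing.txt"    # never created
    not_a_number = Path(tmp) / "text.txt"
    not_a_number.write_text("Milk\n")
    good = Path(tmp) / "number.txt"
    good.write_text("42\n")

    results = [read_number(missing), read_number(not_a_number), read_number(good)]

print(results)  # [None, None, 42]
```

Each kind of error gets its own message, and anything unexpected is still raised as usual.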

Task

  1. Have a look at Python’s built in exceptions in the official documentation and find the one you can use in the next task.
  2. Re-write your add5(x)-function so that it does not kill the program when a wrong argument is passed (e.g. a string like “zehn”) and inform the caller of the function about a raised exception so that she/he can react accordingly.

Solution

  1. Since a value of 5 can not be added to a string, the appropriate exception type is TypeError.
  2. import traceback
    def add5(x):
     try:
         return x + 5
     except TypeError as e:
         print("The input to the function has the wrong data type.")
         print(e)
         traceback.print_exc()
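Printing inside the function informs whoever reads the output, but the calling code only receives None. If the caller should be able to react programmatically, one possible variation (a sketch, not the only valid solution) is to raise a TypeError with a clearer message and let the caller catch it.

```python
def add5(x):
    """Add 5 to x, raising a TypeError with a clear message for bad input."""
    try:
        return x + 5
    except TypeError as e:
        # Re-raise with a more helpful message; 'from e' keeps the original cause
        raise TypeError("add5() needs a number, got {!r}".format(x)) from e


try:
    add5("zehn")
except TypeError as e:
    # The caller decides how to react; here it just prints and carries on
    print("Caller handled the error: {}".format(e))

print(add5(10))  # 15
```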
    

Exercise 3