Python Basics II

Sequential datatypes

Strings

Often, you don’t want to work with single values but with a whole collection of them. For this, Python offers sequential datatypes. You have already met one of them, the string:

this_is_a_string = "I am a string. But I am also a sequence of characters."

A string consists of a sequence of characters. Individual characters in a string can be accessed by specifying the string name followed by a number in square brackets ([ ]).

zeichenkette = "Remote Sensing"
zeichenkette[5]
'e'

You can also use negative indexes. Negative indexes start at the last character and go backwards:

zeichenkette[-5]
'n'

You can also select multiple characters at once, which is called ‘slicing’. The start index is included in the result, the stop index is not:

zeichenkette[7:14]
'Sensing'
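
As a small sketch of further slicing variants (the variable names are chosen only for this illustration): start and stop can be omitted, and an optional third value sets the step size.

```python
text = "Remote Sensing"

first_word = text[:6]       # start omitted: the slice begins at index 0
second_word = text[7:]      # stop omitted: the slice runs to the end
every_second = text[::2]    # step of 2: every second character
reversed_text = text[::-1]  # negative step: walks through the string backwards

print(first_word)
print(second_word)
print(every_second)
print(reversed_text)
```

Slicing never changes the original string; each slice is a new string.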

You can check the length of a sequential datatype using the len() function. Note that the length of an object is always last index + 1 because in Python (contrary to R) indexes in sequential datatype objects start with 0.

len(zeichenkette)
14

If you specify an index that is out of the range of the sequential datatype object, an IndexError is raised:

zeichenkette[14]

    ---------------------------------------------------------------------------

    IndexError                                Traceback (most recent call last)

    <ipython-input-5-be7520983767> in <module>()
    ----> 1 zeichenkette[14]
    

    IndexError: string index out of range


Strings in Python offer many methods, e.g.:

zeichenkette.upper() # Convert all letters in the string to uppercase
'REMOTE SENSING'
zeichenkette.count("e") # Count the occurrences of "e" in the string
3

You can find all available functions in the Python docs: https://docs.python.org/3.7/library/stdtypes.html#string-methods
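
A few more string methods, shown as a minimal sketch (the example text and variable names are only illustrative). Strings are immutable, so each method returns a new object and leaves the original untouched.

```python
text = "Remote Sensing"

lower_text = text.lower()                      # all letters in lowercase
replaced = text.replace("Remote", "Proximal")  # replace a substring
position = text.find("Sens")                   # index of the first match, -1 if not found
words = text.split()                           # split at whitespace into a list
starts_with_remote = text.startswith("Remote") # True or False

print(lower_text)
print(replaced)
print(position)
print(words)
print(starts_with_remote)
print(text)  # the original string is unchanged
```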

Task

  1. Assign your name (first+last) as a string to a variable.
  2. Create a new variable and assign it only your first name using slicing.
  3. How many times does the letter “e” occur in your full name? Use a built-in Python string method for this task.

Solution

  1. name = "Michael Jackson"
    
  2. vorname = name[:7]
    
  3. name.count("e")
    

Lists

Lists in Python are mutable sequences that store multiple objects:

my_list = [1,2,3,4,5]  # The easiest way to create lists is using the [ ] symbols
print(my_list)
[1, 2, 3, 4, 5]

All objects in a list usually share the same datatype, but this is not required:

my_list = [90,3.1415,"I am a string"] # List with an int, float and a string
print(my_list)
[90, 3.1415, 'I am a string']

To access an element in a list, you can use its index (slicing also works with lists):

my_list[2]
'I am a string'
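
Negative indexes and slices behave the same way as for strings. A minimal sketch (the list name is only illustrative):

```python
mixed_list = [90, 3.1415, "I am a string"]

last_element = mixed_list[-1]  # negative index: counts from the end
first_two = mixed_list[:2]     # slice: a new list with the elements at index 0 and 1

print(last_element)
print(first_two)
```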

Lists also have many built-in methods, e.g. index(), which returns the first index at which a value occurs:

zahlenreihe = [8,2,3,4,1,2,3,9,6,7,8,0,4]
zahlenreihe.index(4)
3

Lists can be sorted in place, here in descending order:

zahlenreihe.sort(reverse=True)
zahlenreihe
[9, 8, 8, 7, 6, 4, 4, 3, 3, 2, 2, 1, 0]
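
Note that sort() changes the list itself and returns None. If you want to keep the original order, the built-in sorted() function returns a new sorted list instead. A short sketch with a made-up list:

```python
numbers = [8, 2, 3, 4, 1]

ascending = sorted(numbers)                    # new list, 'numbers' stays as it is
original_kept = numbers == [8, 2, 3, 4, 1]     # True: the original order is untouched

return_value = numbers.sort()                  # sorts in place and returns None

print(ascending)
print(original_kept)
print(return_value)
print(numbers)
```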

And similar to strings, lists also have a length:

len(zahlenreihe)
13

You can append elements to a list and remove or replace existing ones:

zahlenreihe = [8,2,3,4,1,2,3,9,6,7,8,0,4]


#zahlenreihe.append(0)  # Appends the given element to the end of a list
#print(zahlenreihe)
zahlenreihe.remove(3)  # Removes the first element with the given value
print(zahlenreihe)
zahlenreihe.pop(-1)    # Removes (and returns) the element at the given index; -1 is the last element
print(zahlenreihe)
zahlenreihe[3] = 999   # Replaces the element at index 3 with 999
print(zahlenreihe)
print(zahlenreihe)
[8, 2, 4, 1, 2, 3, 9, 6, 7, 8, 0, 4]
[8, 2, 4, 1, 2, 3, 9, 6, 7, 8, 0]
[8, 2, 4, 999, 2, 3, 9, 6, 7, 8, 0]
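
The following sketch shows the methods for adding elements in action, using a short made-up list: append() adds one element at the end, insert() adds one at a given index, and extend() adds all elements of another list.

```python
numbers = [8, 2, 3]

numbers.append(0)        # [8, 2, 3, 0]
numbers.insert(1, 99)    # [8, 99, 2, 3, 0]
numbers.extend([5, 6])   # [8, 99, 2, 3, 0, 5, 6]
removed = numbers.pop()  # without an index, pop() removes the last element

print(numbers)
print(removed)
```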

Task

  1. Create a string variable and assign it the following string:
    "Zehn zahme Ziegen zogen zehn Zentner Zucker zum Zoo"
    
  2. Split the string and put each word as an element into a list.
  3. Count the words.
  4. Sort the list and save it in a new variable called list_sorted.
  5. Convert all letters in the 4th list element to uppercase.
  6. Replace the 4th list element with the number 99.

Solution

  1. s = "Zehn zahme Ziegen zogen zehn Zentner Zucker zum Zoo"
    
  2. wortliste = s.split(" ")
    
  3. len(wortliste)
    
  4. list_sorted = wortliste[:]
    list_sorted.sort()
    
  5. list_sorted[3] = list_sorted[3].upper()  # upper() returns a new string, so store it back in the list
    
  6. list_sorted[3] = 99
    

Dictionaries

Another useful datatype is the dictionary. In contrast to lists, where you have to keep track of an index to access a certain value, dictionaries use keys. A dictionary is created with curly braces { }, and each element in it is a key : value pair.

kitchen_supplies = {"pans": 3, 
                    "pots": 4, 
                    "knives": 10, 
                    "spoons": 12, 
                    "forks": 6}
print(kitchen_supplies)
{'pans': 3, 'pots': 4, 'knives': 10, 'spoons': 12, 'forks': 6}

Accessing values is done by specifying the name of the key:

kitchen_supplies["knives"]
10

Similarly, there is a get() method for accessing values. Its advantage over square brackets is that get() returns None when the key does not exist, whereas square brackets raise a KeyError. This is useful if you don’t know in advance whether a key exists.

print(kitchen_supplies["mixers"])

    ---------------------------------------------------------------------------

    KeyError                                  Traceback (most recent call last)

    <ipython-input-5-e817ff785c76> in <module>()
    ----> 1 print(kitchen_supplies["mixers"])
    

    KeyError: 'mixers'


print(kitchen_supplies.get("mixers"))
None
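
get() also accepts an optional second argument that is returned instead of None when the key is missing. A minimal sketch with a small made-up dictionary:

```python
supplies = {"pans": 3, "pots": 4}

num_pots = supplies.get("pots", 0)      # key exists: its value is returned
num_mixers = supplies.get("mixers", 0)  # key is missing: the default 0 is returned

print(num_pots)
print(num_mixers)
print(supplies)  # get() never adds the missing key
```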

While this example uses strings as keys and integers as values, this is not a necessity. Any datatype can be used for values, but keys have to be unique and hashable, which in practice means immutable types (types that cannot be changed afterwards, such as strings, numbers and tuples). So a list cannot be a key but can be a value.
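
The rule for keys can be tried out with a small sketch. The dictionary and its contents are made up for illustration (the coordinates are only approximate): a tuple works as a key, a list as a key raises a TypeError, and a list as a value is fine.

```python
# A tuple is immutable and can therefore be used as a key
city_by_coordinates = {(50.81, 8.77): "Marburg"}

# A list is mutable; using it as a key raises a TypeError
try:
    city_by_coordinates[[50.81, 8.77]] = "Marburg"
    error_message = None
except TypeError as error:
    error_message = str(error)

# A list as a value is no problem
coordinates_by_city = {"Marburg": [50.81, 8.77]}

print(error_message)
print(coordinates_by_city["Marburg"])
```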

Like lists, dictionaries have some useful built-in methods and operators, for example:

Checking whether a key exists in the dictionary (the in operator tests the keys, not the values, so the value 10 is not found):

10 in kitchen_supplies
False
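
To make the difference between keys and values explicit, here is a short sketch with a made-up dictionary: in applied to the dictionary looks at the keys, while in applied to values() looks at the values.

```python
supplies = {"knives": 10, "forks": 6}

has_knives = "knives" in supplies            # True: "knives" is a key
ten_is_key = 10 in supplies                  # False: 10 is not a key
ten_is_value = 10 in supplies.values()       # True: 10 is one of the values

print(has_knives)
print(ten_is_key)
print(ten_is_value)
```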

Removing an item and returning its value at the same time:

num_pans = kitchen_supplies.pop("pans")
print(kitchen_supplies)
print(num_pans)
{'pots': 4, 'knives': 10, 'spoons': 12, 'forks': 6}
3

If you try to pop an element which does not exist, a KeyError is raised.

kitchen_supplies.pop("mixers")

    ---------------------------------------------------------------------------

    KeyError                                  Traceback (most recent call last)

    <ipython-input-9-d2407b65b724> in <module>()
    ----> 1 kitchen_supplies.pop("mixers")
    

    KeyError: 'mixers'


Fortunately, the pop() method accepts a second argument that specifies a default return value:

kitchen_supplies.pop("mixers", 0)
0

Of course, you can also add key/value pairs to a dictionary after it has been created:

kitchen_supplies["mixers"] = 3
kitchen_supplies
{'forks': 6, 'knives': 10, 'mixers': 3, 'pots': 4, 'spoons': 12}

You can get all keys, values or items (key/value pairs) of a dictionary via the respective dict methods. They return view objects rather than plain lists:

print(kitchen_supplies.keys())
print(kitchen_supplies.values())
print(kitchen_supplies.items())
dict_keys(['pots', 'knives', 'spoons', 'forks', 'mixers'])
dict_values([4, 10, 12, 6, 3])
dict_items([('pots', 4), ('knives', 10), ('spoons', 12), ('forks', 6), ('mixers', 3)])
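
The objects returned by keys(), values() and items() are views on the dictionary. The following sketch (with a made-up dictionary) shows that a view reflects later changes, whereas a list created from it with list() is a snapshot:

```python
supplies = {"pots": 4, "knives": 10}

keys_view = supplies.keys()     # view object, stays connected to the dictionary
keys_snapshot = list(keys_view) # plain list, a copy of the current keys

supplies["mixers"] = 3          # add a new key afterwards

print(list(keys_view))  # the view now includes "mixers"
print(keys_snapshot)    # the snapshot does not
```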

Loops

The for-loop

So far, we have only told the computer to work on single variables. Often, however, you want a task to be executed multiple times on many elements (e.g. do the same calculation for a list of satellite images).

Let’s say you want to print the numbers 1 to 9 below each other. With our existing tools we would probably do this:

print(1)
print(2)
print(3)
print(4)
print(5)
print(6)
print(7)
print(8)
print(9)
1
2
3
4
5
6
7
8
9

This, however, results in a lot of code duplication (programmers hate duplication). And what would you do if you wanted to print the numbers 1 to 999999 below each other?

Thankfully, Python provides us with the for-loop functionality. With a loop we can execute a set of statements, once for each item in a list or any other sequential datatype.

The above code can thus be shortened to:

zahlenliste = [1,2,3,4,5,6,7,8,9]

for zahl in zahlenliste:
    print(zahl)
1
2
3
4
5
6
7
8
9

range() is another built-in Python function. It produces a sequence of integers from a given start value up to, but not including, a given stop value, using a given increment. The code above can thus be shortened further to:

for i in range(1,10,1):
    print(i)
1
2
3
4
5
6
7
8
9
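
The arguments of range() can be explored by converting the result to a list. A minimal sketch (variable names are only illustrative): with one argument the sequence starts at 0, and the increment defaults to 1 and may also be negative.

```python
first_five = list(range(5))         # only stop given: 0, 1, 2, 3, 4
evens = list(range(0, 10, 2))       # start, stop, increment: 0, 2, 4, 6, 8
countdown = list(range(5, 0, -1))   # negative increment: 5, 4, 3, 2, 1

print(first_five)
print(evens)
print(countdown)
```

In all three cases the stop value itself is not part of the sequence.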

Note that the line with print is indented. This is because Python blocks (a block is a group of statements) are delimited by indentation (rather than by braces as in many other languages). In Python it is customary to use an indentation of 4 spaces.

The following code snippet thus gives an error because after a for-loop-statement Python expects a new block (the content of the loop):

for i in range(1,10,1):
print(i)

      File "<ipython-input-55-5940707a050c>", line 2
        print(i)
            ^
    IndentationError: expected an indented block



For-loops can also be applied to lists of strings:

my_color_list = ["red","green","blue","yellow","black","white"]
for Farbe in my_color_list:
    print(Farbe)
red
green
blue
yellow
black
white

In fact, for-loops can iterate over any iterable object, not just lists. That means they can also be used to iterate over strings and dictionaries:

uni = "Universität Marburg"
for Buchstabe in uni:
    print(Buchstabe)
U
n
i
v
e
r
s
i
t
ä
t
 
M
a
r
b
u
r
g

for key, val in kitchen_supplies.items():
    print("There are {} {} in the kitchen.".format(val, key))
There are 4 pots in the kitchen.
There are 10 knives in the kitchen.
There are 12 spoons in the kitchen.
There are 6 forks in the kitchen.
There are 3 mixers in the kitchen.
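
Looping over a dictionary directly, without items(), yields its keys. The sketch below uses a small made-up dictionary and also sums up all values with values():

```python
supplies = {"pots": 4, "knives": 10}

# Iterating over the dictionary itself yields the keys
for key in supplies:
    print(key, supplies[key])

# Iterating over values() yields only the values
total = 0
for count in supplies.values():
    total += count

print(total)
```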

Loops can also be nested:

for i in [2010,2011]:
    print()
    for j in ["Jan","Feb","Mar"]:
        print(i, j)

2010 Jan
2010 Feb
2010 Mar

2011 Jan
2011 Feb
2011 Mar

Task

  1. Print the following sequence of numbers below each other: 1 3 5 7 9 11 13 15 17 19

Solution

  1. for i in range(1,20,2):
        print(i)
    

Conditional cases

if/else statements

In many cases, you want to execute code only if a specific condition is met:

a = 10
b = 5

if a>b:
    print("a is greater than b.")
a is greater than b.

Sometimes it’s also important to deal with those cases where the condition is not met:

a = 10
b = 500

if a>b:
    print("a is greater than b.")
else:
    print("a is less than or equal to b.")
a is less than or equal to b.

You can also check for multiple cases using the elif statement:

a = 10
b = 10


if a>b:
    print("a is greater than b.")
elif a<b:
    print("a is less than b.")
elif a==b:
    print("a is equal to b.")
a is equal to b.

Of course, in these examples you know the result before you execute the code. In the snippet above, only the condition of the last elif branch evaluates to True, so the code will always print

"a is equal to b." 

In the real world, however, you are often confronted with situations where you do not know the value of a variable while you are writing your code. For example, when you are dealing with input values from a person using your program:

answer = input("Are you a student? ")

if answer == "yes":
    print("Welcome!")
elif answer == "no":
    print("You are welcome anyway!")
else:
    print("I did not understand your answer!")
Are you a student? Hallo
I did not understand your answer!

Loops and conditional cases can also be used in combination:

my_number_list = [1,5,9,6,43,5,8,9,23,12,5,8,598,48,26,12,15,26,7,59,659]

for Zahl in my_number_list:
    if Zahl > 500:
        print(Zahl)
598
659
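
Instead of printing the matching elements, you will often want to collect them for later use. A minimal sketch, assuming a short made-up list and an initially empty result list:

```python
numbers = [1, 598, 43, 659, 12]
large_numbers = []  # starts empty and is filled inside the loop

for number in numbers:
    if number > 500:
        large_numbers.append(number)  # keep only the values above 500

print(large_numbers)
```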

Control flow statements

break & continue

You can stop the execution of a loop with the break statement:

print("You have 10 attempts!")
for x in range(10):
    y = input("What is the name of our Federal Chancellor? ")
    
    if y == "Angela Merkel":
        print("Correct!")
        break
        
    print("That was wrong!")
You have 10 attempts!
What is the name of our Federal Chancellor? Gerhard Schröder
That was wrong!
What is the name of our Federal Chancellor? Angela Merkel
Correct!

There is also the continue statement. It skips the rest of the loop body for the current iteration and continues with the next iteration of the loop.
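
A minimal sketch of continue, using a made-up list of numbers: negative values are skipped, all other values are printed and collected.

```python
numbers = [3, -1, 7, -5, 2]
positives = []

for number in numbers:
    if number < 0:
        continue              # skip the rest of the body for negative numbers
    positives.append(number)  # only reached for non-negative numbers
    print(number)

print(positives)
```

Unlike break, continue does not end the loop; it only moves on to the next element.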

Task

  1. Print each element of the following list that is divisible by 17:
    zahlenliste = [15,34,67,17,436,234,204,568,6787,1054]
    

Solution

  1. zahlenliste = [15,34,67,17,436,234,204,568,6787,1054]
    for zahl in zahlenliste:
        if zahl%17 == 0:
            print(zahl)
    

Exercise 2