Containers#
Learning goals
After finishing this chapter, you are expected to
be able to work with different kinds of containers in Python
understand differences between tuple, list, set, and dict containers
index and slice tuples and lists
know how to format strings
Tuples, lists, sets, and dictionaries#
In the previous chapter, we introduced four basic Python data types: integers, floating point numbers, strings, and booleans. A variable of one of these types stores a single value, e.g.,
course_name = 'Introduction to programming'
year = 2024
In many practical scenarios, you might want to store multiple values in one variable, for example when creating an overview of all students or when working with matrices in mathematics. For this, Python has four very useful data types: tuples, lists, sets, and dictionaries. These are called containers, because they can store multiple values or variables. There are similarities, but also important differences between these four data types. In this chapter we’ll discuss these.
Tuple#
A tuple is a container for multiple items that is immutable. Immutability means that once a tuple has been created, we cannot change its contents. You recognize a tuple in Python by the use of parentheses (). For example, we can define a tuple with three numbers as follows.
digits = (3, 5, 4)
The 3, 5, and 4 now together form a tuple, and the variable digits contains this tuple and thus these items. The items in a tuple can have any type. For example, we can also create a tuple of strings
student_names = ('Alice', 'Bob', 'Claire')
or a tuple of booleans
passed_course = (True, False, True)
The last example shows that a tuple can contain the same value multiple times. Its contents are ordered, and they stay in the order in which you wrote them (even if that order is not ascending, descending, or alphabetical).
We can access the value of an individual item in the tuple using []. This is called indexing. For example, to get the value of the first element in the tuple student_names, we use the index of that element in student_names and write
print(student_names[0])
Alice
Warning
Note that to get the first element, we use index 0. In most programming languages, we start counting at 0, and not at 1.
If you have a tuple, you can always use indexing to get the value of an individual item. This way, you can also get the value of the second and third item in student_names
print(student_names[1])
print(student_names[2])
Bob
Claire
Note that getting the value of an item does not change the tuple itself. The tuple is still the same after we read one of its items.
print(student_names)
('Alice', 'Bob', 'Claire')
Because a tuple is immutable, we cannot change the values of individual items, or assign a value to one of the elements. Let’s give that a try below. In this line of code, we try to change the name of a student in the tuple.
student_names[1] = 'Bert'
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/var/folders/07/x0gf6dj176dfjvrn357t5p6h0000gn/T/ipykernel_34271/34016667.py in <module>
----> 1 student_names[1] = 'Bert'
TypeError: 'tuple' object does not support item assignment
Indeed, we get an error. Take a close look at this error: it tells you exactly what the problem is. Tuples do not allow item assignment.
Instead of indexing from the start of a tuple, you can also index by counting back from the last element by using negative integer numbers, where index -1 refers to the last item, -2 to the second-to-last item, etc.
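As a small illustration of negative indexing, the sketch below uses a hypothetical tuple called weekdays, which is not part of the examples above and is introduced only for this demonstration.

```python
# 'weekdays' is a made-up tuple used only for this illustration.
weekdays = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri')

print(weekdays[-1])   # last item: Fri
print(weekdays[-2])   # second-to-last item: Thu
```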
Exercise 2.1
Why do student_names[0] and student_names[-3] refer to the same item?
In addition to selecting a single item using indexing, we can also select multiple items at once. This is called slicing. Slicing works with start, end, and step indices. These are used in the following syntax
sub_items = some_tuple[start:end:step]
In the example below, we select all items starting at index 3 and until (but not including) index 5. Note that in this example, the step is not provided. In this case, Python assumes that you want to use a step size of 1.
food = ('spam', 'egg', 'bacon', 'tomato', 'ham', 'lobster')
print(food[3:5])
('tomato', 'ham')
If you only provide an end index, Python assumes that you mean to slice up to that index. Conversely, if you only provide a start index, Python assumes that you intend to slice the full tuple from that index on. This is shown in the following two examples.
print(food[:4])
print(food[2:])
('spam', 'egg', 'bacon', 'tomato')
('bacon', 'tomato', 'ham', 'lobster')
If the start and end index are both omitted, we simply select all items in the tuple
print(food[:])
('spam', 'egg', 'bacon', 'tomato', 'ham', 'lobster')
The step lets us select only a subset of items. For example, to select every second item in the tuple between indices 2 and 6, the following code can be used, in which the step is 2.
print(food[2:6:2])
('bacon', 'ham')
Steps can also be negative, in which case the tuple is traversed in the reverse direction. Here, it is important that you pick your start and end indices correctly. For example, the following code returns an empty tuple.
print(food[2:6:-1])
()
Exercise 2.2
Why does this code print an empty tuple?
Instead, to traverse the tuple in reverse, the start index should be higher than the end index:
print(food[6:2:-1])
('lobster', 'ham', 'tomato')
This also provides an easy way to flip the items in a tuple. Remember that by not providing a start and end, we select the whole tuple. By then providing a negative step size, we ask Python to provide the full tuple in reverse.
print(food[::-1])
('lobster', 'ham', 'tomato', 'bacon', 'egg', 'spam')
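Note that the slice food[6:2:-1] starts at index 6, although the valid indices of food run from 0 to 5. The sketch below, which reuses the food tuple from above, illustrates that slicing quietly clips out-of-range bounds, whereas indexing with food[6] would raise an IndexError.

```python
food = ('spam', 'egg', 'bacon', 'tomato', 'ham', 'lobster')

# Out-of-range slice bounds are clipped to the ends of the tuple.
print(food[6:2:-1])   # ('lobster', 'ham', 'tomato')
print(food[4:100])    # ('ham', 'lobster')

# Leaving out the start gives the same reversed slice without relying on clipping.
print(food[:2:-1])    # ('lobster', 'ham', 'tomato')
```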
It’s often useful to know the length of a tuple, or the number of items in the tuple. For this, we can use the len function.
print(len(food))
6
Items in a tuple can be variables, and tuples can contain other tuples. By nesting your indices, you can access the individual items in the nested tuple. Similarly, you can slice items in your tuple.
vegetables = ('tomato', 'eggplant')
groceries = ('cheese', 'bread', vegetables, 'milk')
print(groceries)
print(groceries[2][1])
('cheese', 'bread', ('tomato', 'eggplant'), 'milk')
eggplant
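Slicing can be combined with nesting in the same way. The following sketch reuses the vegetables and groceries tuples from above. Note that a tuple with a single item is printed with a trailing comma.

```python
vegetables = ('tomato', 'eggplant')
groceries = ('cheese', 'bread', vegetables, 'milk')

# Slice the outer tuple: the nested tuple is kept as a single item.
print(groceries[1:3])     # ('bread', ('tomato', 'eggplant'))

# Index the nested tuple first, then slice it.
print(groceries[2][:1])   # ('tomato',)

# len counts the nested tuple as one item.
print(len(groceries))     # 4
```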
List#
Lists are the most commonly used container objects in Python. Like a tuple, a list can contain multiple items, and these items are ordered. Unlike a tuple, a list is mutable: we can add, change, and remove items. A list is defined using square brackets [].
digits = [3, 5, 4]
print(digits)
[3, 5, 4]
To illustrate what we mean by mutable, let’s use indexing to assign a new value to the second element
digits[1] = 6
print(digits)
[3, 6, 4]
Note how - in contrast to the tuple - we don’t get an error in this case!
We can not only change the values of items in the list, but also add or remove items. To add a single item to a list, you can use the append method, as in the following example
digits.append(6)
print(digits)
[3, 6, 4, 6]
If you want to add multiple items, you can use the extend method as follows
digits.extend([7, 8])
print(digits)
[3, 6, 4, 6, 7, 8]
Equivalently, using the assignment operator += that you learned last week, you can achieve the same result
digits += [9, 10]
print(digits)
[3, 6, 4, 6, 7, 8, 9, 10]
Note that addition for two lists is the same as concatenation.
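The difference between these ways of adding items is easy to overlook. The sketch below uses small lists with made-up names (first, second, nested, flat) to show that + builds a new list, while append and extend treat a list argument differently.

```python
first = [1, 2]
second = [3, 4]

# + builds a new list and leaves both operands unchanged.
combined = first + second
print(combined)   # [1, 2, 3, 4]
print(first)      # [1, 2]

# append adds its argument as ONE item, even if that argument is a list.
nested = [1, 2]
nested.append([3, 4])
print(nested)     # [1, 2, [3, 4]]

# extend adds each item of its argument separately.
flat = [1, 2]
flat.extend([3, 4])
print(flat)       # [1, 2, 3, 4]
```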
You can also make a list ‘from scratch’ by just starting with an empty list and adding items as you go, like this:
student_names = []
student_names.append('Alice')
student_names.append('Bob')
student_names += ['Clark', 'Denise']
print(student_names)
['Alice', 'Bob', 'Clark', 'Denise']
Indexing and slicing
Indexing and slicing work the same way for lists and tuples. The length of a list can also be determined using len. E.g.,
cheeses = ['gouda', 'brie', 'mozzarella']
list_length = len(cheeses)
There are two ways to remove items from a list. First, the pop method removes an item from a list at a given index and returns its value. Second, the remove method removes an item based on its value. The code below demonstrates this.
fruits = ['apple', 'banana', 'cherry']
popped = fruits.pop(1)
print(fruits)
print(popped)
fruits.remove('cherry')
print(fruits)
['apple', 'cherry']
banana
['apple']
Note that lists allow multiple items to have the same value, and remove only removes the first matching occurrence.
motto = ['hodor', 'hodor', 'hodor']
motto.remove('hodor')
print(motto)
['hodor', 'hodor']
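Two related details may be useful here. As a minimal sketch: when pop is called without an index, it removes and returns the last item, and the del statement removes the item at an index without returning it.

```python
digits = [3, 6, 4, 6]

# Without an index, pop removes and returns the last item.
last = digits.pop()
print(last)     # 6
print(digits)   # [3, 6, 4]

# del removes the item at an index without returning it.
del digits[0]
print(digits)   # [6, 4]
```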
Exercise 2.3
Write scripts for the following tasks:
Use indexing and assignment to sort the items in the list scrambled_numbers = [7, 2, 5, 1] in ascending order.
Split the list numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] into a list of even numbers and a list of odd numbers using slicing.
Given list l = [10, 20, 30, 40, 'a', 'dog', 3.4, True], make two new lists: one containing every second item counting from the back, and one containing every third item starting from the front.
Given tuple t = ("Orange", [10, 20, 30], (5, 15, 25)), write code to extract the value 15 from t.
Set#
A set is a container that is unordered and does not allow duplicate values. Sets are defined using curly brackets {}. In the example below, you can note two things
Although we assign two 5s to the set, the set only contains one 5. The reason for this is that a set does not allow duplicate values.
Although the 4 comes after the 5 in our assignment, the printed set shows the 4 first. This is the case because a set is not ordered: Python does not remember the order in which the items were added. In this example the items happen to be printed in ascending order, but Python does not guarantee this, so you should not rely on it.
digits = {3, 5, 5, 4}
print(digits)
{3, 4, 5}
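The following sketch checks these two properties directly on the digits set from above. It uses the built-in sorted function, which returns a list in ascending order, for the cases where a predictable order is needed.

```python
digits = {3, 5, 5, 4}

# Duplicates are dropped, so the set has three items.
print(len(digits))           # 3

# Order plays no role: sets with the same items are equal.
print(digits == {5, 4, 3})   # True

# To show the items in a guaranteed order, sort them into a list.
print(sorted(digits))        # [3, 4, 5]
```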
You will find that sets can be very useful when you want to perform common mathematical operations like unions and intersections. Sets have some built-in methods that allow you to do this. For example, if we have two groups of people, we can use sets to find out who is in both groups, who is in at least one group, and who is in only the first group.
team_a = {'Adam', 'Bob', 'Claire', 'Dean'}
team_b = {'Bob', 'Dean', 'Edward', 'Fran'}
print(team_a.intersection(team_b)) # Who is in both groups (intersection)
print(team_a.union(team_b)) # All people (union)
print(team_a.difference(team_b)) # Everyone who is uniquely in group a
{'Dean', 'Bob'}
{'Bob', 'Adam', 'Claire', 'Dean', 'Edward', 'Fran'}
{'Adam', 'Claire'}
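Python also offers operators for the same three operations. The sketch below repeats the example with &, |, and -. Because the print order of a set of strings can differ between runs, the results are passed through sorted to obtain lists with a predictable order.

```python
team_a = {'Adam', 'Bob', 'Claire', 'Dean'}
team_b = {'Bob', 'Dean', 'Edward', 'Fran'}

print(sorted(team_a & team_b))   # intersection: ['Bob', 'Dean']
print(sorted(team_a | team_b))   # union: all six names
print(sorted(team_a - team_b))   # difference: ['Adam', 'Claire']
```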
Sets do not support indexing, so just like in tuples, you cannot assign values to indices. However, you can add and remove elements using the add and remove methods, as follows.
team_a.add('Edward')
print(team_a)
{'Bob', 'Adam', 'Claire', 'Dean', 'Edward'}
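Removing an element works along the same lines. The sketch below starts from a fresh copy of team_a and also shows the discard method, which behaves like remove but does not complain when the element is missing. The name 'Zoe' is made up for this illustration.

```python
team_a = {'Adam', 'Bob', 'Claire', 'Dean'}

team_a.remove('Bob')      # remove raises a KeyError if the item is absent
print(sorted(team_a))     # ['Adam', 'Claire', 'Dean']

team_a.discard('Zoe')     # discard does nothing if the item is absent
print(sorted(team_a))     # ['Adam', 'Claire', 'Dean']
```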
Dictionary#
A dictionary is a collection that contains items which are pairs of keys and values. As with lists, tuples and sets, a dictionary can contain items with different data types. Keys do not all have to be of the same type, and neither do values. Dictionaries are defined using curly brackets {}, and keys and values are separated using a colon :. Each item in a dictionary should have a key and a value.
car = {"brand": "Ford", "model": "Mustang", "year": 1964}
print(car)
{'brand': 'Ford', 'model': 'Mustang', 'year': 1964}
Single or double quotes
Some programming languages distinguish between ‘single quotes’ and “double quotes”, but Python does not. You’re free to choose what you prefer. As you can see above, even if you use double quotes, Python shows the strings with single quotes when it prints the dictionary.
Dictionaries are very useful in many applications that you will see during this course. Note that we cannot use positional indexing and slicing in dictionaries, but we can get values in dictionaries by using keys. For example, if the dictionary above defines a car, we can use car['brand'] to get the brand of this car.
print(car['brand'])
Ford
If you want to add something to an existing dictionary, you can simply assign a new key and value pair as follows
car['color'] = 'Blue'
print(car)
{'brand': 'Ford', 'model': 'Mustang', 'year': 1964, 'color': 'Blue'}
To remove a key-value pair from a dictionary, use del as follows.
del car['model']
print(car)
{'brand': 'Ford', 'year': 1964, 'color': 'Blue'}
Note that dictionary values can have any type, and keys can have many different types as well. They don’t have to be strings or integers. This makes dictionaries very flexible and useful in practice. However, there are some restrictions that you should keep in mind:
A key must be a value that cannot change, such as a string, a number, or a tuple of these. A list cannot be used as a key.
A key can only appear in a dictionary once. Duplicate keys are not allowed.
If you assign a new value to a key, it overwrites the old value.
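These restrictions can be seen in a short sketch. The dictionaries colors and positions are made up for this illustration; car is the dictionary used above.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

# Assigning to an existing key replaces its value; no second 'year' key is added.
car['year'] = 1965
print(car)        # {'brand': 'Ford', 'model': 'Mustang', 'year': 1965}
print(len(car))   # 3

# If a key is repeated when the dictionary is written out, the last value wins.
colors = {'door': 'red', 'door': 'blue'}
print(colors)     # {'door': 'blue'}

# A tuple can serve as a key; a list cannot, because it is mutable.
positions = {(0, 0): 'start', (2, 3): 'goal'}
print(positions[(2, 3)])   # goal
```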
Converting between collections#
If your data is organized in a set, you can turn it into a list or vice versa. Remember how in the previous chapter we could cast an integer as a float and vice versa? We can do the same with containers by changing their type. However, keep in mind the characteristics of these data types. For example, if you have a list with duplicate values and you turn that into a set, you’ll lose the duplicates. This can be convenient, for example if you want to automatically select all unique values in a list. In the code below, notice how the set contains each unique element of the list only once, while the tuple maintains the original order and preserves duplicates. The set happens to print in ascending order here, but sets are unordered and Python does not guarantee this.
digits_list = [1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 5]
print(set(digits_list)) # Cast digits_list to a set
print(tuple(digits_list)) # Cast digits_list to a tuple
{1, 2, 3, 4, 5}
(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 5)
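If you need the unique values in a predictable order, one option is to combine set with the built-in sorted function, which always returns a list in ascending order. The names unique_sorted and unique_list below are introduced only for this sketch.

```python
digits_list = [1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 5]

# set() drops duplicates; sorted() returns a list in ascending order.
unique_sorted = sorted(set(digits_list))
print(unique_sorted)      # [1, 2, 3, 4, 5]

# Casting back with list() also works, but without any promise about order.
unique_list = list(set(digits_list))
print(len(unique_list))   # 5
```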
If you convert a dictionary directly into a list, set, or tuple, only the keys are preserved. However, you can get a list of either the keys or the values in a dictionary using the keys() and values() methods
car = {"brand": "Ford", "model": "Mustang", "year": 1964}
print(list(car.keys()))
print(list(car.values()))
['brand', 'model', 'year']
['Ford', 'Mustang', 1964]
To create a new dictionary from two lists (one containing keys, one values), use the zip function. This function takes two lists and, like a zipper, iterates over pairs of items in both lists. Each pair is combined in a tuple, and the list function collects all these tuples in a new list. Using the dict function, this list can be turned into a dictionary. The following code illustrates this.
keys = ['brand', 'model', 'year']
values = ['Ford', 'Mustang', 1964]
dict(list(zip(keys, values)))
{'brand': 'Ford', 'model': 'Mustang', 'year': 1964}
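The intermediate list is not strictly needed, because dict also accepts the pairs produced by zip directly. The sketch below shows this, together with two details worth knowing: zip stops at the end of the shorter input, and the items() method gives the pairs back as tuples. The name partial is made up for this illustration.

```python
keys = ['brand', 'model', 'year']
values = ['Ford', 'Mustang', 1964]

# dict accepts the pairs produced by zip directly.
car = dict(zip(keys, values))
print(car)       # {'brand': 'Ford', 'model': 'Mustang', 'year': 1964}

# zip stops at the end of the shorter input, so 'year' gets no entry here.
partial = dict(zip(keys, ['Ford', 'Mustang']))
print(partial)   # {'brand': 'Ford', 'model': 'Mustang'}

# items() gives the pairs back as tuples.
print(list(car.items()))   # [('brand', 'Ford'), ('model', 'Mustang'), ('year', 1964)]
```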
Exercise 2.4
Write scripts for the following tasks:
Given the below dictionary, write a line of code that outputs Mike’s grade for history.
sample_dict = {
    "class": {
        "student": {
            "name": "Mike",
            "grades": {
                "physics": 70,
                "history": 80
            }
        }
    }
}
Given the below dictionary, raise the salary of Brad to 6800.
sample_dict = {
    'emp1': {'name': 'Jhon', 'salary': 7500},
    'emp2': {'name': 'Emma', 'salary': 8000},
    'emp3': {'name': 'Brad', 'salary': 500}
}
Define a dictionary called student that contains your own name, hair color, and age.
Create a list containing yourself and two additional students as dictionaries.
The in operator#
For tuples, lists, sets, and dictionaries, Python provides the in operator, which checks whether a value is present in the collection. For a dictionary, it checks the keys, not the values. If the value is present, it returns True; if not, it returns False.
digits = [1, 2, 3, 4]
print(1 in digits)
print(1 in set(digits))
print(1 in tuple(digits))
digits_dict = {1: 4, 2: 5}
print(1 in digits_dict)
True
True
True
True
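For dictionaries, it is worth checking what in actually looks at. The sketch below reuses digits_dict and suggests that in searches the keys only. To search the values, you can apply in to the result of the values() method.

```python
digits_dict = {1: 4, 2: 5}

print(1 in digits_dict)            # True: 1 is a key
print(4 in digits_dict)            # False: 4 is a value, not a key
print(4 in digits_dict.values())   # True: search the values explicitly
```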
This operator can also be easily combined with not to ask for the opposite response. Consider the following example.
car = {"brand": "Ford", "model": "Mustang", "year": 1964}
'brand' not in car
False
In addition to tuples, lists, sets and dictionaries, the in operator also works for strings. For a string, the in operator checks whether a character or substring is part of that string. For example:
my_string = 'supercalifragilisticexpialidocious'
print('i' in my_string)
print('califragi' in my_string)
print('z' in my_string)
True
True
False
Containers: summary#
There are some important differences between tuples, lists, sets and dictionaries. The following table summarizes them.
Tuple | List | Set | Dictionary |
---|---|---|---|
A tuple can store elements of different types in a fixed sequence. | A list can store elements of different types in a sequence. | A set can also store elements of different types, but without a sequence. | A dictionary stores key-value pairs, which can also have different types. |
A tuple is written with ( ) | A list is written with [ ] | A set is written with { } | A dictionary is written with { } |
A tuple allows duplicate elements | A list allows duplicate elements | A set does not allow duplicate elements | A dictionary does not allow duplicate keys |
A tuple can contain other containers | A list can contain other containers | A set can contain tuples, but not lists, sets, or dictionaries | A dictionary can contain other containers as values |
A tuple can be created using the tuple() function | A list can be created using the list() function | A set can be created using the set() function | A dictionary can be created using the dict() function |
A tuple is immutable, i.e., we cannot make any changes to the tuple. | A list is mutable, i.e., we can make changes to the list. | A set is mutable, i.e., we can make changes to the set, but elements are never duplicated. | A dictionary is mutable, but keys are never duplicated. |
A tuple is ordered | A list is ordered | A set is unordered | A dictionary is ordered (Python 3.7 and above) |
An empty tuple is created with () | An empty list is created with [] | An empty set is created with set(), because {} creates a dictionary | An empty dictionary is created with {} |
String formatting#
Often, when we want to print something in a Python script, we want to include variable values in the output. For this, Python strings have a format method, which is a very powerful way to make your print statements more dynamic. Consider the following example.
a = 45
b = 76
print('The sum of {} and {} is {}'.format(a, b, a+b))
The sum of 45 and 76 is 121
Changing the values of the variables a and b also changes the output of the print statement. To make sure that the right values end up in the right place, you can also name the formatted parts of your string, as follows
a = 45
b = 76
print('The sum of {number_a} and {number_b} is {sum_numbers}'.format(number_a=a, number_b=b, sum_numbers=a+b))
The sum of 45 and 76 is 121
If you have a floating point number with many decimals, you can also control the number of decimals that actually get printed, as follows.
a = 1/9
print(a) # Prints the 'raw' value of a with many decimals
print('{:.2f}'.format(a)) # Only prints two decimals
print('{number:.2f}'.format(number=a)) # Same as in previous line, but now with named argument
0.1111111111111111
0.11
0.11
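Two further options of format can be handy, shown here as a minimal sketch with made-up variables name and price. Numbers inside the braces refer to the position of an argument, so one argument can be used several times, and a number in front of the dot sets a minimum width for the printed value.

```python
name = 'pie'
price = 3.14159

# Positional indices let one argument be used more than once.
print('{0} costs {1:.2f}; {0} is tasty'.format(name, price))
# pie costs 3.14; pie is tasty

# A number before the dot sets the minimum width, padded with spaces on the left.
print('[{:8.2f}]'.format(price))   # [    3.14]
```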
Exercise 2.5
The table below shows the average temperature in Enschede per month.
Average | January | February | March | April | May | June | July | August | September | October | November | December |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Temp. | 2°C | 3°C | 6°C | 9°C | 13°C | 16°C | 18°C | 17°C | 14°C | 10°C | 6°C | 3°C |
Use this information for the following:
Put this information in a dictionary, using strings as keys and integers as values.
Then, use the dictionary to print the following statements. Use format for the parts that vary (the month names and temperatures) in
‘The average temperature in March is 6°C’
‘The average temperature in August is 17°C’
‘The average temperature in July is 15°C higher than in February’