Working with unordered structures
Date: 16/11/2020
Time: 09:30-11:30
Sets
Why do we need it?
What else should I know?
a_set = {"Roma", "Torino", "Bologna"}
# create a set from a list
a_set = set(["Roma", "Torino", "Bologna","Roma"])
# add an item to the set
a_set.add("Palermo")
# remove an item from the set
a_set.remove("Palermo") # raises a KeyError if the item isn't found
a_set.discard("Palermo") # doesn't raise an error if the item isn't found
# ERROR: we can't add a mutable (unhashable) item such as a list to a set; this raises a TypeError
a_set.add(["Rimini","Firenze"])
# some operations: union, intersection, difference, etc.
b_set = {"Napoli", "Bari", "Lecce", "Roma"}
new_set = a_set.union(b_set)
#OUTPUT: {'Bari', 'Bologna', 'Lecce', 'Napoli', 'Roma', 'Torino'}
new_set = a_set.intersection(b_set)
#OUTPUT: {'Roma'}
new_set = a_set.difference(b_set)
#OUTPUT: {'Bologna', 'Torino'}
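The behaviour described above can be checked with a short sketch. The variable names (cities, unique_cities) are illustrative and not part of the lecture material. The sketch only assumes that a tuple, being immutable, is an acceptable replacement when a group of values has to be stored as a single set item.

```python
# Illustrative names, not taken from the lecture notes.
cities = ["Roma", "Torino", "Bologna", "Roma"]
unique_cities = set(cities)              # the duplicate "Roma" collapses
print(len(cities), len(unique_cities))   # 4 3

# Membership test
print("Roma" in unique_cities)           # True

# remove() raises KeyError for a missing item, discard() does not
try:
    unique_cities.remove("Palermo")
except KeyError:
    print("Palermo is not in the set")
unique_cities.discard("Palermo")

# A list is unhashable, so it cannot be a set item; a tuple can
try:
    unique_cities.add(["Rimini", "Firenze"])
except TypeError:
    print("a list cannot be added to a set")
unique_cities.add(("Rimini", "Firenze"))
print(len(unique_cities))                # 4
```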
Dictionaries
Why do we need it?
What else should I know?
ages_dict = {}
# add a new (key,value) pair
ages_dict["Marco"] = 25
ages_dict["Alessia"] = 22
ages_dict["Giulia"] = 21
#OUTPUT: {'Marco': 25, 'Alessia': 22, 'Giulia': 21}
# accessing an item
print(ages_dict["Marco"])
# remove an item from the dictionary
del ages_dict["Marco"]
# important methods
ages_dict.items() # returns a view of the (key, value) pairs
#OUTPUT: dict_items([('Alessia', 22), ('Giulia', 21)])
a_dict = {"Pippo":34}
ages_dict.update(a_dict) #updates ages_dict with the (key,value) pairs of a_dict
#OUTPUT: {'Alessia': 22, 'Giulia': 21, 'Pippo': 34}
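A few more dictionary operations that are often useful next to the ones listed above are sketched here. The dictionary ages and the variable pairs are illustrative names. get() and the in operator are standard dictionary features, shown as a possible way to avoid a KeyError when a key may be missing.

```python
# Illustrative dictionary, mirroring the state of ages_dict after update()
ages = {"Alessia": 22, "Giulia": 21, "Pippo": 34}

# ages["Marco"] would raise KeyError; get() returns a default instead
print(ages.get("Marco"))       # None
print(ages.get("Marco", 0))    # 0
print("Giulia" in ages)        # True (membership is tested on the keys)

# items() returns a view: loop over it, or convert it to a list
for name, age in ages.items():
    print(name, age)
pairs = list(ages.items())

# update() adds new keys and overwrites the values of existing ones
ages.update({"Pippo": 35, "Marco": 25})
print(ages["Pippo"])           # 35
```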
Exercises
(check the exercises in the GitHub repository)
1st Exercise
We define the variable lyrics containing the lyrics (a string value) of the song "Lonely Boy" by "The Black Keys". The words are all written in lowercase and the lines are separated by ;;
a) We want to print all the unique words in the lyrics of the song "Lonely Boy". We also want to exclude the following words from the final set: ['', 'a', 'i', 'am', 'to', ';;', 'the', 'you', 'don’t', 'and', 'that', 'i’m', 'it’s']. Define a function clean_lyrics() which takes lyrics as a parameter and returns a clean set of words (as just described). Call the defined function and print the returned set.
In Python, {string}.split({separator}) splits a string into a list of words, using {separator} as the delimiter between the words in the string.
Example: calling "Hi my name is James".split(" ") returns the following list: ["Hi", "my", "name", "is", "James"]
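To see why the empty string '' and the token ';;' appear in the list of words to exclude, it can help to try split() on a small string. The string sample below is made up for illustration and is not the actual lyrics.

```python
# Made-up sample: note the double space and the ";;" line separator
sample = "hi my  name is james ;; nice to meet you"

# Splitting on a single space keeps ';;' as a word and turns
# consecutive spaces into an empty string ''
words = sample.split(" ")
print(words)
# ['hi', 'my', '', 'name', 'is', 'james', ';;', 'nice', 'to', 'meet', 'you']

# Splitting on " ;; " gives the lines instead of the words
lines = sample.split(" ;; ")
print(lines)   # ['hi my  name is james', 'nice to meet you']

# With no argument, split() treats any run of whitespace as one separator
no_arg = sample.split()
print(no_arg)
# ['hi', 'my', 'name', 'is', 'james', ';;', 'nice', 'to', 'meet', 'you']
```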
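def clean_lyrics(txt_lyrics):
    # split the lyrics into words and keep only the unique ones
    lyrics_set = set(txt_lyrics.split(" "))
    unwanted_list = ['', 'a', 'i', 'am', 'to', ';;', 'the', 'you', 'don’t', 'and', 'that', 'i’m', 'it’s']
    unwanted_set = set(unwanted_list)
    # remove the unwanted words from the set
    clean_set = lyrics_set.difference(unwanted_set)
    return clean_set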
my_set = clean_lyrics(lyrics)
print(my_set)
b) Define a function common_words() which takes the clean version of lyrics (the result of point (a)) as a parameter. The function should count and return the number of words that are also part of the following list: ["mama","daddy","sister","brother","boy","girl"].
def common_words(clean_set):
    l_words = ["mama","daddy","sister","brother","boy","girl"]
    s_words = set(l_words)
    # the words that appear in both sets
    common_set = clean_set.intersection(s_words)
    return len(common_set)
print(common_words(my_set))
2nd Exercise
We want to further analyse the lyrics of the 1st Exercise, using the same variable lyrics.
a) Define a function count_words() which takes lyrics as a parameter and returns a dictionary that maps each word to the number of times it occurs in the song lyrics. The dictionary should not contain the following words: ['', 'a', 'i', 'am', 'to', ';;', 'the', 'you', 'don’t', 'and', 'that', 'i’m', 'it’s'].
def count_words(txt_lyrics):
    result_dict = {}
    lyrics_l = txt_lyrics.split(" ")
    unwanted_list = ['', 'a', 'i', 'am', 'to', ';;', 'the', 'you', 'don’t', 'and', 'that', 'i’m', 'it’s']
    for w in lyrics_l:
        if w not in unwanted_list:
            # first occurrence: start the count at 0
            if w not in result_dict:
                result_dict[w] = 0
            result_dict[w] += 1
    return result_dict
count_dict = count_words(lyrics)
print(count_dict)
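As a possible alternative to the manual counting loop, the standard library class collections.Counter can do the counting. This is only a sketch: the names UNWANTED, count_words_counter and sample_lyrics are hypothetical, and sample_lyrics is a made-up line, not the real lyrics.

```python
from collections import Counter

# Same words to exclude as in the exercise, stored as a set for fast lookups
UNWANTED = {'', 'a', 'i', 'am', 'to', ';;', 'the', 'you',
            'don’t', 'and', 'that', 'i’m', 'it’s'}

def count_words_counter(txt_lyrics):
    # keep only the wanted words, then let Counter count the occurrences
    words = [w for w in txt_lyrics.split(" ") if w not in UNWANTED]
    return dict(Counter(words))

# Made-up sample line, used only to try the function
sample_lyrics = "la la la i sing a song ;; that you like"
counts = count_words_counter(sample_lyrics)
print(counts)   # {'la': 3, 'sing': 1, 'song': 1, 'like': 1}
```

The result is converted back to a plain dict so that it has the same type as the dictionary returned by count_words().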
b) Andrea wants to build a clever organization for his playlist of "The Black Keys". He writes the name of the album first, followed by the title of the song, separating the two values with "::" (e.g. in el_camino::lonely_boy the album name is "el_camino" while "lonely_boy" is the song title). Here we have Andrea's entire playlist; the songs are separated using ";;":
Define a function build_playlist_dict() which takes playlist_txt as a parameter and creates a dictionary with the album titles as keys; for each key (album), the dictionary stores a list of all its songs.
def build_playlist_dict(a_txt):
    result_dict = {}
    songs = a_txt.split(" ;; ")
    for a_song in songs:
        song_parts = a_song.split("::")
        album = song_parts[0]
        song_name = song_parts[1]
        if album not in result_dict:
            result_dict[album] = []
        result_dict[album].append(song_name)
    return result_dict
print(build_playlist_dict(playlist_txt))
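The "create the list if the album is new, then append" pattern can also be written with dict.setdefault(). The sketch below is one possible variant, not the reference solution: the name build_playlist_dict_setdefault and the string sample_playlist are hypothetical (only el_camino::lonely_boy comes from the exercise text), and it assumes that spaces around ";;" may or may not be present.

```python
def build_playlist_dict_setdefault(a_txt):
    result = {}
    for entry in a_txt.split(";;"):
        entry = entry.strip()          # tolerate spaces around ";;"
        if "::" not in entry:
            continue                   # skip empty or malformed entries
        album, song_name = entry.split("::", 1)
        # setdefault() returns the existing list, or stores and returns a new one
        result.setdefault(album, []).append(song_name)
    return result

# Hypothetical playlist with placeholder names, used only to try the function
sample_playlist = "el_camino::lonely_boy ;; album_b::song_x ;; el_camino::song_y"
playlist_dict = build_playlist_dict_setdefault(sample_playlist)
print(playlist_dict)
# {'el_camino': ['lonely_boy', 'song_y'], 'album_b': ['song_x']}
```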