Data analysis of the Netflix shows - part 3
Date: 11/12/2020
Time: 12:30-14:30
Data visualization
- What is the nature of our data?
- What aspects do we want to analyse?
- What are the most suitable graphical elements we can use to present our analysis?

The matplotlib library
matplotlib was one of the first Python data visualization libraries and is still widely used for plotting in the Python community. It was designed to closely resemble the plotting interface of MATLAB, a popular proprietary programming language.
We are especially interested in matplotlib.pyplot, a collection of plotting functions. Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, decorates the plot with labels, etc.
Note: matplotlib is not part of Python's standard library, so we need to install it first using pip install matplotlib
import matplotlib.pyplot as plt

data = [23, 85, 72, 43, 52]
labels = ['A', 'B', 'C', 'D', 'E']
# x-axis ticks and label: one tick per bar, named after the labels
plt.xticks(range(len(data)), labels)
plt.xlabel('Class')
# y-axis label
plt.ylabel('Amounts')
# chart title
plt.title('I am title')
# plot the bars, one per value in data
plt.bar(range(len(data)), data)
# display the figure
plt.show()
After running the above script this chart should appear:
Data visualizations on the Netflix shows dataset (see also the GitHub repository)
To solve the following exercises, we need to use some of the functions we defined in the previous lessons (part-1 and part-2).
a) Draw a chart using matplotlib which plots the total number of shows (all types of shows) that Netflix added in each year.
b) Draw a chart using matplotlib which plots the average time (in years) it takes Netflix to add a show to its list after the show's actual release. Plot this value for each year in which shows were added.
Hint: take a look at matplotlib line charts: https://datatofish.com/line-chart-python-matplotlib/
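As a starting point, here is a minimal sketch of one possible approach to both exercises. It does not use the functions from part-1 and part-2; instead it assumes each show is a dictionary with the hypothetical keys release_year and year_added, and the sample records are made up for illustration. The helper names (shows_added_per_year, average_delay_per_year, plot_line) are also assumptions, not part of the original lessons.

```python
from collections import Counter, defaultdict

# Hypothetical sample records; the real dataset and the helper functions
# from part-1 and part-2 are not reproduced here.
shows = [
    {"title": "Show A", "release_year": 2015, "year_added": 2017},
    {"title": "Show B", "release_year": 2016, "year_added": 2017},
    {"title": "Show C", "release_year": 2018, "year_added": 2018},
    {"title": "Show D", "release_year": 2010, "year_added": 2019},
    {"title": "Show E", "release_year": 2018, "year_added": 2019},
]


def shows_added_per_year(records):
    """Return {year_added: number of shows added that year}, sorted by year."""
    counts = Counter(r["year_added"] for r in records)
    return dict(sorted(counts.items()))


def average_delay_per_year(records):
    """Return {year_added: mean of (year_added - release_year)}, sorted by year."""
    delays = defaultdict(list)
    for r in records:
        delays[r["year_added"]].append(r["year_added"] - r["release_year"])
    return {year: sum(d) / len(d) for year, d in sorted(delays.items())}


def plot_line(values_by_year, ylabel, title):
    """Draw a line chart with one point per year."""
    # Imported here so the two helpers above also work without matplotlib.
    import matplotlib.pyplot as plt

    years = list(values_by_year)
    plt.plot(years, list(values_by_year.values()), marker="o")
    plt.xticks(years)
    plt.xlabel("Year added")
    plt.ylabel(ylabel)
    plt.title(title)
    plt.show()
```

For exercise a), one could call plot_line(shows_added_per_year(shows), "Number of shows", "Shows added per year"); for exercise b), plot_line(average_delay_per_year(shows), "Average delay (years)", "Average delay between release and addition"). Keeping the counting and averaging separate from the plotting makes it easy to check the numbers before drawing anything.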