European Covid data exploration
Exploring which countries have had the highest and lowest covid numbers in Europe
Importing and preparing the data
We will be looking at data from the following countries:
- Italy
- Austria
- Germany
- Belgium
- France
- United Kingdom
- Switzerland
We begin by importing the data and calculating some new features so that we can compare the data across countries. For example, we calculate 'confirmed cases per 100k population', 'deaths per 100k' and 'new cases', since these are not initially in the dataset.
from covid19dh import covid19
import altair as alt
import datetime
countries = ["Italy",
"Austria",
"Germany",
"Belgium",
"France",
"United Kingdom",
"Switzerland"
]
yesterday = datetime.date.today() - datetime.timedelta(days=1)
x, src = covid19(countries, raw=True, verbose=False, end=yesterday, cache=False)
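# Keep only the columns we need; .copy() gives an independent frame so the
# column assignments below do not trigger pandas' SettingWithCopyWarning
x_small = x.loc[:, ['administrative_area_level_1', 'date', 'vaccines', 'confirmed', 'tests', 'recovered', 'deaths', 'population']].copy()
x_small.rename(columns={'administrative_area_level_1': 'id'}, inplace=True)
# Cumulative counts per 100,000 population
x_small['confirmed_per'] = 100000 * x_small['confirmed'] / x_small['population']
x_small['deaths_per'] = 100000 * x_small['deaths'] / x_small['population']
# Deaths as a percentage of confirmed cases
x_small['ratio'] = 100 * x_small['deaths'] / x_small['confirmed']
x_small['tests_per'] = 100000 * x_small['tests'] / x_small['population']
# Vaccines per person (not scaled to 100k like the columns above)
x_small['vaccines_per'] = x_small['vaccines'] / x_small['population']
# Daily new cases: difference of the cumulative count within each country
x_small['new_cases'] = x_small.groupby('id').confirmed.diff().fillna(0)
x_small['new_cases_per'] = x_small.groupby('id').confirmed_per.diff().fillna(0)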
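To make the derived columns above concrete, here is a minimal pure-Python sketch of the same arithmetic on a toy dataset. The records, populations and helper names (`per_100k`, `new_cases_by_country`) are hypothetical and only for illustration; they are not part of the dataset or of covid19dh. It shows why the difference is taken per country: the first day of each country gets 0 rather than a difference against the previous country's last row.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical toy records: (country, day, cumulative confirmed cases). Not real data.
rows = [
    ("A", 1, 100), ("A", 2, 150), ("A", 3, 180),
    ("B", 1, 10), ("B", 2, 40),
]
population = {"A": 1_000_000, "B": 200_000}  # made-up populations


def per_100k(count, pop):
    """Scale a raw count to a rate per 100,000 inhabitants."""
    return 100_000 * count / pop


def new_cases_by_country(records):
    """Day-over-day difference of the cumulative count, restarted for each country.

    Mirrors the idea of groupby('id').confirmed.diff().fillna(0): the first
    row of each country has no predecessor, so its difference is set to 0.
    """
    out = []
    ordered = sorted(records, key=itemgetter(0, 1))  # sort by country, then day
    for country, group in groupby(ordered, key=itemgetter(0)):
        previous = None
        for _, day, confirmed in group:
            out.append((country, day, 0 if previous is None else confirmed - previous))
            previous = confirmed
    return out


daily = new_cases_by_country(rows)
rate_a_day2 = per_100k(150, population["A"])
```

Here country B has far fewer cases than country A in absolute terms, but a higher rate per 100k on day 2, which is why the per-population columns are the ones used for comparison.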
Here are the last 5 rows of the dataset.
x_small.tail()
We will first look at the total numbers of cases and deaths in each country, before moving on to cases and deaths per 100k population.
In each of the charts below, you can click a country in the legend to highlight its line; the other lines are dimmed.
leg_selection = alt.selection_multi(fields=['id'], bind='legend')
alt.Chart(x_small).mark_line().encode(
x=alt.X("yearmonthdate(date):T", axis=alt.Axis(title='Date')),
y=alt.Y("confirmed_per:Q", axis=alt.Axis(title='Confirmed per 100k')),
tooltip=['id', 'confirmed_per'],
color=alt.Color('id', legend=alt.Legend(title="Countries")),
opacity=alt.condition(leg_selection, alt.value(1), alt.value(0.2))
).add_selection(leg_selection).properties(title='Total number of cases per 100,000 population for selected European Countries', width=600).interactive()