Importing and preparing the data

We will be looking at data from the following countries:

  • Italy
  • Austria
  • Germany
  • Belgium
  • France
  • United Kingdom
  • Portugal

We begin by importing the data and adding some new features so that we can compare countries with different population sizes. For example, we calculate 'confirmed cases per 100k population', 'deaths per 100k' and 'new cases', since these are not in the original dataset. This data is collected and preprocessed in this file.
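As a rough illustration of these derived features, the sketch below computes them with pandas. The helper name `add_per_capita_features` and the two-row input are made up for this example, and the formulas are inferred from the column names and values of the published CSV rather than taken from the preprocessing file itself.

```python
import pandas as pd


def add_per_capita_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add per-100k and daily-change columns (hypothetical helper, not the original preprocessing code)."""
    out = df.sort_values(["id", "date"]).copy()
    per_100k = 100_000 / out["population"]
    out["confirmed_per"] = out["confirmed"] * per_100k
    out["deaths_per"] = out["deaths"] * per_100k
    # Cumulative counts become daily counts by differencing within each country.
    out["new_cases"] = out.groupby("id")["confirmed"].diff()
    out["new_cases_per"] = out["new_cases"] * per_100k
    return out


# Two consecutive days for one country. The second row uses the Portugal
# figures for 2021-09-02 from the dataset; the first row is made up so that
# the daily difference in confirmed cases is 2830.
raw = pd.DataFrame({
    "id": ["Portugal", "Portugal"],
    "date": ["2021-09-01", "2021-09-02"],
    "confirmed": [1039492.0, 1042322.0],
    "deaths": [17753.0, 17766.0],
    "population": [10283822.0, 10283822.0],
})
features = add_per_capita_features(raw)
print(features[["date", "confirmed_per", "deaths_per", "new_cases", "new_cases_per"]])
```

The first row of each country has no previous day to difference against, so its `new_cases` value is left as NaN in this sketch.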

import altair as alt
import pandas as pd

x_small_url = "https://raw.githubusercontent.com/idjotherwise/nlp-otherwise/master/data_sets/european_covid.csv"

Here is a random sample of 5 rows from the dataset.

x_small = pd.read_csv(x_small_url)
x_small.sample(5)
                  id        date    vaccines  confirmed        tests  recovered    deaths  population  confirmed_per  deaths_per     ratio      tests_per  vaccines_per  new_cases  new_cases_per
1891        Portugal  2021-09-02  14891181.0  1042322.0   17153567.0   980599.0   17766.0  10283822.0   10135.550771  172.756782  1.704464  166801.477116      1.448020     2830.0      27.518952
4454  United Kingdom  2021-05-07  52433184.0  4429046.0  160342371.0        NaN  127740.0  66460344.0    6664.193613  192.204843  2.884143  241260.218274      0.788939     1885.0       2.836278
4708     Switzerland  2020-03-06         NaN      315.0          NaN        NaN       2.0   8513227.0       3.700125    0.023493  0.634921            NaN           NaN       73.0       0.857489
3641         Austria  2020-12-19         NaN   336093.0          NaN   297323.0    5628.0   8840521.0    3801.732952   63.661406  1.674537            NaN           NaN     1505.0      17.023884
 326           Italy  2021-01-15   1125736.0  2352423.0   28734210.0  1713030.0   81325.0  60421760.0    3893.337433  134.595550  3.457074   47556.062584      0.018631    16144.0      26.718851

Plotting the data

We will first look at the cumulative number of cases and deaths per 100k population in each country, before moving on to the two-week incidence rate, the case fatality rate and vaccinations.

In each of the charts below, you can click on a country in the legend to highlight its line and fade out the others.

Total cases per 100,000

leg_selection = alt.selection_multi(fields=['id'], bind='legend')

alt.Chart(x_small_url).mark_line().encode(
    x=alt.X("yearmonthdate(date):T", axis=alt.Axis(title='Date')),
    y=alt.Y("confirmed_per:Q", axis=alt.Axis(title='Confirmed per 100k')),
    tooltip=['id:N', 'confirmed_per:Q'],
    color=alt.Color('id:N', legend=alt.Legend(title="Countries")),
    opacity=alt.condition(leg_selection, alt.value(1), alt.value(0.2))
).transform_filter(alt.datum.confirmed_per>0).add_selection(leg_selection).properties(title='Total number of cases per 100,000 population for selected European Countries', width=600).interactive()

Total deaths per 100,000

alt.Chart(x_small_url).mark_line().encode(
    x=alt.X("yearmonthdate(date):T", axis=alt.Axis(title='Date')),
    y=alt.Y("deaths_per:Q", axis=alt.Axis(title='Deaths per 100k'), impute=alt.ImputeParams(value=50)),
    tooltip=["id:N", "deaths_per:Q", "yearmonthdate(date):T"],
    color=alt.Color('id:N', legend=alt.Legend(title="Countries")),
    opacity=alt.condition(leg_selection, alt.value(1), alt.value(0.2))
).transform_filter(alt.datum.deaths_per>0).add_selection(leg_selection).properties(title='Number of deaths per 100,000 population for selected European Countries', width=600).interactive()

Two week incidence rate

brush = alt.selection(type='interval', encodings=['x'])

# Sum new cases per 100k over a 14-row window (the current row plus the 13
# rows before it), computed separately for each country and ordered by date.
# groupby takes plain field names, so 'id' is used rather than 'id:N'.
base = alt.Chart(x_small_url).mark_line().transform_filter(alt.datum.new_cases_per>0).transform_window(
    rolling_sum='sum(new_cases_per)',
    frame=[-13, 0],
    sort=[alt.SortField('date')],
    groupby=['id']
).encode(
    x=alt.X("yearmonthdate(date):T",
            axis=alt.Axis(title='Date')
           ),
    y=alt.Y("rolling_sum:Q",
            axis=alt.Axis(title='Incidence rate')
           ),
    tooltip=['id:N', 'rolling_sum:Q'],
    color=alt.Color('id:N', legend=alt.Legend(title="Countries")),
    opacity=alt.condition(leg_selection, alt.value(1), alt.value(0.2))
).add_selection(leg_selection).properties(
    width=600,
    height=400,
    title='Number of new cases per 100,000 over two weeks for selected countries'
)

upper = base.encode(
    alt.X('yearmonthdate(date):T',axis=alt.Axis(title='Date'),
          scale=alt.Scale(domain=brush))
)

lower = base.properties(
    height=60
).add_selection(brush)

upper & lower
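The rolling sum behind this chart can also be cross-checked in pandas. The sketch below is one way to do it, assuming the data has the `id`, `date` and `new_cases_per` columns shown in the sample; the helper name `two_week_incidence`, the `incidence` column and the demo numbers are made up for illustration.

```python
import pandas as pd


def two_week_incidence(df: pd.DataFrame, window: int = 14) -> pd.DataFrame:
    """Rolling sum of new cases per 100k within each country (hypothetical helper)."""
    out = df.sort_values(["id", "date"]).copy()
    # Roll within each country so that one country's cases never leak into another's window.
    out["incidence"] = out.groupby("id")["new_cases_per"].transform(
        lambda s: s.rolling(window, min_periods=1).sum()
    )
    return out


# Made-up numbers for two fictional countries; window=2 keeps the arithmetic easy to follow.
demo = pd.DataFrame({
    "id": ["A", "B", "A", "B", "A"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02", "2021-01-03"]
    ),
    "new_cases_per": [1.0, 10.0, 2.0, 20.0, 3.0],
})
result = two_week_incidence(demo, window=2)
print(result)
```

With the real data, calling `two_week_incidence(x_small)` would use the default 14-row window. Days missing from the data are not filled in, so a window only spans exactly two weeks when every day is present.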

The ratio of deaths to confirmed cases gives an indication of the case fatality rate. It seems to be between 2% and 3%, assuming that the countries listed here are catching all positive cases. They probably aren't, so the true fatality rate among everyone infected is likely lower than this.
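As a small worked example, the ratio can be recomputed from the cumulative counts. The figures below are copied from three rows of the random sample shown earlier; the formula (deaths divided by confirmed cases, as a percentage) is an assumption inferred from those rows, not taken from the preprocessing code.

```python
import pandas as pd

# Cumulative figures for three country/date pairs from the sample rows.
rows = pd.DataFrame({
    "id": ["Portugal", "United Kingdom", "Italy"],
    "date": ["2021-09-02", "2021-05-07", "2021-01-15"],
    "confirmed": [1042322.0, 4429046.0, 2352423.0],
    "deaths": [17766.0, 127740.0, 81325.0],
})
# Case fatality rate as a percentage of confirmed cases.
rows["ratio"] = 100 * rows["deaths"] / rows["confirmed"]
print(rows)
```

Because undetected infections are missing from the denominator, this ratio overstates the risk of death per infection whenever testing misses cases.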

Case fatality rate

base = alt.Chart(x_small_url).mark_line().transform_filter(alt.datum.ratio>0).encode(
    x=alt.X("yearmonthdate(date):T", axis=alt.Axis(title='Date')),
    y=alt.Y("ratio:Q", axis=alt.Axis(title='Ratio of deaths per case')),
    tooltip='id:N',
    color=alt.Color('id:N', legend=alt.Legend(title="Countries")),
    opacity=alt.condition(leg_selection, alt.value(1), alt.value(0.2))
).add_selection(leg_selection).properties(title='The ratio of deaths to confirmed cases (case fatality rate)', width=600)

upper = base.encode(
    alt.X('yearmonthdate(date):T',axis=alt.Axis(title='Date'),
          scale=alt.Scale(domain=brush))
)

lower = base.properties(
    height=60
).add_selection(brush)

upper & lower

Vaccines

In the chart below we plot the number of vaccine doses given per person. A value of 1 means that the country has given the equivalent of one shot for each person in the country. Since not everyone in a country is eligible to get the vaccine, a ratio of 1 means that many people have received two jabs. Note also that some vaccines (J&J's Janssen vaccine, for example) only require one shot, so the goal is not necessarily to reach exactly two shots per person in the whole country.
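To make the scale concrete, the sketch below recomputes doses per person for three rows of the random sample shown earlier. It assumes `vaccines_per` is simply the cumulative `vaccines` count divided by `population`, which is consistent with those rows but is not confirmed by the preprocessing code.

```python
import pandas as pd

# Cumulative doses and population for three country/date pairs from the sample rows.
doses = pd.DataFrame({
    "id": ["Portugal", "United Kingdom", "Italy"],
    "date": ["2021-09-02", "2021-05-07", "2021-01-15"],
    "vaccines": [14891181.0, 52433184.0, 1125736.0],
    "population": [10283822.0, 66460344.0, 60421760.0],
})
# Doses per person: 1.0 would mean one dose for every resident on average.
doses["vaccines_per"] = doses["vaccines"] / doses["population"]
print(doses)
```

A value above 1, as in the Portugal row, can only be reached if a large share of the population has received more than one dose.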

alt.Chart(x_small_url).mark_line().transform_filter(alt.datum.vaccines_per>0).encode(
    x=alt.X("yearmonthdate(date):T", axis=alt.Axis(title='Date')),
    y=alt.Y("vaccines_per:Q", axis=alt.Axis(title='Number of vaccines given')),
    tooltip=['id:N', 'vaccines_per:Q'],
    color=alt.Color('id:N', legend=alt.Legend(title="Countries")),
    opacity=alt.condition(leg_selection, alt.value(1), alt.value(0.2))
).add_selection(leg_selection).properties(title='Number of vaccines given', width=600).interactive()