COVID-19 Data Exploration (Unfinished)

Exploration and visualization of COVID-19 related data

Posted by Zekun on March 19, 2020

Note: This article is based on data collected by the project team of Professor Ziqi Chen, School of Mathematics and Statistics, Central South University (China). The data used for this analysis are not complete.

10-import

Load the necessary packages.

library(tidyverse)
library(ggplot2)
library(gganimate)
library(plotly)
library(readxl)
library(lubridate)
library(gifski)

Read data from Excel.

abroad_raw <- read_excel("data/【海外】疫情数据.xlsx")

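Reshape the wide table (one column per date) into a long table (one row per country and date), then convert the date column, which arrives as Excel serial day numbers stored as text, into proper Date values.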

data <- abroad_raw %>%
  select(-name) %>%
  pivot_longer(
    -country,
    names_to = "date",
    values_to = "number",
    values_drop_na = TRUE
  ) %>%
  mutate(date = as.integer(date)) %>%
  mutate(date = as.Date(date, origin = "1899-12-30"))
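To make the reshaping step easier to follow, here is a minimal sketch on a made-up table. The `toy_raw` data, its country names, and its values are hypothetical and are not taken from the real spreadsheet; the sketch only assumes, as the code above implies, that the date columns are named with Excel serial day numbers.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per country, one column per date.
# Column names are Excel serial day numbers (43900 is 2020-03-10).
toy_raw <- tibble(
  country = c("A", "B"),
  name    = c("a", "b"),
  `43900` = c(10, NA),
  `43901` = c(15, 3)
)

toy_long <- toy_raw %>%
  select(-name) %>%
  pivot_longer(
    -country,
    names_to = "date",
    values_to = "number",
    values_drop_na = TRUE        # drop country/date pairs with no count
  ) %>%
  # Excel (Windows) counts days from 1899-12-30
  mutate(date = as.Date(as.integer(date), origin = "1899-12-30"))

print(toy_long)
```

The result has one row per country and date with a non-missing count, which is the shape the line plot below expects.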

20-exploratory-data-analysis

line <- data %>%
  ggplot(aes(x = date, y = number, group = country)) +
  geom_line(aes(color = country), linewidth = 2, alpha = 0.5) +  # `linewidth` replaces `size` for lines in ggplot2 >= 3.4.0
  geom_point(aes(color = country), size = 3) +
  geom_text(
    aes(label = country),
    #color = "black",
    check_overlap = TRUE,
    hjust = 0,
    nudge_x = 0.3
  ) +
  ggdark::dark_theme_bw() +
  theme(plot.title = element_text(face = "bold", size = 20)) +
  guides(color = "none") +  # hide the legend; `FALSE` is deprecated here in recent ggplot2
  labs(
    title = 'COVID-19 non-China cases at {frame_along}',
    y = "Case number",
    x = "Date"
  ) +
  transition_reveal(date) +
  view_follow() +
  ease_aes(default = 'linear')

animate(
  line,
  width = 1000,
  height = 500,
  renderer = gifski_renderer(),
  duration = 20,
  rewind = FALSE,
  end_pause = 1
)
anim_save("output.gif")

Project GitHub page: COVID-19-EDA