Covid 19
1 author:
Erika Diaz
Rowan University
All content following this page was uploaded by Erika Diaz on 12 January 2021.
Overview
○ Aggregation Terminologies
○ Worldwide
○ United States vs Other Countries
○ U.S. States
● Time Series Forecast
○ Algorithms
○ Predicting USA Cases
○ Predicting NJ Cases and Deaths
● Conclusion
Introduction
● Ever heard of smart terms such as “quantum physics” being used as a common
phrase in movies when talking about something highly intelligent or even
impossible?
● Examples: quantum physics in Transformers, or eigenvalues and an inverted Möbius strip used to
create a time machine in Avengers: Endgame.
How?
So 29791 x 328.2 ≈ 9,777,462.65
So 763.4054 x 209.5 ≈ 159,933.473
Remember, R is the average number of people each person with a disease goes on to infect.
It is not the number of infected people walking around.
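To make the definition of R concrete, here is a minimal Python sketch. It assumes a constant R and cleanly separated generations of infection, which are simplifications and not part of the original analysis; the function name and the input numbers are hypothetical.

```python
def new_infections_by_generation(initial_cases, r, generations):
    """Expected new infections per generation if each case infects r others on average."""
    counts = [float(initial_cases)]
    for _ in range(generations):
        # Every case in the current generation infects r people on average.
        counts.append(counts[-1] * r)
    return counts

# Hypothetical inputs: 100 starting cases, followed for 3 generations.
growing = new_infections_by_generation(100, 2.0, 3)    # R > 1: the outbreak grows
shrinking = new_infections_by_generation(100, 0.5, 3)  # R < 1: the outbreak shrinks
print(growing)
print(shrinking)
```

The point of the sketch is that R describes the rate of spread per infected person, so the same starting count can grow or shrink depending on R.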
Stringency Index per Country
This is the strictest a country has ever been at one point.
● So far we’ve seen that COVID data is not stationary, but we’ll do our best to make
predictions.
● We will use a few algorithms to predict cases in the U.S. only.
● The best algorithm will be used to predict cases and deaths in New Jersey
Time Series Algorithms
● ARIMA Models
○ Auto Regressive (AR)
○ Moving Average (MA)
○ Autoregressive Integrated Moving Average (ARIMA)
○ Seasonal Autoregressive Integrated Moving Average (SARIMA)
● Holt Models
○ Holt Linear Model
○ Holt-Winters
● Machine Learning Models
○ Linear Regression
○ Polynomial Regression
○ Support Vector Machine
ARIMA Models
● Auto Regressive (AR) Integrated (I) Moving Average (MA)
● AR Model
○ uses observations from previous time steps as input to predict the value at the next step
● MA Model
○ models the next observation as the series mean plus a weighted sum of past forecast errors (not a plain average of past observations)
● Auto ARIMA
○ ARIMA combines the previous two models and adds differencing (the “I”); auto_arima searches for suitable
orders, including the number of non-seasonal differences needed for stationarity.
○ Differencing makes non-stationary data stationary by removing trends
● SARIMA
○ Extension of ARIMA that adds seasonal terms
○ Adjusts a non-stationary time series by removing both trend and seasonality.
● In Python, ‘from pmdarima import auto_arima’
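The two ideas above, differencing to remove a trend and an AR model that predicts from the previous step, can be sketched in plain Python. This is a hand-rolled illustration, not the pmdarima `auto_arima` call; the function names and the toy series are assumptions, and the AR(1) fit has no intercept for simplicity.

```python
def difference(series):
    """First-order differencing: the 'I' step that removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] (no intercept)."""
    prev, curr = series[:-1], series[1:]
    num = sum(p * c for p, c in zip(prev, curr))
    den = sum(p * p for p in prev)
    return num / den

# Hypothetical series with a steady upward trend (not real COVID data).
trending = [10, 13, 16, 19, 22]
differenced = difference(trending)  # constant differences: the trend is gone
print(differenced)

# Hypothetical series where each value is half the previous one.
decaying = [16.0, 8.0, 4.0, 2.0, 1.0]
phi = fit_ar1(decaying)
next_value = phi * decaying[-1]  # one-step-ahead AR(1) forecast
print(phi, next_value)
```

A full ARIMA fit also estimates the moving-average terms and chooses the orders, which is the part a library routine would automate.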
Holt Models
● Exponential Smoothing
○ weighted averages of past observations, with the weights decaying exponentially as the
observations get older.
○ In other words, the more recent the observation, the higher the associated weight.
● Holt Linear Model
○ Builds upon simple exponential smoothing (SES), which is a method suitable for forecasting with
no clear trend or seasonal pattern.
○ Extends SES by allowing the forecasting of data with a trend
● Holt-Winters Model
○ Extends the linear model by capturing seasonality
● In Python, ‘from statsmodels.tsa.api import Holt, SimpleExpSmoothing,
ExponentialSmoothing’
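As a rough illustration of the smoothing recursions described above, the sketch below implements simple exponential smoothing and Holt's linear method in plain Python. It is not the statsmodels implementation: the initialization (first value as the level, first difference as the trend), the fixed `alpha` and `beta` values, and the toy series are all assumptions made for the example.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed level after each observation."""
    level = series[0]
    levels = [level]
    for y in series[1:]:
        # Recent observations get weight alpha; older ones decay by (1 - alpha).
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels

def holt_linear_forecast(series, alpha, beta, horizon):
    """Holt's linear method: smooth a level and a trend, then extrapolate."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

# Hypothetical data, not the COVID series.
levels = simple_exp_smoothing([10.0, 20.0, 30.0], alpha=0.5)
forecast = holt_linear_forecast([10.0, 12.0, 14.0, 16.0], alpha=0.5, beta=0.5, horizon=2)
print(levels)
print(forecast)
```

On the trending toy series, simple exponential smoothing lags behind the data, while Holt's method continues the line. Holt-Winters would add a third smoothed component for seasonality.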
Machine Learning Models
● Regression is a predictive modelling technique that investigates the
relationship between a dependent variable and one or more independent variables
● Linear Regression
○ Uses a linear relationship to predict the average value of Y for a given value of X using a straight
line (the regression line).
● Polynomial Regression
○ Fits a polynomial curve to data that is correlated but does not look linear
○ Reduces errors that would otherwise be produced by a linear regression line
● In Python, both are available in scikit-learn
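To show why a polynomial fit can reduce the error of a straight line on curved data, here is a small self-contained sketch that solves the least-squares normal equations directly. It does not use scikit-learn; the helper names and the toy data are assumptions for illustration only.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] for y = c0 + c1*x + ..."""
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(a, b)

def sum_squared_error(coeffs, xs, ys):
    """Total squared gap between the fitted curve and the data."""
    predict = lambda x: sum(c * x ** i for i, c in enumerate(coeffs))
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical curved data (y = x^2 + 1), not the COVID series.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 + 1 for x in xs]
line = fit_polynomial(xs, ys, degree=1)   # ordinary linear regression
curve = fit_polynomial(xs, ys, degree=2)  # polynomial regression
line_error = sum_squared_error(line, xs, ys)
curve_error = sum_squared_error(curve, xs, ys)
print(line_error, curve_error)
```

A degree-1 fit is ordinary linear regression, so the same routine covers both models; raising the degree too far on noisy data risks overfitting.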
Machine Learning Models (continued)
● Support Vector Machine Model Regressor
○ SVM: tries to find a line (or hyperplane) that separates classes. Then it classifies a new point
depending on whether it lies on the positive or negative side of the hyperplane
○ SVR applies the same principle to regression instead of classification
○ Acknowledges the presence of non-linearity in the data
○ Linear regression uses a regression line; SVR uses a hyperplane
○ Support vectors: data points on either side of the hyperplane that are closest to the hyperplane
■ used to plot the boundary line
○ SVR tries to fit the best line within a threshold value (epsilon), the distance between the hyperplane and
the boundary line
■ Unlike ordinary regression models, it does not penalize every error between the real and predicted
values, only errors that fall outside the threshold
● In Python, SVR is in scikit-learn (sklearn.svm.SVR)
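The threshold idea can be illustrated with the epsilon-insensitive loss that SVR is usually described with, compared against the squared error of ordinary regression. This sketch only evaluates the two losses on made-up numbers; it does not train a model, and the function names and values are assumptions.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon):
    """SVR-style loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(a - p) - epsilon) for a, p in zip(actual, predicted)]

def squared_loss(actual, predicted):
    """Ordinary regression loss: every miss is penalized."""
    return [(a - p) ** 2 for a, p in zip(actual, predicted)]

# Hypothetical values, not results from the COVID models.
actual = [10.0, 12.0, 15.0]
predicted = [10.5, 12.0, 18.0]
tube = epsilon_insensitive_loss(actual, predicted, epsilon=1.0)
squares = squared_loss(actual, predicted)
print(tube)
print(squares)
```

The first prediction misses by 0.5, which squared loss penalizes but the epsilon tube ignores; only the third point, which lies outside the tube, contributes to the SVR-style loss.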
Forecasting U.S. Data
Before we proceed...
U.S. state data on cases and deaths were aggregated into weekly averages (the mean of each week).
These weekly averages were used for the time series predictions for both the U.S. and NJ.
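A minimal sketch of this kind of weekly aggregation is shown below. It assumes the input is a plain list of daily counts in date order and that an incomplete final week is dropped; both are assumptions for the example, not details confirmed by the original analysis.

```python
from statistics import mean

def weekly_averages(daily_values):
    """Mean of each consecutive 7-day block; a trailing partial week is dropped."""
    full_weeks = len(daily_values) // 7
    return [mean(daily_values[i * 7:(i + 1) * 7]) for i in range(full_weeks)]

# Hypothetical daily counts: two full weeks plus one leftover day.
daily = [7] * 7 + [14] * 7 + [99]
weekly = weekly_averages(daily)
print(weekly)
```

Averaging by week smooths out day-of-week reporting effects, which is one reason weekly values are often easier to forecast than raw daily counts.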