LETTER TO EDITOR
Year: 2020 | Volume: 13 | Issue: 11 | Page: 521-524
ARIMA models forecasting the SARS-COV-2 in the Islamic Republic of Iran
Nayereh Esmaeilzadeh1, Mohammadtaghi Shakeri2, Mostafa Esmaeilzadeh3, Vahid Rahmanian4
1 Department of Epidemiology, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
2 Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
3 Department of Mechanical Engineering, Mashhad Branch, Azad University, Mashhad, Iran
4 Zoonoses Research Center, Jahrom University of Medical Sciences, Jahrom, Iran
Date of Submission: 03-May-2020
Date of Decision: 09-Jun-2020
Date of Acceptance: 15-Jun-2020
Date of Web Publication: 14-Aug-2020
Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad
Source of Support: None, Conflict of Interest: None
How to cite this article:
Esmaeilzadeh N, Shakeri M, Esmaeilzadeh M, Rahmanian V. ARIMA models forecasting the SARS-COV-2 in the Islamic Republic of Iran. Asian Pac J Trop Med 2020;13:521-4
How to cite this URL:
Esmaeilzadeh N, Shakeri M, Esmaeilzadeh M, Rahmanian V. ARIMA models forecasting the SARS-COV-2 in the Islamic Republic of Iran. Asian Pac J Trop Med [serial online] 2020 [cited 2020 Oct 22];13:521-4. Available from: https://www.apjtm.org/text.asp?2020/13/11/521/291407
As of 3rd May 2020, the COVID-19 epidemic had spread to more than 210 countries, with 3 272 202 confirmed cases and 230 104 deaths globally. Iran is a hotspot of COVID-19 in the Eastern Mediterranean region, with more than 93 000 confirmed cases and 5 957 deaths as of 30th April. Suppression strategies, especially case isolation and elective home quarantine, together with mitigation approaches such as the closure of educational centers and the lockdown of social activities to control transmission, have been applied to reduce the basic reproduction number to less than 1. These strategies have achieved varying degrees of success in different countries. In this study, autoregressive integrated moving average (ARIMA) models were developed to determine the temporal patterns of the epidemic and to make short-term predictions.
This approach is useful for forecasting and for evaluating control measures, as a number of studies have confirmed. The results of this study can help the government make informed decisions and set appropriate policies for interventions against this infectious disease.
Daily numbers of new laboratory-confirmed cases, recoveries, and deaths due to COVID-19 between 20th February 2020 and 30th April 2020 were extracted from the World Health Organization website.
First, we developed an ARIMA model for each series. An ARIMA model combines autoregression, differencing, and moving averages, and it can remove the confounding effect of time. The time series model ARIMA (p, d, q) therefore consists of several components: p is the order of the autoregressive part, d is the order of the integrated (differencing) part, and q is the order of the moving average part.
This linear combination is formulated as:
$$y_t = A_1 y_{t-1} + \dots + A_p y_{t-p} + \gamma_1 \varepsilon_{t-1} + \dots + \gamma_q \varepsilon_{t-q} + e_t \quad (1)$$
where $y_t$ is the dependent variable (daily cases of COVID-19), $A_p$ is the autoregressive operator coefficient, $\gamma_q$ is the moving average operator coefficient, $y_{t-p}$ is the number of COVID-19 cases p time steps earlier, $\varepsilon_{t-q}$ is the deviation in the number of COVID-19 cases q time steps earlier, and $e_t$ is a random error term with a white-noise distribution.
The model assumes stationary data, so we performed the Bartlett test and unit-root tests to assess stationarity of the variance and the mean, respectively, and transformed the data where needed. To estimate the number of autoregressive and moving average parameters, we examined correlograms of the autocorrelation and partial autocorrelation functions, from which candidate models were identified.
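The correlogram step can be illustrated with a minimal Python sketch. It is not the authors' Stata procedure: the function names `sample_acf`, `sample_pacf`, and `white_noise_band` are hypothetical, and the partial autocorrelations are obtained with the Durbin-Levinson recursion, which is one common choice among several.

```python
import math
from typing import List


def sample_acf(series: List[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    denom = sum(c * c for c in centred)
    acf = []
    for k in range(max_lag + 1):
        num = sum(centred[t] * centred[t - k] for t in range(k, n))
        acf.append(num / denom)
    return acf


def sample_pacf(acf: List[float]) -> List[float]:
    """Partial autocorrelations from an ACF via the Durbin-Levinson recursion."""
    pacf = [1.0]
    phi_prev: List[float] = []  # phi_prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(acf)):
        num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf


def white_noise_band(n_obs: int) -> float:
    """Half-width of the approximate 95% band for a white-noise correlogram."""
    return 1.96 / math.sqrt(n_obs)
```

For an autoregressive process of order 1, the partial autocorrelations drop to about zero after lag 1 while the autocorrelations decay gradually; this is the kind of pattern used to propose candidate orders p and q.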
In the next step, we evaluated the goodness-of-fit of each final model by checking for white-noise residuals with the Ljung-Box (Q) test, and the best-fitting model was selected as the one with the lowest Akaike Information Criterion (AIC).
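A minimal Python sketch of these two diagnostics follows. It is an illustration under stated assumptions, not the authors' Stata code: the function names are hypothetical, and the log-likelihood passed to `aic` would have to come from whatever software fitted the model.

```python
from typing import Dict, List


def ljung_box_q(residuals: List[float], max_lag: int) -> float:
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k), k = 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [r - mean for r in residuals]
    denom = sum(c * c for c in centred)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(centred[t] * centred[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_lowest_aic(aic_by_model: Dict[str, float]) -> str:
    """Return the label of the candidate model with the smallest AIC."""
    return min(aic_by_model, key=aic_by_model.get)
```

Q is compared with a chi-squared distribution; a small Q (large P-value) is consistent with white-noise residuals, and among the candidates that pass this check the one with the lowest AIC is retained.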
Then, the best ARIMA models were used to predict COVID-19 events, and forecasting precision was estimated by the root-mean-square error (RMSE), computed using the following formula:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(Y_t - \hat{Y}_t\right)^2} \quad (2)$$
where $Y_t$ is the observed number of events, $\hat{Y}_t$ is the forecast value at time t, and N is the number of observations. The statistical significance level was set at 0.05. Stata (ver. 14) was used for the statistical analysis.
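The RMSE computation can be sketched in a few lines of Python. The function name `rmse` and its argument names are chosen here for illustration only.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], forecast: Sequence[float]) -> float:
    """Root-mean-square error between observed and forecast values."""
    if len(observed) != len(forecast) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - f) ** 2 for o, f in zip(observed, forecast)]
    return math.sqrt(sum(squared_errors) / len(observed))
```

A smaller RMSE indicates forecasts that lie closer to the observed series; because the errors are squared, a few large misses weigh heavily on the result.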
The actual and predicted numbers for each COVID-19 series (new cases, recoveries, and deaths) over the 71 days from 20th February 2020 to 30th April 2020 are presented in [Figure 1]. These graphs also show the forecasts for the 14 days ahead, which are listed in [Table 1].
Figure 1: Autocorrelation and partial autocorrelation functions plots and daily observed numbers of series of COVID-19, fitted values (20th February to 30th April) and 1-step ahead predicted values (14 days ahead).
Table 1: The forecast values (95% CI) according to fitted models of COVID-19 for the period from 1st May to 14th May 2020.
After the stationarity tests, a square root transformation was applied to the new cases, death cases, and recovery cases; none of the series required regular differencing. All statistical methods were performed on the transformed data. Autocorrelation function and partial autocorrelation function plots were drawn for each COVID-19 series. In these charts, the grey zone displays the 95% confidence interval, and lags whose lines extend beyond this zone are considered significantly different from zero [Figure 1].
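The transformation step can be sketched in Python as follows. The text does not state how forecasts were returned to the original scale, so the back-transformation by squaring shown here is an assumption, and all function names are hypothetical.

```python
import math
from typing import List, Tuple


def sqrt_transform(counts: List[float]) -> List[float]:
    """Square-root transform of daily counts, used to stabilise the variance."""
    return [math.sqrt(c) for c in counts]


def back_transform(value: float) -> float:
    """Return a value on the square-root scale to the count scale (assumed: squaring)."""
    return value ** 2


def back_transform_interval(lower: float, upper: float) -> Tuple[float, float]:
    """Square both interval limits, clipping a negative lower limit at zero first."""
    clipped_lower = max(lower, 0.0)
    return clipped_lower ** 2, upper ** 2
```

Clipping the lower limit at zero before squaring keeps the back-transformed interval ordered, since counts cannot be negative.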
The candidate ARIMA models for the new cases of COVID-19 were ARIMA (1, 0, 0) and ARIMA (1, 0, 1), and those for the recovery cases were ARIMA (1, 0, 1) and ARIMA (2, 0, 1). Finally, ARIMA (1, 0, 1) and ARIMA (1, 0, 0) were considered for the death cases.
The goodness-of-fit of the models was evaluated using the Ljung-Box (Q) test and AIC. The ARIMA (1, 0, 0), ARIMA (1, 0, 1), and ARIMA (1, 0, 1) models were selected as the best models for the new confirmed cases, the death cases, and the recovery cases, respectively.
The goodness-of-fit results are presented in [Table 2] for the best-fitted models only. The residuals of the selected models showed that the data were adequately modelled.
Table 2: Characteristics of the best ARIMA fitted models series of COVID-19 from 20th February to 30th April.
We fitted the best model to each COVID-19 series between 20th February 2020 and 30th April 2020 and then forecasted each series for 14 days ahead ([Figure 1] and [Table 1]). Next, we compared the actual COVID-19 data with the predicted values. The predictions are approximately in line with the observed deaths and new confirmed cases, but the predictions for recovery cases are less precise, as shown in [Figure 1]. The fitted models are as follows:
The equation of daily new laboratory-confirmed cases is:
$$\sqrt{\text{New cases}_t} = 22.06 + 0.98\,\sqrt{\text{New cases}_{t-1}} \quad (3)$$
Eq. (3) indicates that a one-unit increase in the square root of the number of new cases on a given day leads to an increase of 0.98 units (98% of the increment) in the square root of the number of new cases one day ahead (P<0.001), and the Wald test is significant. After an exponential increase in the middle of the epidemic period, the trend reversed, as shown in [Figure 1]A. For the short term, we predict a declining trend in the occurrence of new cases.
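How an autoregressive coefficient of 0.98 produces a slowly declining forecast can be illustrated by iterating a first-order autoregressive forecast. The sketch below is not the authors' computation. It treats 22.06 as the long-run level of the square-root series (Stata's `arima` reports the constant of the mean equation; whether Eq. (3) follows that convention is an assumption), it starts from a hypothetical last observed value of 35.0 on the square-root scale, and the function name `ar1_forecast` is invented for illustration.

```python
from typing import List


def ar1_forecast(last_value: float, mean: float, phi: float, horizon: int) -> List[float]:
    """h-step-ahead AR(1) forecasts: mean + phi**h * (last_value - mean), h = 1..horizon."""
    return [mean + phi ** h * (last_value - mean) for h in range(1, horizon + 1)]


# Hypothetical starting point (35.0); mean and phi are taken from Eq. (3)
# under the assumption stated above.
sqrt_scale_forecasts = ar1_forecast(last_value=35.0, mean=22.06, phi=0.98, horizon=14)

# Squaring returns the forecasts to the count scale (an assumed back-transformation).
count_scale_forecasts = [value ** 2 for value in sqrt_scale_forecasts]
```

Because the coefficient is below 1, each forecast moves a fixed fraction of the remaining distance toward the long-run level, so a series that starts above that level declines gradually over the 14-day horizon.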
The equation of the recovery cases is:
$$\sqrt{\text{Recovery cases}_t} = 22.20 + 0.98\,\sqrt{\text{Recovery cases}_{t-1}} - 0.50\,\varepsilon_{t-1} \quad (4)$$
Eq. (4) shows that a one-unit rise in the square root of the number of recovery cases on a given day results in a significant increase of 0.98 units (98% of the increment) in the square root of the number of recovery cases one day ahead, together with a negative association with the deviation one day earlier (P<0.001). The Wald test is significant. This behaviour is shown in [Figure 1]B, where we expect to see a slightly decreasing trend over time.
The equation of death cases is as follows:
$$\sqrt{\text{Death cases}_t} = 6.12 + 0.98\,\sqrt{\text{Death cases}_{t-1}} - 0.30\,\varepsilon_{t-1} \quad (5)$$
Similarly, Eq. (5) shows that an increase in the square root of the number of death cases on a given day leads to an increase in the square root of the number of death cases one day ahead, together with a negative association with the deviation one day earlier (P<0.001). The Wald test is significant. [Figure 1]C shows that, after an exponential increase in the early stage of the epidemic period, the trend reversed. For the short term, we predict a smoothly declining trend in the occurrence of death cases.
The forecasting in this study was based on elementary time series methods. It is therefore affected by outliers and does not account for unknown noise. Consequently, the models perform better in the short term, and the findings should be interpreted with caution. However, these models are simple to apply and interpret, and they are a readily available tool for monitoring systems.
The government of the Islamic Republic of Iran recommended closing educational centers, locking down activities, and adopting other control measures from the earliest days of the outbreak, on 24th February.
It should be noted that Iranians celebrate their own new year, which began on 21st March 2020. They follow a solar calendar that differs from the Gregorian calendar. During the new year, people traditionally visit family and friends, which increases the number of contacts between people and can eventually increase the number of new cases and deaths through the spread of COVID-19. These patterns may be repeated during or after Ramadan (the holy month in Islam) because people gather to pray in mosques and holy shrines. Therefore, the government should consider preventive measures to control the spread of the virus under these conditions.
The predicted numbers of new confirmed, death, and recovery cases indicate a somewhat decreasing trend. The goodness-of-fit criteria were suitable for these series. However, confirmed cases can rise markedly unless the necessary preventive measures are kept in place.
In conclusion, the models proposed in this work can serve as a predictive tool for public health planning and for better understanding the dynamics of COVID-19 in a resource-constrained context with minimal data entry. Updating the data can be highly useful for accurate predictions.
Conflict of interest statement
The authors declare that there is no conflict of interest.
This study was conducted using existing COVID-19 data from the official website of the World Health Organization and did not impose additional costs. The authors would like to thank Mashhad University of Medical Sciences for its support (Identified No: IR.MUMS.REC.1399.140).
All authors contributed equally to conceptualizing the article, retrieving related literature, and drafting the final manuscript.
References
Eubank S, Eckstrand I, Lewis B, Venkatramanan S, Marathe M, Barrett C. Commentary on Ferguson, et al. Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. Bull Mathem Biol
Anderson RM, Heesterbeek H, Klinkenberg D, Hollingsworth TD. How will country-based mitigation measures influence the course of the COVID-19 epidemic? Lancet
Deb S, Majumdar M. A time series method to analyze incidence pattern and estimate reproduction number of COVID-19. [Online]. Available from: https://arxiv.org/abs/2003.10655. [Accessed on 2 May 2020].
Benvenuto D, Giovanetti M, Vassallo L, Angeletti S, Ciccozzi M. Application of the ARIMA model on the COVID-2019 epidemic dataset. Data Brief
Anne R. ARIMA modelling of predicting COVID-19 infections. medRxiv. 2020. doi: doi.org/10.1101/2020.04.18.20070631.
Ding G, Li X, Shen Y, Fan J. Brief analysis of the ARIMA model on the COVID-19 in Italy. medRxiv 2020. doi: doi.org/10.1101/2020.04.08.20058636.
Becketti S. Introduction to time series using Stata. College Station, TX: Stata Press; 2013.
Rahmanian V, Bokaie S, Rahmanian K, Hosseini S, Firouzeh AT. Analysis of temporal trends of human brucellosis between 2013 and 2018 in Yazd Province, Iran to predict future trends in incidence: A time-series study using ARIMA model. Asian Pac J Trop Med
Esmaeilzadeh N, Bahonar A, Foroushani AR, Nasehi M, Shakeri MT. Which type of univariate forecasting methods is appropriate for prediction of tuberculosis cases in Razavi Khorasan Province? A need for surveillance and biosurveillance systems. J Archiv Milit Med
[Table 1], [Table 2]