LETTER TO EDITOR
Year : 2020 | Volume : 13 | Issue : 11 | Page : 521-524
ARIMA models forecasting the SARS-COV-2 in the Islamic Republic of Iran
Nayereh Esmaeilzadeh^{1}, Mohammadtaghi Shakeri^{2}, Mostafa Esmaeilzadeh^{3}, Vahid Rahmanian^{4}
^{1} Department of Epidemiology, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
^{2} Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
^{3} Department of Mechanical Engineering, Mashhad Branch, Azad University, Mashhad, Iran
^{4} Zoonoses Research Center, Jahrom University of Medical Sciences, Jahrom, Iran
Correspondence Address:
Mohammadtaghi Shakeri, Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
How to cite this article:
Esmaeilzadeh N, Shakeri M, Esmaeilzadeh M, Rahmanian V. ARIMA models forecasting the SARS-COV-2 in the Islamic Republic of Iran. Asian Pac J Trop Med 2020;13:521-524

How to cite this URL:
Esmaeilzadeh N, Shakeri M, Esmaeilzadeh M, Rahmanian V. ARIMA models forecasting the SARS-COV-2 in the Islamic Republic of Iran. Asian Pac J Trop Med [serial online] 2020 [cited 2023 Feb 1];13:521-524
Available from: https://www.apjtm.org/text.asp?2020/13/11/521/291407 
Currently, the COVID-19 epidemic has spread to more than 210 countries, with 3 272 202 confirmed cases and 230 104 deaths globally as of 3rd May 2020. Iran is a hotspot of COVID-19 in the Eastern Mediterranean region, with more than 93 000 confirmed cases and 5 957 deaths until 30th April[1]. Suppression strategies, especially case isolation and elective home quarantine, and other mitigation approaches such as the closure of educational centers and transmission control through the lockdown of social activities are applied to reduce the basic reproduction number to less than 1[2]. These strategies have achieved varying degrees of success in different countries[3]. Autoregressive integrated moving average (ARIMA) models were developed to determine temporal patterns and to make short-term predictions[4].
This approach is useful for forecasting and for evaluating control measures, and a number of studies have confirmed it[5],[6],[7]. The results of this study can help the government make informed decisions and set proper policy on interventions for this infectious disease.
The daily numbers of new laboratory-confirmed cases, recoveries, and deaths due to COVID-19 between 20th February 2020 and 30th April 2020 were extracted from the World Health Organization website[1].
Firstly, we developed an ARIMA model for each series. This model includes single regression, multiple regression, and moving averages, and it can remove the confounding effect of time. The time series model ARIMA (p, d, q) therefore consists of several components: p is the order of the autoregressive part of the model, d is the order of the integrated (differencing) part, and q is the order of the moving average part[8].
This linear combination is formulated as:
[INLINE:1]
Where y is the dependent variable (daily cases of COVID-19), A_{p} is the autoregressive operator coefficient, γ_{q} is the moving average operator coefficient, y_{t-p} is the value of the cases of COVID-19 at an earlier time, ε_{t-q} is the deviation of the cases of COVID-19 at lag q, and e_{t} is a random error term with a white-noise distribution.
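For orientation, a common textbook way of writing an ARMA (p, q) model for a stationary (transformed and, if needed, differenced) series is sketched below. The symbols c, \phi_i and \theta_j are notation assumed here rather than taken from the original equation, and the sign convention for the moving average terms differs between texts, so this is not necessarily the exact form used by the authors.

```latex
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
```

With p = 1 and q = 0 this reduces to a first-order autoregression, and with p = q = 1 it adds one lagged error term.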
This model assumes stationary data, so we performed the Bartlett and unit-root tests to assess the stationarity of the variance and the mean of the data, and transformed the data where needed. To estimate the number of autoregressive and moving average parameters, we used correlograms of the autocorrelation function and the partial autocorrelation function, after which possible models were identified[9].
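As an illustration of how a correlogram is read, the sketch below computes sample autocorrelations for a square-root-transformed series and flags lags that exceed an approximate 95% limit. The data are hypothetical, the function name `sample_acf` is invented for this example, and the simple limit of 1.96/sqrt(N) is an approximation that may differ from the confidence band drawn by a given statistical package; the partial autocorrelation function is not computed here.

```python
import math

def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    acf = []
    for k in range(1, max_lag + 1):
        num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
        acf.append(num / denom)
    return acf

# Hypothetical daily counts (not study data), square-root transformed.
counts = [4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169]
z = [math.sqrt(c) for c in counts]

acf = sample_acf(z, max_lag=3)
band = 1.96 / math.sqrt(len(z))  # approximate 95% limit under white noise
significant_lags = [k + 1 for k, r in enumerate(acf) if abs(r) > band]
```

A slowly decaying autocorrelation with several significant lags, as in this trending toy series, is the kind of pattern that suggests an autoregressive component.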
In the next step, we evaluated the goodness-of-fit of the final model by checking for white-noise residuals with the Ljung-Box (Q) test, and the best-fitting model was selected based on the lowest value of the Akaike Information Criterion (AIC)[5].
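The two criteria named above can be sketched as follows, assuming the standard definitions Q = n(n+2) Σ r_k²/(n−k) and AIC = 2k − 2 ln L. The residuals and log-likelihood values are hypothetical, and the function names are invented for this example; the study itself used Stata.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag of the residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        num = sum((residuals[t] - mean) * (residuals[t - k] - mean) for t in range(k, n))
        r_k = num / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical residuals with strong lag-1 structure (clearly not white noise).
residuals = [1.0, -1.0] * 10
q_stat = ljung_box_q(residuals, max_lag=1)

# Hypothetical log-likelihoods for two candidate models.
candidates = {"ARIMA(1,0,0)": aic(-100.0, 2), "ARIMA(1,0,1)": aic(-99.5, 3)}
best_model = min(candidates, key=candidates.get)
```

A Q value above the chi-square critical value for the chosen number of lags (about 3.84 for one lag at the 0.05 level) indicates remaining autocorrelation, so a model is retained only when Q is not significant.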
Then, the best ARIMA models were applied to predict the COVID-19 events, and the forecasting precision was estimated by the root-mean-square error (RMSE). This is computed using the following formula:
[INLINE:2]
Where Y_{t} is the observed number of events, Ŷ_{t} is the forecast value at time t, and N is the number of events[10]. The statistical significance level was set at 0.05. Stata (ver. 14) was used for the statistical analysis.
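A minimal sketch of the RMSE computation, assuming the usual definition (the square root of the mean squared difference between observed and forecast values). The numbers are hypothetical and the function name `rmse` is chosen for this example only.

```python
import math

def rmse(observed, forecast):
    """Root-mean-square error between two equally long sequences."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(y - f) ** 2 for y, f in zip(observed, forecast)]
    return math.sqrt(sum(squared_errors) / len(observed))

# Hypothetical observed and forecast daily counts (not study data).
observed = [100, 120, 90]
forecast = [110, 115, 95]
error = rmse(observed, forecast)  # squared errors 100, 25, 25 -> sqrt(50)
```

Because RMSE is expressed in the units of the series, values for series on different scales (for example, new cases versus deaths) are not directly comparable.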
The trend of the actual and predicted number of cases for each COVID-19 series, including new cases, recovery cases, and death cases, for the 71 days from 20th February 2020 to 30th April 2020 is presented in [Figure 1]. These graphs also show the forecast numbers for the following 14 days, which are listed in [Table 1].{Figure 1}{Table 1}
After the stationarity tests, the square root transformation was applied to the new cases, death cases, and recovery cases, and none of the series needed regular differencing. All statistical methods were performed on the transformed data. Autocorrelation function and partial autocorrelation function plots were drawn for each series of COVID-19 cases. In these charts, the grey zone displays the 95% confidence interval, and lines that fall outside this range are considered significantly different from zero [Figure 1].
The potential ARIMA models for the new COVID-19 cases were ARIMA (1, 0, 0) and ARIMA (1, 0, 1), and for the recovery cases were ARIMA (1, 0, 1) and ARIMA (2, 0, 1). Finally, ARIMA (1, 0, 1) and ARIMA (1, 0, 0) were considered for the death cases.
The goodness-of-fit of the models was evaluated using the Ljung-Box (Q) test and the AIC. ARIMA (1, 0, 0), ARIMA (1, 0, 1), and ARIMA (1, 0, 1) were selected as the best ARIMA models for the new confirmed cases, the death cases, and the recovery cases, respectively.
The goodness-of-fit results are presented in [Table 2]. Note that these are reported only for the best-fitted models. The residuals of the selected models show that the data were completely modelled.{Table 2}
We fitted the best model for each series of COVID-19 between 20th February 2020 and 30th April 2020 and then forecast each series for 14 days ahead [Figure 1]A and [Table 1]. Next, we compared the actual data of COVID-19 events with the predicted cases. The predictions are approximately in line with the real death and new confirmed cases, but the predictions for the recovery cases are less precise than the others, as shown in [Figure 1]. The equations of the models are as follows:
The equation of daily new laboratory-confirmed cases is:
SQRT New cases_{t} = 22.06 + 0.98 SQRT New cases_{t-1} (3)
Eq. (3) indicates that a one-unit increase in the square root of the number of new cases on a given day is associated with an increase of 0.98 units (98%) in the square root of the number of new cases one day ahead (P<0.001), and the Wald test is significant. After an exponential increase in the middle of the epidemic period, the trend reversed, as shown in [Figure 1]A. For the short term, we have predicted a declining trend in the occurrence of new cases.
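To make the mechanics of such a model concrete, the sketch below iterates a first-order autoregression on the square-root scale and squares the result to return to counts. It rests on two assumptions that are not stated in the text: that the constant 22.06 is the long-run level of the transformed series, so the model is applied in mean-deviation form, and that squaring is an acceptable back-transformation (it ignores retransformation bias). The starting count of 1000 is hypothetical, so the output is illustrative and does not reproduce [Table 1].

```python
import math

def ar1_forecast_sqrt_scale(last_count, mu, phi, horizon):
    """Forecast counts from an AR(1) model fitted to square-root counts.

    Assumed mean-deviation form: z_t - mu = phi * (z_{t-1} - mu) + e_t,
    where z_t = sqrt(count_t). Future errors are set to their mean of zero.
    """
    z = math.sqrt(last_count)
    forecasts = []
    for _ in range(horizon):
        z = mu + phi * (z - mu)   # one step ahead on the square-root scale
        forecasts.append(z ** 2)  # naive back-transformation to counts
    return forecasts

# Coefficients from Eq. (3); last_count is a hypothetical starting value.
forecasts = ar1_forecast_sqrt_scale(last_count=1000, mu=22.06, phi=0.98, horizon=14)
```

Under these assumptions each forecast moves 2% of the remaining distance toward the long-run level per day, which yields a slowly declining path whenever the last observation lies above that level.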
The equation of the recovery cases is:
SQRT Recovery cases_{t} = 22.20 + 0.98 SQRT Recovery cases_{t-1} - 0.50 ε_{t-1} (4)
Eq. (4) shows that a one-unit increase in the square root of the number of recovery cases on a given day is associated with a significant increase of 0.98 units (98%) in the square root of the number of recovery cases one day ahead, and with a negative correlation with the deviation one time step earlier (P<0.001). The Wald test is significant. This behaviour is shown in [Figure 1]B, where we expect to see a somewhat decreasing trend over time.
The equation of death cases is as follows:
SQRT Death cases_{t} = 6.12 + 0.98 SQRT Death cases_{t-1} - 0.30 ε_{t-1} (5)
Similarly, Eq. (5) shows that an increase in the square root of the number of death cases on a given day is associated with an increase in the square root of the number of death cases one day ahead, and with a negative correlation with the deviation one time step earlier (P<0.001). The Wald test is significant. [Figure 1]C shows that after an exponential increase in the early stage of the epidemic period, the trend reversed. For the short term, we predict a smoothly declining trend in the occurrence of death cases.
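The role of the lagged error term in a model with one autoregressive and one moving average term can be sketched with a one-step forecast. As assumptions of this example, the constant is treated as the long-run level of the square-root series (mean-deviation form), the moving average coefficient is entered with a negative sign, and the previous count and residual are hypothetical values.

```python
import math

def arma11_one_step(prev_count, prev_residual, mu, phi, theta):
    """One-step forecast of a count from an ARMA(1,1) model on square roots.

    Assumed form: z_t = mu + phi * (z_{t-1} - mu) + theta * e_{t-1} + e_t,
    with z_t = sqrt(count_t) and the new error e_t set to zero.
    """
    z_prev = math.sqrt(prev_count)
    z_next = mu + phi * (z_prev - mu) + theta * prev_residual
    return z_next ** 2  # naive back-transformation to counts

# Coefficients as in the death-case equation; inputs are hypothetical.
with_overshoot = arma11_one_step(prev_count=81, prev_residual=0.5,
                                 mu=6.12, phi=0.98, theta=-0.30)
no_overshoot = arma11_one_step(prev_count=81, prev_residual=0.0,
                               mu=6.12, phi=0.98, theta=-0.30)
```

With a negative moving average coefficient, a day that came in above its forecast (positive residual) pulls the next forecast down, which is one way to read the negative correlation with the previous deviation.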
The forecasting in this study was based on basic time series methods. This means that it is affected by outlying data and does not account for unknown noise. Therefore, the models perform better in the short term, and the findings should be interpreted with caution[8],[9]. However, the application and interpretation of these models are simple, and they are readily available tools for monitoring systems[6],[7].
The government of the Islamic Republic of Iran advised the closure of educational centers, locked down activities, and adopted other control approaches from the earliest days of the outbreak, starting on 24th February.
It should be noted that Iranian people celebrated their new year starting on 21st March 2020. They follow a solar calendar, which differs from the Gregorian calendar. During the new year, people traditionally visit family and friends, which increases the number of contacts between people and can eventually increase the number of new cases and deaths through the spread of COVID-19. It is anticipated that these patterns may be repeated during or after Ramadan (the holy month in Islam) because people gather to pray in mosques and holy shrines. Therefore, the government should consider preventive measures to control the spread of the virus under these conditions.
The predicted numbers of new confirmed, death, and recovery cases indicated a somewhat decreasing trend. The goodness-of-fit criteria were suitable for these events. However, the confirmed cases can rise remarkably unless the necessary preventive measures are kept in place.
In conclusion, the models proposed in this work can act as a predictive tool for public health planning and for a better understanding of the dynamics of COVID-19 in a resource-constrained context with minimal data entry. Updating these data can be highly useful for accurate predictions.
Conflict of interest statement
The authors declare that there is no conflict of interest.
Acknowledgements
This study was conducted using existing COVID-19 data from the official website of the World Health Organization and did not impose additional costs. The authors would like to thank Mashhad University of Medical Sciences for its support (Identified No: IR.MUMS.REC.1399.140).
Authors’ contributions
All authors contributed equally to conceptualizing the article, retrieving related literature, and drafting the final manuscript.
References
1  WHO. Coronavirus 2020. [Online]. Available from: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/ [Accessed on 2 May 2020].
2  Eubank S, Eckstrand I, Lewis B, Venkatramanan S, Marathe M, Barrett C. Commentary on Ferguson, et al. Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. Bull Math Biol 2020; 82: 17.
3  Anderson RM, Heesterbeek H, Klinkenberg D, Hollingsworth TD. How will country-based mitigation measures influence the course of the COVID-19 epidemic? Lancet 2020; 395(10228): 931-934.
4  Deb S, Majumdar M. A time series method to analyze incidence pattern and estimate reproduction number of COVID-19. [Online]. Available from: https://arxiv.org/abs/2003.10655. [Accessed on 2 May 2020].
5  Benvenuto D, Giovanetti M, Vassallo L, Angeletti S, Ciccozzi M. Application of the ARIMA model on the COVID-2019 epidemic dataset. Data Brief 2020; 29: 105340.
6  Anne R. ARIMA modelling of predicting COVID-19 infections. medRxiv 2020. doi: 10.1101/2020.04.18.20070631.
7  Ding G, Li X, Shen Y, Fan J. Brief analysis of the ARIMA model on the COVID-19 in Italy. medRxiv 2020. doi: 10.1101/2020.04.08.20058636.
8  Becketti S. Introduction to time series using Stata. College Station, TX: Stata Press; 2013.
9  Rahmanian V, Bokaie S, Rahmanian K, Hosseini S, Firouzeh AT. Analysis of temporal trends of human brucellosis between 2013 and 2018 in Yazd Province, Iran to predict future trends in incidence: A time-series study using ARIMA model. Asian Pac J Trop Med 2020; 13(6): 272-277.
10  Esmaeilzadeh N, Bahonar A, Foroushani AR, Nasehi M, Shakeri MT. Which type of univariate forecasting methods is appropriate for prediction of tuberculosis cases in Razavi Khorasan Province? A need for surveillance and biosurveillance systems. J Archiv Milit Med 2019; 7(3): e96229. 
