ESTIMATES BY BOOTSTRAP INTERVAL FOR TIME SERIES FORECASTS OBTAINED BY THETA MODEL

In this work, an experimental computer program was developed in Matlab version 7.1 that implements the univariate time series forecasting method known as Theta, together with the computer-intensive resampling technique known as "bootstrap", in order to estimate confidence intervals for the point forecasts obtained by this method. To solve this problem, an algorithm was built that uses Monte Carlo simulation to obtain the interval estimates for the forecasts. The Theta model presented in this work was very efficient in the M3 Makridakis competition, in which 3003 series were tested. It is based on the concept of modifying the local curvature of the time series by means of a coefficient theta (Θ). In its simplest form, the time series is decomposed into two theta lines representing the long-term and short-term components. The forecast is made by combining the forecasts obtained from the lines fitted in the theta decomposition. The MAPE errors obtained for the estimates confirm the favorable results of the method in the M3 competition, making it a good alternative for time series forecasting.


INTRODUCTION
Forecasting models are of great importance in academia and in society because of their wide applicability in scientific, industrial, commercial and service areas. Companies use such predictions to estimate the demand for their products and, consequently, to plan production schedules, purchasing and other activities, including identifying when and where to focus marketing efforts (LIEBEL, 2004).
An interesting illustration is the construction of a hydroelectric plant, in which knowledge of the time series of the flow of the rivers that supply the dam is essential to the development of the project. Currently, industries need to plan in detail their production and the stocks kept at the disposal of operations. Thus, the application of time series is essential.
Time series forecasting techniques produce predictions from sequences of past values, in other words, from the observations of the series [Zt-1, Zt-2, Zt-3, ..., Z1].
The objectives of time series forecasting techniques are: • forecasting future values of the time series; • describing the behavior of the series (checking for trend and seasonality); • identifying the mechanism that generates the series (the stochastic process that generated it).
The modeling of the series is based on a single realization, which requires the underlying stochastic process to be ergodic. The predictions resulting from these techniques may rely only on the information contained in the historical series of interest (methods based on classical statistics) or, in addition to this information, may consider other supposedly relevant information that is not contained in the series analyzed (methods based on Bayesian statistics).
According to Boucher and Elsayed (1994), time series forecasting techniques can be divided into two categories: qualitative and quantitative techniques.
Quantitative techniques are the most widely used when a set of data is available, since various statistical methods can then be applied to extrapolate these data and obtain probable future values. Assimakopoulos and Nikolopoulos (2000) proposed a new univariate forecasting model called Theta, which is relatively simple to apply and presented one of the best performances in time series forecasting. It was one of the methods tested in the M3-Competition of Makridakis (2000). The model consists in decomposing the time series into two lines called theta lines. The two lines are extrapolated separately, by linear regression and by Simple Exponential Smoothing (SES), respectively, and the two forecasts are then combined with equal weights to obtain the Theta forecast.
However, one disadvantage is the lack of confidence intervals for the estimates in the work of Assimakopoulos and Nikolopoulos (2000). Confidence intervals are important because they give a reliable estimate of the size of the error that may be made in obtaining an estimate.
The purpose of this work is to apply the computer-intensive resampling technique known as "Bootstrap" to estimate confidence intervals for the point forecasts obtained by the Theta model. The Bootstrap method consists in resampling a set of data, either directly or through a fitted model, in order to create replicates of the data, from which the variability of the quantities of interest can be evaluated without analytical calculations.
Therefore, this technique is particularly useful when the estimators are complicated to calculate by analytical methods. Resampling offers alternative ways of finding standard deviations and confidence intervals by analyzing a set of data (DAVISON; HINKLEY, 1997).

Theta Model
According to Nikolopoulos et al. (2011), [...] unable to capture efficiently all available information hidden in a time series. The Theta model sparked interest in academia owing to its surprisingly good forecasting performance in the M3-Competition (MAKRIDAKIS et al., 2000). According to the analysis of Hyndman and Billah (2003), this model can be understood as being equivalent to simple exponential smoothing with "drift".

However, Nikolopoulos and Assimakopoulos (2005) disagree with this interpretation and claim that the Theta model is more general than simple exponential smoothing, because it is a decomposition approach to the data and can rely on extrapolation by any forecasting method.
The Theta model is based on modifying the local curvature of the seasonally adjusted time series by means of a coefficient theta (Θ). This coefficient is applied directly to the second differences of the series (ASSIMAKOPOULOS; NIKOLOPOULOS, 2008). The result is a series called a "Theta line", which maintains the mean and the slope of the original data, but not its curvature.
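The stated properties of a Theta line can be checked numerically. The sketch below assumes the commonly used closed form of a theta line, Θ·Y + (1 − Θ)·(OLS linear trend), which is not written out in the text above. The function name `theta_line` and the data values are purely illustrative.

```python
import numpy as np

def theta_line(y, theta):
    """Theta line in the assumed closed form: theta*y + (1 - theta)*linear trend."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)      # OLS linear trend of the data
    trend = intercept + slope * t
    return theta * y + (1.0 - theta) * trend

# Illustrative (made-up) observations
y = np.array([45.2, 44.8, 45.9, 46.3, 45.1, 46.8, 47.2, 46.5, 47.9, 48.4])
line2 = theta_line(y, 2.0)

# The second differences (local curvature) are multiplied by theta ...
d2_ratio = np.diff(line2, n=2) / np.diff(y, n=2)

# ... while the mean and the OLS slope of the data are preserved.
t = np.arange(len(y))
slope_y = np.polyfit(t, y, 1)[0]
slope_line2 = np.polyfit(t, line2, 1)[0]
```

Under this assumed form, Θ = 0 removes the curvature entirely (leaving the regression line) and Θ = 2 doubles it.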
The general formulation of the Theta model is based on the following steps: • Decomposition of the initial series into two or more theta lines; • Each theta line is extrapolated separately and the forecasts are simply combined with equal weights.
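The best formulation of the model, and the one tested in the M3-Competition, is the decomposition of the time series into two theta lines. In this case the series of observations is decomposed as follows:

$$Y_t = \frac{1}{2}\left[L(\Theta=0) + L(\Theta=2)\right]$$

where $L(\Theta=0)$ is the linear regression of the data and $L(\Theta=2)$ is given by the following expression:

$$L(\Theta=2) = 2Y_t - L(\Theta=0) \qquad (1)$$

$L(\Theta=0)$ describes the series as a linear trend. $L(\Theta=2)$ doubles the local curvatures, emphasizing the short-term behavior. For the extrapolation of $L(\Theta=2)$, simple exponential smoothing (SES) is used.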
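As a short check, and assuming formula (1) reads $L(\Theta=2) = 2Y_t - L(\Theta=0)$, averaging the two theta lines with equal weights recovers the original observations exactly:

```latex
\frac{1}{2}\left[L(\Theta=0) + L(\Theta=2)\right]
  = \frac{1}{2}\left[L(\Theta=0) + 2Y_t - L(\Theta=0)\right]
  = Y_t
```

This is why the equal-weight combination of the two extrapolated lines can be read as a forecast of the series itself.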

INDEPENDENT JOURNAL OF MANAGEMENT & PRODUCTION (IJM&P)
In practice the method can be easily implemented using an electronic spreadsheet such as EXCEL. Nikolopoulos and Assimakopoulos (2005) suggest the following steps for its implementation:
• Step 0: Seasonal decomposition of the data by the classical multiplicative method, if necessary;
• Step 1: Apply Linear Regression (LR) to the data and prepare the $L(\Theta=0)$ line and its forecasts;
• Step 2: Prepare the values of $L(\Theta=2)$ using formula (1);
• Step 3: Extrapolate $L(\Theta=2)$ with either SES (Simple Exponential Smoothing) or another smoothing method, such as moving averages;
• Step 4: Combine with equal weights the forecasts from SES and LR.
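Steps 1 to 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the series is taken to be already seasonally adjusted (Step 0 is skipped), the smoothing constant `alpha` is fixed by hand rather than optimized, and the function names `ses_level` and `theta_forecast` are hypothetical.

```python
import numpy as np

def ses_level(x, alpha):
    """Simple exponential smoothing; returns the final level, used as a flat forecast."""
    level = x[0]                                 # initialise with the first value
    for value in x[1:]:
        level = alpha * value + (1.0 - alpha) * level
    return level

def theta_forecast(y, h, alpha=0.5):
    """Two-line Theta forecast for horizons 1..h (alpha is an assumed, fixed value)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    t = np.arange(n)
    # Step 1: linear regression gives L(theta=0) and its extrapolation
    slope, intercept = np.polyfit(t, y, 1)
    line0 = intercept + slope * t
    line0_fc = intercept + slope * np.arange(n, n + h)
    # Step 2: formula (1), L(theta=2) = 2*Y - L(theta=0)
    line2 = 2.0 * y - line0
    # Step 3: extrapolate L(theta=2) by SES (flat forecast)
    line2_fc = np.full(h, ses_level(line2, alpha))
    # Step 4: combine the two forecasts with equal weights
    return 0.5 * (line0_fc + line2_fc)
```

A call such as `theta_forecast(series, h=6)` would return six point forecasts. In practice `alpha` would normally be chosen by minimizing an in-sample error measure.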
The Theta model is simple to use and requires no extensive training. According to the results of the competition of Makridakis (2000), the method obtained good predictions for monthly series, whether stationary or with trend or seasonality. Petropoulos & Nikolopoulos (2013) argue for the use of more theta lines, Θ ∈ {-1, 0, 1, 2, 3}, so as to extract even more information from the data. These lines can be extrapolated with other exponential smoothing methods, such as Holt's and Brown's exponential smoothing.

"Bootstrap" Method
"Bootstrap" is a computer-intensive method developed by Bradley Efron (1979) for estimating the variability of statistics. Generally speaking, "bootstrap" is a technique that aims at the point or confidence interval estimation of parameters of interest by resampling the original data. It should be used when the classical methods for this purpose are asymptotic, difficult to implement or simply do not exist for the statistic in question.

Forecasts Theta
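Given the time series $Y_t$, a linear regression model is fitted to the series by the method of ordinary least squares (OLS), obtaining the estimate $\hat{\beta}$ and the vector of fitted values $\hat{Y}_t$, which will be designated $L(\Theta=0)$. To obtain the other theta line, $L(\Theta=0)$ is substituted into the equation:

$$L(\Theta=2) = 2Y_t - L(\Theta=0)$$

$L(\Theta=2)$ is extrapolated by simple exponential smoothing (SES). The combination with equal weights at horizon $h$ gives the final forecast of the Theta model:

$$\hat{Y}_t(h) = \frac{1}{2}\left[\hat{L}_t(h \mid \Theta=0) + \hat{L}_t(h \mid \Theta=2)\right] \qquad (5)$$

where $\hat{L}_t(h \mid \Theta)$ denotes the extrapolation of the corresponding theta line $h$ periods ahead.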

Confidence Interval "Bootstrap"
In the present study the residuals are obtained from the fitted series $\hat{Y}_t$, which results from combining $L(\Theta=0)$ and $L(\Theta=2)$ with equal weights. Once the linear regression model has been fitted, $L(\Theta=0)$ is used to generate $L(\Theta=2)$ by the equation:

$$L(\Theta=2) = 2Y_t - L(\Theta=0)$$

To generate the fitted line $\hat{Y}_t$, $L(\Theta=0)$ and $L(\Theta=2)$ are combined with equal weights, after simple exponential smoothing (SES) has been applied to $L(\Theta=2)$. For the purposes of the "bootstrap", the residuals obtained from this combination are used, so it follows that:

$$\hat{\varepsilon}_t = Y_t - \hat{Y}_t$$

where:
$Y_t$: original series, generated by the simulator;
$\hat{Y}_t$: series estimated by Theta.

The steps of the computational implementation to obtain confidence intervals for the forecast $h$ periods ahead are:
1) Fit a regression model by ordinary least squares (OLS), obtaining the estimate $\hat{\beta}$ and the vector of responses $\hat{Y}_t$;
2) Compute the residuals, obtaining the vector of estimated residuals $\hat{\varepsilon}' = [\hat{\varepsilon}_1, \hat{\varepsilon}_2, \dots, \hat{\varepsilon}_n]$, which shall be considered the original sample for the purposes of the "bootstrap";
3) Select B random samples of size n from the residuals $\hat{\varepsilon}$ obtained in the previous steps, using resampling with replacement, with probability $1/n$ of selection for every residual;
4) Generate the pseudo-series with each "bootstrap" sample by the equation $Y^{*}_t = \hat{Y}_t + \hat{\varepsilon}^{*}_t$;
5) Fit the model again by ordinary least squares to each pseudo-series, obtaining the "bootstrap" estimate $\hat{Y}^{*}_t(h)$ of the forecast.

Figure 1 shows the steps of the algorithm as a flowchart.

To obtain a percentile confidence interval with a confidence level of 95% for $\hat{Y}_t(h)$, the "bootstrap" distribution of $\hat{Y}^{*}_t(h)$ is sorted in ascending order and the non-parametric estimator given by its 2.5% and 97.5% percentiles is used:

$$\left[\hat{Y}^{*}_{(0.025B)}(h),\ \hat{Y}^{*}_{(0.975B)}(h)\right]$$
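The residual "bootstrap" procedure described in this section can be sketched as follows. This is a minimal illustration, not the authors' Matlab program: the function names `theta_fit` and `bootstrap_interval`, the fixed smoothing constant `alpha`, the SES initialisation with the first value and the use of the one-step-ahead SES values as the in-sample fit are all assumptions made for the example.

```python
import numpy as np

def theta_fit(y, h, alpha=0.5):
    """Return the in-sample Theta fit and the h-step forecasts (two theta lines)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    t = np.arange(n)
    slope, intercept = np.polyfit(t, y, 1)       # OLS fit gives L(theta=0)
    line0 = intercept + slope * t
    line2 = 2.0 * y - line0                      # L(theta=2) = 2*Y - L(theta=0)
    smoothed = np.empty(n)                       # one-step-ahead SES values of line2
    level = line2[0]
    for i in range(n):
        smoothed[i] = level
        level = alpha * line2[i] + (1.0 - alpha) * level
    fitted = 0.5 * (line0 + smoothed)            # equal-weight combination in sample
    line0_fc = intercept + slope * np.arange(n, n + h)
    forecast = 0.5 * (line0_fc + level)          # equal-weight combination of forecasts
    return fitted, forecast

def bootstrap_interval(y, h, B=1000, conf=0.95, alpha=0.5, seed=0):
    """Percentile bootstrap interval for the Theta forecasts, resampling residuals."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    fitted, point = theta_fit(y, h, alpha)
    residuals = y - fitted                       # original sample for the bootstrap
    boot = np.empty((B, h))
    for b in range(B):
        # resample residuals with replacement, probability 1/n each
        resampled = rng.choice(residuals, size=len(y), replace=True)
        pseudo = fitted + resampled              # pseudo-series
        boot[b] = theta_fit(pseudo, h, alpha)[1] # refit and forecast again
    tail = 100.0 * (1.0 - conf) / 2.0
    lower, upper = np.percentile(boot, [tail, 100.0 - tail], axis=0)
    return point, lower, upper
```

A call such as `bootstrap_interval(series, h=6, B=1000)` would return the point forecasts with the lower and upper limits for each horizon. As in the procedure described, the interval reflects the variability of the fitted forecast across pseudo-series. It does not add a separate term for future noise.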

NUMERICAL RESULTS
Table 1, below, shows the series generated by the GST simulator, with 36 observations. The last six values of the series were held out for validation and performance testing. The Theta model is decomposed into two theta lines, L(Θ = 0) and L(Θ = 2), which are extrapolated by linear trend and simple exponential smoothing, respectively.
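A series of the kind used here (36 observations from an AR structure with constant term δ = 45 and noise variance 0.2, with the last six values held out) could be simulated along the following lines. The order of the AR model and its coefficient are not given in the text, so the AR(1) form and `phi = 0.5` below are purely illustrative assumptions. The function `simulate_ar1` is hypothetical and is not the GST program.

```python
import numpy as np

def simulate_ar1(n=36, delta=45.0, phi=0.5, noise_var=0.2, burn_in=100, seed=0):
    """Simulate Z_t = delta + phi * Z_{t-1} + a_t with Gaussian noise a_t."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(noise_var), size=n + burn_in)
    z = np.empty(n + burn_in)
    prev = delta / (1.0 - phi)                   # start at the process mean
    for i in range(n + burn_in):
        prev = delta + phi * prev + noise[i]
        z[i] = prev
    return z[burn_in:]                           # discard the burn-in period

series = simulate_ar1()
train, holdout = series[:30], series[30:]        # last six values kept for validation
```

With these assumed values the process mean is δ / (1 − φ) = 90. A different order or coefficient would change the level and the autocorrelation of the simulated series.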
Table 2 reports the forecasts of the Theta model and the evaluation of the quality of the prediction according to the MAPE, RMSE and MSE criteria. Table 3, below, shows the forecasts obtained by the Theta model and also the predictions obtained with the Statgraphics software, using traditional methods optimized for the lowest RMSE, together with their respective MAPE errors. Table 3 allows us to affirm that the Theta model in its simplest application, L(Θ = 0) and L(Θ = 2), using simple exponential smoothing (SES) for the extrapolation of L(Θ = 2), obtained the best performance, with a mean absolute percentage error of 0.2963% against 0.4095% achieved by the Box and Jenkins methodology. Analyzing Table 4, it can be noted that the "bootstrap" MSE of the predictions appears in ascending order, meaning that the confidence limits become wider at each forecast horizon, which makes the predictions less reliable as the forecast horizon increases.
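For reference, the three accuracy criteria mentioned above can be computed as in the sketch below, which uses their standard definitions. The function names are illustrative, and MAPE is expressed in percent.

```python
import numpy as np

def mse(actual, forecast):
    """Mean squared error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.mean((actual - forecast) ** 2)

def rmse(actual, forecast):
    """Root mean squared error."""
    return np.sqrt(mse(actual, forecast))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))
```

Applied to the six held-out values and the corresponding forecasts, these functions give the kind of summary figures reported in the tables.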
To obtain the "bootstrap" interval for the forecast $h$ periods ahead for the time series, the "bootstrap" method was applied with B = 1000 replications. Figure 2, below, shows the histogram of the "bootstrap" estimates of the first forecast, $\hat{Y}^{*}_t(1)$. The figure shows a high degree of symmetry, which suggests a Gaussian model for the values obtained.

According to Chaves Neto (1991), "bootstrap" is a computationally intensive non-parametric statistical technique that allows the variability of statistics to be evaluated from the data of a single existing sample. The basic idea of the "bootstrap" is to resample the set of observations of the original sample, directly or via a fitted model, in order to create replicates of the data, from which the variability of statistics can be evaluated without the use of analytical methods. Thus, given a random sample of size n, x' = [x1, x2, x3, ..., xn], NBS samples are drawn with replacement from the original sample, each resulting in a "bootstrap" sample denoted by x*. Calculating the statistic of interest T for each of the NBS "bootstrap" samples gives the set of "bootstrap" estimates $T^{*}_i$, i = 1, 2, ..., NBS. These values provide an approximation of the true sampling distribution of T. Resampling is based on the empirical distribution, in other words, a probability mass equal to 1/n is placed on each sample point. Thus the empirical distribution placed on the sample data is F = 1/n. The key point of the method is therefore sampling with replacement, which allows as many samples as desired to be drawn. The goal is to see how the statistics obtained from the resamples vary due to random sampling. In cases of parameter estimation in which the sampling distribution of the statistic (estimator) is unknown, the "bootstrap" is very helpful.

Hestemberg et al. (2003) stated that the original sample represents the population from which it was drawn. The resamples represent what would be obtained if many samples were taken from the original population. The "bootstrap" distribution of a statistic, based on many resamples, represents an approximation of the true sampling distribution of that statistic. In order to obtain reliable results, thousands of "bootstrap" samples should be taken from the original sample.

INDEPENDENT JOURNAL OF MANAGEMENT & PRODUCTION (IJM&P), http://www.ijmp.jor.br, v. 8, n. 1, January - March 2017, ISSN: 2236-269X, DOI: 10.14807/ijmp.v8i1.480

PROPOSED METHODOLOGY
The time series analyzed was obtained from a time series generator program (GST), developed experimentally in the Pascal language. Series with 36 observations were generated, using an AR model structure with constant term δ = 45 and noise variance set at $V(a_t) = \sigma_a^2 = 0.2$. The programs were developed in Matlab version 7.1 to achieve the objectives of this work.

The problem considered is one that uses linear models to estimate the sampling distribution of the statistic $\hat{\beta}$. The regression model, adapted to the prediction context for linear regression, is:

$$Y = X\hat{\beta} + \hat{\varepsilon}$$

where:
$X$: model matrix of order n × p;
$\hat{\beta}$: parameter vector with dimension p;
$\hat{\varepsilon}$: residual vector with dimension n.

the Theta model is a time series
"Bootstrap", as already mentioned, is a computationally intensive method that uses Monte Carlo simulation to estimate standard errors and confidence intervals.
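The basic Monte Carlo idea can be illustrated for an arbitrary statistic. The sketch below resamples a sample with replacement (probability mass 1/n on each point) and estimates the standard error and a 95% percentile interval. The function name `bootstrap_statistic`, the number of replications and the example data are illustrative assumptions.

```python
import numpy as np

def bootstrap_statistic(sample, statistic, n_boot=1000, seed=0):
    """Bootstrap estimates, standard error and 95% percentile interval of a statistic."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        # draw n points with replacement; each point has probability 1/n
        resample = rng.choice(sample, size=n, replace=True)
        estimates[i] = statistic(resample)
    std_error = estimates.std(ddof=1)            # bootstrap standard error
    interval = np.percentile(estimates, [2.5, 97.5])
    return estimates, std_error, interval

# Example: variability of the median of a small, made-up sample
x = np.arange(1.0, 21.0)
estimates, std_error, interval = bootstrap_statistic(x, np.median)
```

The same function works for any statistic that maps a sample to a number, such as `np.mean` or a regression coefficient computed inside a small wrapper.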

Table 1: Time series generated by the simulator

Table 2, below, shows the six series values stored for performance testing.

Table 2: Period, actual value, regression and smoothing lines, Theta forecasts, MAPE, RMSE and MSE

The RMSE for the series decomposed into two lines, extrapolated by SES and LR respectively, is equal to 0.1606. The mean performance according to the MAPE criterion is 0.2963%. Figure 1, below, shows the time series analyzed, the linear regression line, the exponential smoothing line and the forecasts of the Theta method.

Table 3: Period, Theta and Box & Jenkins forecasts, and their MAPEs