BOX & JENKINS MODEL IDENTIFICATION: A COMPARISON OF METHODOLOGIES

 

Maria Augusta Soares Machado

IBMEC/RJ - Brazil

E-mail: mmachado@ibmecrj.br

 

Reinaldo Castro Souza

Pontifícia Universidade Católica do Rio de Janeiro (PUC/RJ) - Brazil

E-mail: reinaldo@ele.puc-rio.br

 

Ricardo Tanscheit

Pontifícia Universidade Católica do Rio de Janeiro (PUC/RJ) - Brazil

E-mail: ricardo@ele.puc-rio.br

 

Submission: 19/10/2012

Accepted: 10/11/2012

 

 

ABSTRACT

 

This paper compares a neuro-fuzzy back-propagation network with the Forecast automatic model identification for the automatic identification of non-seasonal Box & Jenkins models.

Combinations of neural networks and fuzzy logic technologies have recently been used to deal with uncertain and subjective problems. On the basis of the results obtained, it is concluded that this type of approach is a powerful one.


Key-words: Neuro-Fuzzy Networks, Box & Jenkins Methodology, Fuzzy Logic

1       Introduction

Artificial neural network applications have shown that this technology has significant capabilities in pattern recognition. Feed-forward back-propagation artificial neural networks, used together with fuzzy modeling that tries to extract the model directly from the expert's knowledge, seem to offer a good approach to the problems inherent in Box & Jenkins ARIMA model identification.

The literature on time series forecasting clearly indicates that the Box & Jenkins approach, when properly applied, yields forecasts that are superior to those resulting from other standard time series forecasting procedures. As a result, the method has received much attention. However, the literature also indicates some reluctance to use this method in practice, owing to the difficulties associated with model identification. Vandaele (1983) states that "identification is the key to time series model building". The task of the forecaster is to use basic model identification tools.

2       Application

The algorithm used to identify non-seasonal Box & Jenkins patterns was implemented in seven steps:

Step 1 - Generation of 400 random AR(1), MA(1), AR(2), MA(2) and ARMA(1,1) time series with 700 observations.

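AR(1) model:

$z_t = \phi_1 z_{t-1} + a_t, \quad t = 1, \ldots, 700$

where $\phi_1$ is the model parameter, $\phi_1 \sim \mathrm{Uniform}(-1, 1)$, and $a_t \sim \mathrm{Normal}(0, 1)$.

MA(1) model:

$z_t = a_t - \theta_1 a_{t-1}, \quad t = 1, \ldots, 700$

where $\theta_1$ is the model parameter, $\theta_1 \sim \mathrm{Uniform}(-1, 1)$, and $a_t \sim \mathrm{Normal}(0, 1)$.

AR(2) model:

$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + a_t, \quad t = 1, \ldots, 700$

where $\phi_1, \phi_2$ are the model parameters, $\phi_1, \phi_2 \sim \mathrm{Uniform}(-2, 2)$, and $a_t \sim \mathrm{Normal}(0, 1)$.

MA(2) model:

$z_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2}, \quad t = 1, \ldots, 700$

where $\theta_1, \theta_2$ are the model parameters, $\theta_1, \theta_2 \sim \mathrm{Uniform}(-2, 2)$, and $a_t \sim \mathrm{Normal}(0, 1)$.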

ARMA(1,1) model:

$z_t = \phi_1 z_{t-1} + a_t - \theta_1 a_{t-1}, \quad t = 1, \ldots, 700$

where $\phi_1, \theta_1$ are the model parameters, $\phi_1, \theta_1 \sim \mathrm{Uniform}(-2, 2)$, and $a_t \sim \mathrm{Normal}(0, 1)$.
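The generation step can be sketched in Python with NumPy as follows. This is a minimal illustration, not the original implementation: the names `MODEL_SPECS`, `is_stationary`, `draw_parameters` and `simulate` are hypothetical, and both the burn-in period and the rejection of non-stationary autoregressive draws are assumptions that the text does not state.

```python
import numpy as np

# (p, q, bound): AR order, MA order, half-width of the Uniform(-bound, bound) range
MODEL_SPECS = {
    "AR(1)": (1, 0, 1.0),
    "MA(1)": (0, 1, 1.0),
    "AR(2)": (2, 0, 2.0),
    "MA(2)": (0, 2, 2.0),
    "ARMA(1,1)": (1, 1, 2.0),
}


def is_stationary(phi):
    """True if all roots of z^p - phi_1 z^(p-1) - ... - phi_p lie inside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    roots = np.roots(np.concatenate(([1.0], -phi)))
    return bool(np.all(np.abs(roots) < 1.0))


def draw_parameters(model, rng):
    """Draw (phi, theta) from the uniform ranges of Step 1.

    Assumption: non-stationary AR draws are rejected and redrawn.
    """
    p, q, bound = MODEL_SPECS[model]
    while True:
        phi = rng.uniform(-bound, bound, size=p)
        theta = rng.uniform(-bound, bound, size=q)
        if is_stationary(phi):
            return phi, theta


def simulate(phi, theta, n=700, burn_in=100, rng=None):
    """Simulate z_t = sum_i phi_i z_{t-i} + a_t - sum_j theta_j a_{t-j}, a_t ~ Normal(0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    p, q = len(phi), len(theta)
    start = max(p, q)
    total = start + burn_in + n
    a = rng.standard_normal(total)
    z = np.zeros(total)
    for t in range(start, total):
        ar_part = np.dot(phi, z[t - p:t][::-1])    # phi_1 z_{t-1} + ... + phi_p z_{t-p}
        ma_part = np.dot(theta, a[t - q:t][::-1])  # theta_1 a_{t-1} + ... + theta_q a_{t-q}
        z[t] = ar_part + a[t] - ma_part
    return z[-n:]  # discard the start-up and burn-in values (burn-in is an assumption)


rng = np.random.default_rng(0)
phi, theta = draw_parameters("ARMA(1,1)", rng)
series = simulate(phi, theta, n=700, rng=rng)
```

Without some stationarity check, draws such as $\phi_1 = 1.5$ would produce explosive series over 700 observations, which is why the sketch includes one.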

Step 2 - Estimation of the ACF and PACF for the first 10 lags of each model; these are the neuro-fuzzy inputs. For the estimated ACF of model $j$ ($j = 1, \ldots, 400$):

$\rho_1(j), \rho_2(j), \ldots, \rho_{10}(j)$

where $\rho_k(j)$ is the ACF value of model $j$ for lag $k$, $k = 1, \ldots, 10$.

For the estimated PACF of model $j$ ($j = 1, \ldots, 400$):

$\phi_{11}(j), \phi_{22}(j), \ldots, \phi_{10,10}(j)$

where $\phi_{kk}(j)$ is the PACF value of model $j$ for lag $k$, $k = 1, \ldots, 10$.

Step 3 - Determination of the pairs

$(\rho_k(j), \phi_{kk}(j)), \quad j = 1, \ldots, 400; \; k = 1, \ldots, 10$

as the neuro-fuzzy network inputs.
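The estimation of the ACF and PACF and the formation of the input pairs can be illustrated with the short NumPy sketch below. The function names are hypothetical, and the choice of estimators (the usual sample autocorrelation and the Durbin-Levinson recursion for the PACF) is an assumption, since the text does not say which estimators were used.

```python
import numpy as np


def sample_acf(z, nlags=10):
    """Sample autocorrelations for lags 1..nlags."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()
    c0 = np.dot(z, z)
    return np.array([np.dot(z[:-k], z[k:]) / c0 for k in range(1, nlags + 1)])


def sample_pacf(rho):
    """Partial autocorrelations phi_kk from autocorrelations rho_1..rho_K (Durbin-Levinson)."""
    rho = np.asarray(rho, dtype=float)
    n_lags = len(rho)
    pacf = np.zeros(n_lags)
    phi_prev = np.zeros(0)  # coefficients phi_{k-1,1..k-1} of the previous order
    for k in range(1, n_lags + 1):
        if k == 1:
            phi_kk = rho[0]
        else:
            num = rho[k - 1] - np.dot(phi_prev, rho[k - 2::-1])
            den = 1.0 - np.dot(phi_prev, rho[:k - 1])
            phi_kk = num / den
        phi_new = np.empty(k)
        phi_new[:k - 1] = phi_prev - phi_kk * phi_prev[::-1]
        phi_new[k - 1] = phi_kk
        pacf[k - 1] = phi_kk
        phi_prev = phi_new
    return pacf


def acf_pacf_pairs(z, nlags=10):
    """Return an (nlags, 2) array whose row k-1 holds (ACF at lag k, PACF at lag k)."""
    rho = sample_acf(z, nlags)
    return np.column_stack([rho, sample_pacf(rho)])


rng = np.random.default_rng(1)
z = rng.standard_normal(700)  # placeholder series; any simulated series can be used
pairs = acf_pacf_pairs(z)
```

For a theoretical AR(1) autocorrelation sequence $\rho_k = 0.6^k$, `sample_pacf` returns 0.6 at lag 1 and zero at all higher lags, which is the cut-off behaviour that identification relies on.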

Step 4 - Determination of the neuro-fuzzy network outputs.

The neuro-fuzzy network "black box" is shown next:

[Figure: neuro-fuzzy network "black box"]

where $a_k(j)$ is the neuro-fuzzy output of model $j$ for lag $k$, $k = 1, \ldots, 10$.

Step 5 - Determination of a pattern for each structure. The pattern of each structure is:

$\bar{a}_1, \bar{a}_2, \ldots, \bar{a}_{10}$

where $\bar{a}_k$ is the mean of the neuro-fuzzy network outputs for lag $k$, $k = 1, \ldots, 10$.
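As a small illustration of this step, the pattern of a structure can be computed as the column-wise mean of the network outputs. The sketch assumes the outputs of one structure are stored in an array with one row per simulated series and one column per lag; `structure_pattern` and the random `example_outputs` are hypothetical and are not data from the study.

```python
import numpy as np


def structure_pattern(outputs):
    """Mean network output per lag; `outputs` has shape (n_series, n_lags)."""
    outputs = np.asarray(outputs, dtype=float)
    if outputs.ndim != 2:
        raise ValueError("outputs must be a 2-D array (series x lags)")
    return outputs.mean(axis=0)


# Hypothetical outputs for 400 series and 10 lags (random placeholders, not study data)
rng = np.random.default_rng(2)
example_outputs = rng.uniform(0.0, 1.0, size=(400, 10))
pattern = structure_pattern(example_outputs)
```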

Step 6 - Determination of weighted Euclidean distances using exponential smoothing for lag $j$, where:

$\beta = 0.7$ for AR(1); $\beta = 0.5$ for MA(1); $\beta = 0.2$ for AR(2); $\beta = 0.4$ for MA(2); $\beta = 0.4$ for ARMA(1,1).

These values were determined on the basis of a detailed analysis of the network outputs.
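The text does not show the distance formula explicitly. One possible form, given here purely as an assumption, weights the squared deviation at each lag by the usual exponential smoothing weights. In this sketch $a_k$ is the network output for lag $k$ of the series under study, $p_k^{(m)}$ is the pattern value of structure $m$ for lag $k$, and $\beta_m$ is the constant listed in Step 6 for structure $m$; the symbols $d_m$, $w_k^{(m)}$ and $p_k^{(m)}$ are introduced only for this illustration.

```latex
w_k^{(m)} = \beta_m \,(1 - \beta_m)^{k-1}, \qquad
d_m = \sqrt{\sum_{k=1}^{10} w_k^{(m)} \left( a_k - p_k^{(m)} \right)^{2}}
```

Under this form, a larger $\beta_m$ concentrates the weight on the first lags, so that the distance to the AR(1) pattern ($\beta = 0.7$) would be dominated by lags 1 and 2.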

Step 7 - The structure with the minimum weighted Euclidean distance is indicated as the best model to fit the time series being studied.

AR(1)  pattern: [0.0191 0.1540 0.0397 0.1358 0.1194 0.1256 0.1220 0.1104 0.1141 0.1042]

MA(1)  pattern: [0.4362 0.4443 0.4571 0.4303 0.4517 0.4458 0.4377 0.4492 0.4588 0.4440]

AR(2)  pattern: [0.0353 0.0819 0.0749 0.0300 0.0270 0.0301 0.0260 0.0206 0.0256 0.0216]

MA(2)  pattern: [0.2840 0.3114 0.3160 0.3157 0.3159 0.3042 0.3015 0.2877 0.3062 0.2947]

ARMA(1,1)  pattern: [0.1196 0.3775 0.2944 0.3237 0.3394 0.3306 0.3148 0.3262 0.3243 0.3173]
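The selection in Step 7 can be sketched in Python using the pattern vectors and the constants of Step 6 as listed. The weighting scheme inside `weighted_distance` (weights $\beta(1-\beta)^{k-1}$ on the squared deviations) is an assumption, because the distance formula is not given in the text; the function names are hypothetical.

```python
import numpy as np

# Pattern vectors and constants as listed in the text
PATTERNS = {
    "AR(1)": [0.0191, 0.1540, 0.0397, 0.1358, 0.1194, 0.1256, 0.1220, 0.1104, 0.1141, 0.1042],
    "MA(1)": [0.4362, 0.4443, 0.4571, 0.4303, 0.4517, 0.4458, 0.4377, 0.4492, 0.4588, 0.4440],
    "AR(2)": [0.0353, 0.0819, 0.0749, 0.0300, 0.0270, 0.0301, 0.0260, 0.0206, 0.0256, 0.0216],
    "MA(2)": [0.2840, 0.3114, 0.3160, 0.3157, 0.3159, 0.3042, 0.3015, 0.2877, 0.3062, 0.2947],
    "ARMA(1,1)": [0.1196, 0.3775, 0.2944, 0.3237, 0.3394, 0.3306, 0.3148, 0.3262, 0.3243, 0.3173],
}
BETA = {"AR(1)": 0.7, "MA(1)": 0.5, "AR(2)": 0.2, "MA(2)": 0.4, "ARMA(1,1)": 0.4}


def weighted_distance(outputs, pattern, beta):
    """Weighted Euclidean distance; the weights beta*(1-beta)^(k-1) are an assumed form."""
    outputs = np.asarray(outputs, dtype=float)
    pattern = np.asarray(pattern, dtype=float)
    weights = beta * (1.0 - beta) ** np.arange(len(pattern))
    return float(np.sqrt(np.sum(weights * (outputs - pattern) ** 2)))


def identify(outputs):
    """Return the structures ranked from smallest to largest distance, plus the distances."""
    distances = {m: weighted_distance(outputs, PATTERNS[m], BETA[m]) for m in PATTERNS}
    ranked = sorted(distances, key=distances.get)
    return ranked, distances


# Example: network outputs that coincide with the MA(1) pattern
ranked, distances = identify(PATTERNS["MA(1)"])
first_indication, second_indication = ranked[0], ranked[1]
```

Returning the full ranking rather than only the minimum makes both the first and the second indication available.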

 

 

 

 

3       Results

3.1 - Simulated random AR(1) models

The network indications were:

| No. of observations | Correct indication | Incorrect indication: AR(2) | Incorrect indication: ARMA(1,1) |
|---|---|---|---|
| 50 | 92% | 6% | 2% |
| 100 | 88% | 6% | 6% |
| 200 | 94% | 2% | 4% |
| 300 | 96% | 2% | 2% |

Total percentage of correct indications: 92.5%

3.2 - Simulated random MA(1) models

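The network indications were:

| No. of observations | Correct indication | Incorrect indication: MA(2) | Incorrect indication: AR(2) | Incorrect indication: ARMA(1,1) |
|---|---|---|---|---|
| 50 | 56% | 20% | 12% | 12% |
| 100 | 48% | 34% | 12% | 6% |
| 200 | 48% | 30% | 12% | 10% |
| 300 | 58% | 30% | 6% | 6% |

Total percentage of correct indications: 52.5%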

3.3 - Simulated random AR(2) models

The network indications were:

| No. of observations | Correct indication | Incorrect indication: AR(1) | Incorrect indication: ARMA(1,1) |
|---|---|---|---|
| 50 | 38% | 62% | |
| 100 | 14% | 74% | 12% |
| 200 | 14% | 80% | 6% |
| 300 | 16% | 72% | 12% |

Total percentage of correct indications: 20.5%

3.4 - Simulated random MA(2) models

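The network indications were:

| No. of observations | Correct indication | Incorrect indication: MA(2) | Incorrect indication: AR(2) | Incorrect indication: ARMA(1,1) |
|---|---|---|---|---|
| 50 | 34% | 48% | 14% | 4% |
| 100 | 34% | 52% | 12% | 2% |
| 200 | 32% | 44% | 16% | 8% |
| 300 | 34% | 54% | 8% | 4% |

Total percentage of correct indications: 33.5%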

 

3.5 – Simulated random ARMA(1,1) models

The network indications were:

| No. of observations | Correct indication | Incorrect indication: MA(1) | Incorrect indication: AR(1) |
|---|---|---|---|
| 50 | 22% | 2% | 76% |
| 100 | 5% | 3% | 84% |
| 200 | 18% | 2% | 80% |
| 300 | 8% | 2% | 90% |

Total percentage of correct indications: 14.5%
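The totals reported in Sections 3.1 to 3.5 appear to be unweighted means of the correct-indication percentages over the four series lengths; this reading is an assumption. A short Python check, using only the tabulated percentages, is given below (the variable names are illustrative).

```python
# Correct-indication percentages for 50, 100, 200 and 300 observations, as tabulated
correct = {
    "AR(1)": [92, 88, 94, 96],
    "MA(1)": [56, 48, 48, 58],
    "AR(2)": [38, 14, 14, 16],
    "MA(2)": [34, 34, 32, 34],
    "ARMA(1,1)": [22, 5, 18, 8],
}

# Unweighted mean over the four series lengths (assumed definition of the "total")
averages = {model: sum(values) / len(values) for model, values in correct.items()}

for model, value in averages.items():
    print(f"{model}: {value:.2f}%")
```

Under this assumption the computation reproduces the reported totals for AR(1), MA(1), AR(2) and MA(2) (92.5, 52.5, 20.5 and 33.5). For ARMA(1,1) the tabulated entries give 13.25 rather than the reported 14.5, so one of the printed ARMA(1,1) figures may contain a typographical error.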

3.6 - Comparison of Neuro-Fuzzy Networks Identification and Forecast automatic model Identification

For simulated time series of 50 observations (percentage of correct indications):

| Structure | Neuro-Fuzzy Network | FORECAST-PRO |
|---|---|---|
| AR(1) | 92 | 76 |
| MA(1) | 56 | 18 |
| AR(2) | 38 | 22 |
| MA(2) | 34 | 16 |
| ARMA(1,1) | 22 | 26 |

For simulated time series of 100 observations (percentage of correct indications):

| Structure | Neuro-Fuzzy Network | FORECAST-PRO |
|---|---|---|
| AR(1) | 88 | 53 |
| MA(1) | 48 | 31 |
| AR(2) | 14 | 18 |
| MA(2) | 34 | 25 |
| ARMA(1,1) | 5 | 11 |

For simulated time series of 200 observations (percentage of correct indications):

| Structure | Neuro-Fuzzy Network | FORECAST-PRO |
|---|---|---|
| AR(1) | 94 | 31 |
| MA(1) | 48 | 21 |
| AR(2) | 14 | 10 |
| MA(2) | 32 | 19 |
| ARMA(1,1) | 18 | 15 |

For simulated time series of 300 observations (percentage of correct indications):

| Structure | Neuro-Fuzzy Network | FORECAST-PRO |
|---|---|---|
| AR(1) | 96 | 33 |
| MA(1) | 58 | 41 |
| AR(2) | 16 | 10 |
| MA(2) | 34 | 15 |
| ARMA(1,1) | 8 | 13 |
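The comparison can be summarized by counting, over the 20 combinations of structure and series length, how often each method has the higher percentage of correct indications. The Python sketch below uses only the tabulated values; the variable names are illustrative.

```python
# RESULTS[n][structure] = (neuro-fuzzy %, FORECAST-PRO %), as tabulated in Section 3.6
RESULTS = {
    50: {"AR(1)": (92, 76), "MA(1)": (56, 18), "AR(2)": (38, 22),
         "MA(2)": (34, 16), "ARMA(1,1)": (22, 26)},
    100: {"AR(1)": (88, 53), "MA(1)": (48, 31), "AR(2)": (14, 18),
          "MA(2)": (34, 25), "ARMA(1,1)": (5, 11)},
    200: {"AR(1)": (94, 31), "MA(1)": (48, 21), "AR(2)": (14, 10),
          "MA(2)": (32, 19), "ARMA(1,1)": (18, 15)},
    300: {"AR(1)": (96, 33), "MA(1)": (58, 41), "AR(2)": (16, 10),
          "MA(2)": (34, 15), "ARMA(1,1)": (8, 13)},
}

# Difference in percentage points: neuro-fuzzy minus FORECAST-PRO
differences = {
    (n, structure): nf - fp
    for n, row in RESULTS.items()
    for structure, (nf, fp) in row.items()
}

nf_higher = [key for key, diff in differences.items() if diff > 0]
fp_higher = [key for key, diff in differences.items() if diff < 0]

print(f"Neuro-fuzzy higher in {len(nf_higher)} of {len(differences)} cases")
print("FORECAST-PRO higher in:", fp_higher)
```

On the tabulated values, the neuro-fuzzy network is higher in 16 of the 20 cases; FORECAST-PRO is higher for ARMA(1,1) at 50, 100 and 300 observations and for AR(2) at 100 observations.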

A total of 200 randomly simulated time series from each structure were used to validate the methodology presented in this paper. The total average percentages of correct neuro-fuzzy network indications were:

| Structure | Total average percentage of correct identification |
|---|---|
| AR(1) | 98 |
| MA(1) | 77 |
| AR(2) | 67 |
| MA(2) | 78.5 |
| ARMA(1,1) | 59 |

4       Conclusions

The neuro-fuzzy networks provide good identification. When using them, it is recommended that their first indication be regarded as "over-fitted"; the second indication of their outputs must be considered as a possible Box & Jenkins model.

References

AZOFF, E. M. (1994) Neural Network Time Series Forecasting of Financial Markets. Chichester: John Wiley & Sons Ltd., Baffins Lane.

BRAGA, M. J., BARRETO, J. M., MACHADO, M. A. (1995) Conceitos da Matemática Nebulosa na Análise de Risco, Artes & Rabiscus.

DHAR, V., STEIN, R. (1996) Raising Organizational IQ: Strategies for Knowledge Intensive Decision Support, Prentice Hall.

JANG, J.-S. R., SUN, C.-T., MIZUTANI, E. (1997) Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice Hall Inc.

LANGARI and ZADEH (Eds.) (1995) Industrial Applications of Fuzzy Logic and Intelligent Systems, Piscataway, NJ: IEEE Press.

REYNOLDS, B., STEVENS, MELLICHAMP, SMITH, M. J. E. (1995) Box-Jenkins Forecast Model Identification, A.I. Expert, June 1995.

SCHWARTZ, T. J. Fuzzy Systems Come to Life in Japan, IEEE Expert, Vol. 5(1), p. 77-78.

SOUZA, R. C., CAMARGO, M. E. (1996) Análise e Previsão de Séries Temporais: os Modelos ARIMA, Sedigraf.

TSOUKALAS, L. H., UHRIG, R. E. (1997) Fuzzy and Neural Approaches in Engineering, John Wiley & Sons Inc.

VON A., C. (1995) Fuzzy Logic Applications. In: LANGARI and ZADEH (Eds.), Industrial Applications of Fuzzy Logic and Intelligent Systems, Piscataway, NJ: IEEE Press.