Statistical Modeling of Quarterly Record of Rainfall Distribution in South West Nigeria
Iyabode Favour Oyenuga1, *, Benjamin Agboola Oyejola2, Johnson Taiwo Olajide1
1Department of Mathematics and Statistics, The Polytechnic, Ibadan, Nigeria
2Department of Statistics, University of Ilorin, Kwara State, Nigeria
To cite this article:
Iyabode Favour Oyenuga, Benjamin Agboola Oyejola, Johnson Taiwo Olajide. Statistical Modeling of Quarterly Record of Rainfall Distribution in South West Nigeria. Science Journal of Applied Mathematics and Statistics. Vol. 4, No. 2, 2016, pp. 52-58. doi: 10.11648/j.sjams.20160402.16
Received: February 26, 2016; Accepted: March 10, 2016; Published: March 29, 2016
Abstract: This paper examines the modelling of rainfall distribution in South West Nigeria. Quarterly rainfall data for Ibadan, used as a case study, were collected for a period of 43 years (1971-2013) from the Nigeria Meteorological Agency (NMA), as quoted in the Central Bank of Nigeria (CBN) bulletin. Time series analysis was used to model and forecast the quarterly rainfall. The time plot shows seasonal cycles in the series. The Akaike Information Criterion (AIC) was used to identify the best order of the auto-regressive (AR), moving average (MA) and auto-regressive moving average (ARMA) models. AR(4), MA(4) and ARMA(4,4) were found to have the least AIC within their respective classes. These models were then used to forecast the quarterly rainfall for five years.
Keywords: Akaike Information Criterion, AR Model, MA Model, ARMA Model, Forecasting, Rainfall
The amount, seasonal distribution and type of rainfall, as well as the length of the wet season at a place, depend largely on its location with respect to the fluctuating inter-tropical discontinuity and its associated weather zones. Over the years there has been a considerable increase in rainfall records, and statistical methods have been used to extend the available data and to predict the likely trend and periodicity of natural events. About thirty years of data are required to obtain a reliable indication of the real long-term mean. One study of the secular trends and variations in rainfall at Indian stations concluded that neither the annual nor the seasonal monsoon rainfall showed any general tendency to increase or decrease at any of the stations. In some areas where variability is particularly strong, even much longer periods are necessary. In another study, two rainfall series for the Sahel region of West Africa were updated to 1983, and annual and monthly series were presented and analysed. Relatively dry conditions have persisted in this region since 1968, due mainly to a decline in rainfall during August, the wettest month. Drought was probably less severe in this region during the late 1970s than during the early 1970s because of better rainfall early in the season. Significant periodicities have been reported for some African regions, notably the Sahel and South Africa, for data up to the mid-1970s; since then some regions (e.g. West Africa) have suffered drastic changes such as prolonged droughts, and a power spectrum analysis was therefore conducted for the series reported by Nicholson.
A comparison of six rainfall-runoff modelling approaches for simulating daily, monthly and annual flows in eight unregulated catchments concluded that time series models can provide adequate estimates of monthly and annual yields in the water resources of the catchments. An intervention model has been employed for average 10-day streamflow forecasting and synthesis, to deal with the extraordinary phenomena caused by typhoons and other serious weather abnormalities in the Tanshui River basin in Taiwan. Time series analysis has been used to detect changes in the components of a number of rainfall time series, and temporal rainfall in Israel has also been analysed. A further study examined the time series of standardized (normalized) deviations of annual mean rainfall in Ghana for the period 1961-1998. The rainfall trend was plotted using data for 30 stations and, to check for periodicity, the time series was subjected to a power spectrum analysis using the Maximum Entropy Spectral Analysis (MESA) technique.
In another study, monthly and annual total precipitation records of the Iberian Peninsula were analysed for trend. To take seasonality and serial correlation into account, the different months were considered separately. The precipitation series were observed at forty meteorological stations scattered all over the Iberian Peninsula. The Wald and Wolfowitz (1943) test was applied to the series to detect serial correlation, and the Mann-Kendall test was applied to detect possible trends. No significant global trend was found in the annual total precipitation series. The only trend observed was a downward trend in twenty-one of the forty series of monthly total precipitation, and only for the month of March.
The rain day series in Nigeria has also been examined for periodicities, fluctuations and trends in four regions arranged from south to north (the coastal, Guinea savanna, midland and Sahelian zones), based on data collected for the period 1919-1985. A significant decline in annual rain days was found for each zone of the country from 1939 to 1985 and also during the more recent period, 1968-1985. This decline was found to begin in July in the Sahel, thereafter extending to the midland in August and still further southwards to embrace the Guinea savanna zone by September and October.
Ibadan is the capital of Oyo State and the largest city in West Africa south of the Sahara. It is situated in the south-western part of Nigeria and lies at latitude 7° 54′ north of the equator and longitude 3° 54′ east of the Greenwich Meridian. The city stands at about 234 metres above sea level on gently rolling hills running in a northwest and southwest direction. It had an estimated population of about 5 million at the 2006 census and has a total land area of about 3,123 km². It experiences tropical humid conditions and has two distinct seasons: the wet season (March-October), which is controlled by the Tropical Maritime air mass from the Atlantic Ocean, and the dry season (November-March), which is controlled by the Tropical Continental air mass from the Sahara desert.
Climate change has been defined as a significant and lasting change in the statistical distribution of weather patterns over periods ranging from decades to millions of years. Changes in average weather conditions and in the climatic conditions of any geographic area may be caused by factors such as biotic processes, variations in the radiation received by the earth, plate tectonics and volcanic eruptions.
Rainfall seasonality in the Niger Delta region of Nigeria has been examined using both monthly and annual rainfall data from 1931 to 1997, with the cumulative index analysis and the percentage of mean employed for the study. The result indicates a wet season with over 95% of the total annual rainfall in the area: a long wet season from February/March to November and a short dry season from December to January/February. It was noted that the variation of rainfall in the locality could be a result of rainfall-determining factors other than the inter-tropical discontinuity. In a separate study, various probability distribution models were fitted to rainfall and runoff data for the Tagwai dam in Minna, Niger State, Nigeria, to identify the model best suited to predicting their values; the best model was subsequently used to predict both the expected yearly maximum daily rainfall and the yearly maximum daily runoff at specific return periods. The normal distribution model was found most appropriate for the prediction of yearly maximum daily rainfall, and the log-Gumbel distribution model was the most appropriate for the prediction of yearly maximum daily runoff.
The Akaike Information Criterion (AIC) has been used to model annual climate change and its attendant effect using relative humidity; the results of that analysis show that AR(2) gives the best order that fits the model. The Box-Jenkins approach has also been used to model quarterly rainfall in Nigeria, using Edo State over a period of 38 years as a case study; the model tests show that ARIMA(1,0,1)(0,1,1) was chosen for the series since it has the least RMSE.
This study makes use of the quarterly rainfall distribution in Ibadan for a period of forty-three years (1971 to 2013), obtained from the Nigeria Meteorological Agency (NIMA) as quoted in the Central Bank of Nigeria (CBN) bulletin. The Akaike Information Criterion was used to determine the best model order of the AR, MA and ARMA models. These models were used to forecast the quarterly amount of rainfall for a period of five years.
2.1. The Purely Random Process
This is a sequence of uncorrelated, identically distributed random variables with zero mean and constant variance. It is the simplest type of model and is used as a building block in many other models. The process is stationary and is also known as white noise, the error process or the innovation process.
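The term white noise arises from the fact that a frequency analysis of the model shows that, in analogy with white light, all frequencies enter equally. It is denoted by $\{\varepsilon_t\}$.
A stochastic process $\{\varepsilon_t\}$ is said to be a white noise process if it is drawn from a fixed distribution, usually assumed normal, and the following conditions hold:
a. $E(\varepsilon_t) = 0$ for all values of $t$
b. $\mathrm{Var}(\varepsilon_t) = \sigma^2$ for all values of $t$
c. $\mathrm{Cov}(\varepsilon_t, \varepsilon_s) = 0$ for all values of $t \neq s$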
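The defining properties of white noise (zero mean, constant variance, no correlation between different time points) can be checked numerically on a simulated sample. The following is a minimal sketch using only the Python standard library; the sample size, the unit variance and the seed are illustrative assumptions and are not taken from the rainfall data.

```python
import random
import statistics

def white_noise(n, sigma=1.0, seed=0):
    """Draw n independent N(0, sigma^2) values (a Gaussian white noise sample)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def sample_autocorrelation(x, lag):
    """Sample autocorrelation of the series x at a non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

eps = white_noise(5000)
print("sample mean:", round(statistics.fmean(eps), 3))
print("sample variance:", round(statistics.pvariance(eps), 3))
print("lag-1 autocorrelation:", round(sample_autocorrelation(eps, 1), 3))
```

For a sample of this size the mean and the lag-1 autocorrelation should both be close to zero and the variance close to the chosen value of one.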
2.2. The Random Walk Process
A random walk is defined as a process in which the current value of a variable is composed of the past value plus an error term defined as white noise. The random walk model is given by:
$$X_t = X_{t-1} + \varepsilon_t \qquad (1)$$
where $\varepsilon_t$ is a white noise term.
It can be shown that the mean of a random walk process is constant but its variance is not. A random walk process is therefore non-stationary, and its variance increases linearly with time $t$. In practice, the presence of a random walk makes forecasting very simple, since the forecast of every future value $X_{t+s}$, $s > 0$, is simply $X_t$.
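The linear growth of the random walk variance can be illustrated by simulation. The sketch below assumes a walk started at zero with unit-variance Gaussian shocks (both assumptions made only for illustration), simulates many independent paths, and compares the variance across paths after 25 and after 100 steps.

```python
import random
import statistics

def random_walk_endpoints(n_paths, n_steps, sigma=1.0, seed=1):
    """Simulate n_paths random walks X_t = X_{t-1} + e_t from X_0 = 0.

    Returns the value of each path at time t = n_steps.
    """
    rng = random.Random(seed)
    endpoints = []
    for _ in range(n_paths):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, sigma)
        endpoints.append(x)
    return endpoints

var_at_25 = statistics.pvariance(random_walk_endpoints(4000, 25))
var_at_100 = statistics.pvariance(random_walk_endpoints(4000, 100))

# With unit-variance shocks the variance at time t should be close to t.
print("variance at t = 25:", round(var_at_25, 2))
print("variance at t = 100:", round(var_at_100, 2))
```

Under these assumptions the second variance should be roughly four times the first, in line with a variance proportional to t.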
2.3. The General Linear Process
A time series Xt is said to follow a general linear process if it satisfies the difference equation defined by
2.4. Autoregressive Process (AR)
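A time series $X_t$ is said to follow an autoregressive process of order $p$, i.e. AR($p$), if it is a weighted sum of the past $p$ values plus a random shock, so that
$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + \varepsilon_t$$
where $\varepsilon_t$ is white noise.
The value at time $t$ depends linearly on the last $p$ values and the model looks like a regression model, hence the term autoregression. Using the backward shift operator $B$ such that $B X_t = X_{t-1}$, the AR($p$) model may be rewritten as
$$\phi(B) X_t = \varepsilon_t, \qquad \phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p$$
The first order AR process, AR(1), is
$$X_t = \phi X_{t-1} + \varepsilon_t \qquad (2)$$
If $\phi = 1$, the model in equation (2) reduces to a random walk as in equation (1), and the model is non-stationary. If $|\phi| > 1$, the series becomes explosive and hence non-stationary. However, if $|\phi| < 1$, it can be shown that the process is stationary with ACF given by $\rho(k) = \phi^k$ for $k = 0, 1, 2, \dots$; thus the ACF decreases exponentially.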
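The exponential decay of the AR(1) autocorrelation function can be checked against a simulated series. The sketch below writes the AR(1) coefficient as `phi` and assumes a value of 0.6, unit-variance Gaussian shocks and a burn-in period; all of these are illustrative choices, not estimates from the rainfall series.

```python
import random

def simulate_ar1(phi, n, burn_in=200, seed=2):
    """Simulate X_t = phi * X_{t-1} + e_t with N(0, 1) shocks, discarding a burn-in."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x)
    return out

def sample_acf(x, lag):
    """Sample autocorrelation of the series x at a non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

phi = 0.6
series = simulate_ar1(phi, 20000)
for k in range(4):
    # Theoretical value phi**k next to the sample estimate at lag k.
    print(k, round(phi ** k, 3), round(sample_acf(series, k), 3))
```

For a stationary coefficient the two columns should agree closely, with both shrinking geometrically as the lag increases.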
2.5. Moving Average Process (MA)
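A process is said to be a moving average process of order $q$, MA($q$), if it is a weighted sum of the current and previous $q$ random shocks, i.e.
$$X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$
where $\varepsilon_t$ is white noise. Using the backward shift operator $B$, it may be written as
$$X_t = \theta(B) \varepsilon_t, \qquad \theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q \qquad (3)$$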
A finite order MA process is stationary for all parameter values. However, it is customary to impose a condition on the parameter values of an MA model, known as the invertibility condition, to ensure that there is a unique MA model for a given ACF.
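Suppose $\{\varepsilon_t\}$ and $\{\varepsilon'_t\}$ are independent purely random processes and that $\theta \in (-1, 1)$ with $\theta \neq 0$; then the two MA(1) processes defined by
$$X_t = \varepsilon_t + \theta \varepsilon_{t-1}$$
and
$$X_t = \varepsilon'_t + \theta^{-1} \varepsilon'_{t-1}$$
have exactly the same ACF. Thus, the MA polynomial $\theta(B)$ is not uniquely determined by the ACF. As a consequence, it is not possible to estimate a unique MA process from the sample ACF of a given set of data without putting some constraint on what is allowed.
To resolve this ambiguity, it is usually required that the polynomial $\theta(B)$ has all its roots outside the unit circle.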
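The ambiguity can be seen in a small worked example. For an MA(1) process with coefficient theta, the lag-1 autocorrelation is theta / (1 + theta^2), which is unchanged when theta is replaced by 1/theta, while the root of the polynomial 1 + theta*B moves from outside to inside the unit circle. The value theta = 0.5 below is an arbitrary illustrative choice.

```python
def ma1_lag1_autocorrelation(theta):
    """Lag-1 autocorrelation of the MA(1) process X_t = e_t + theta * e_{t-1}."""
    return theta / (1.0 + theta ** 2)

def ma1_root(theta):
    """Root in B of the MA(1) polynomial 1 + theta * B = 0."""
    return -1.0 / theta

theta = 0.5
for coef in (theta, 1.0 / theta):
    rho1 = ma1_lag1_autocorrelation(coef)
    invertible = abs(ma1_root(coef)) > 1.0
    print(f"coefficient={coef}, rho(1)={rho1:.4f}, root outside unit circle={invertible}")
```

Both coefficients give the same autocorrelation, but only the first satisfies the requirement that the root lies outside the unit circle, so the constraint selects a single model.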
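Equation (3) can be rewritten as
$$\varepsilon_t = \theta(B)^{-1} X_t = \sum_{j=0}^{\infty} b_j X_{t-j}$$
for some constants $b_j$ such that
$$\sum_{j=0}^{\infty} |b_j| < \infty$$
where $\varepsilon_t$ denotes the random shocks and $\theta(B)$ the MA polynomial in the backward shift operator $B$; i.e. we can invert the function taking the $\varepsilon_t$ sequence to the $X_t$ sequence and recover $\varepsilon_t$ from present and past values of $X_t$ by a convergent sum.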
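For an invertible MA(1) process the inversion can be carried out explicitly, since the weights of the convergent sum are powers of minus the MA coefficient. The sketch below assumes a coefficient of 0.5, a pre-sample shock of zero and a truncation at 40 terms; these are illustrative assumptions, and the function names are hypothetical.

```python
import random

def ma1_from_shocks(shocks, theta):
    """Build X_t = e_t + theta * e_{t-1}, taking the pre-sample shock e_{-1} as 0."""
    return [e + theta * (shocks[t - 1] if t > 0 else 0.0)
            for t, e in enumerate(shocks)]

def recover_shocks(x, theta, n_terms=40):
    """Approximate e_t by the truncated sum of (-theta)**j * X_{t-j}, j = 0..n_terms-1."""
    recovered = []
    for t in range(len(x)):
        total = 0.0
        for j in range(min(n_terms, t + 1)):
            total += (-theta) ** j * x[t - j]
        recovered.append(total)
    return recovered

rng = random.Random(3)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]
theta = 0.5
observed = ma1_from_shocks(shocks, theta)
recovered = recover_shocks(observed, theta)
max_error = max(abs(a - b) for a, b in zip(shocks, recovered))
print("largest recovery error:", max_error)
```

Because the absolute value of the coefficient is below one, the truncation error shrinks geometrically with the number of terms, which is what makes the sum convergent.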
Note that AR, MA and ARMA processes are used on stationary time series.
2.6. Autoregressive Moving Average Processes (ARMA)
This is a mixed autoregressive moving average model with $p$ AR terms and $q$ MA terms, i.e. ARMA($p$, $q$). It is denoted by:
$$\phi(B) X_t = \theta(B) \varepsilon_t \qquad (4)$$
where $\phi(B)$ and $\theta(B)$ are polynomials in $B$ of finite order $p$ and $q$ respectively, and $\varepsilon_t$ is white noise. Equation (4) has a unique causal stationary solution provided that the roots of $\phi(B)$ lie outside the unit circle. Many real data sets may be approximated in a more parsimonious way by a mixed ARMA model than by a pure AR or MA process.
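The root condition on the autoregressive polynomial can be checked directly in low-order cases. The sketch below treats only an AR polynomial of order two, 1 - phi1*B - phi2*B^2, solved with the quadratic formula; the order and the coefficient values are illustrative assumptions and are not the fitted ARMA(4,4) coefficients, which are not reproduced here.

```python
import cmath

def ar2_roots(phi1, phi2):
    """Roots in B of 1 - phi1*B - phi2*B**2 = 0 (requires phi2 != 0)."""
    # Rearranged as phi2*B**2 + phi1*B - 1 = 0 and solved with the quadratic formula.
    disc = cmath.sqrt(phi1 ** 2 + 4.0 * phi2)
    return ((-phi1 + disc) / (2.0 * phi2), (-phi1 - disc) / (2.0 * phi2))

def is_stationary_ar2(phi1, phi2):
    """True when both roots of the AR polynomial lie outside the unit circle."""
    return all(abs(root) > 1.0 for root in ar2_roots(phi1, phi2))

for coefficients in [(0.5, 0.3), (0.5, 0.6), (1.0, -0.5)]:
    print(coefficients, ar2_roots(*coefficients), is_stationary_ar2(*coefficients))
```

Using `cmath` keeps the check valid when the roots are complex, as in the third coefficient pair.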
2.7. Model Selection Criteria
Akaike's (1973) Information Criterion (AIC) is used for model selection. The criterion says to select the model that minimizes:
AIC = -2 log (maximum likelihood) + 2k
where k = p + q + 1 if the model contains an intercept or constant term and k = p + q otherwise; k is the number of parameters in the model. The added term 2(p + q + 1) or 2(p + q) serves as a 'penalty function', thus ensuring the selection of a parsimonious model. The value of k yielding the minimum AIC specifies the best model. The lower the AIC value, the better the model fit. AIC thus balances the error of the fit against the number of parameters.
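The criterion is simple to compute once a maximised log-likelihood is available. The sketch below implements the formula as stated, with k = p + q + 1 when an intercept is present; the log-likelihood values are hypothetical and were invented purely to show how the penalty term can change the ranking of candidate models.

```python
def aic(log_likelihood, p, q, has_intercept=True):
    """AIC = -2 * log(maximum likelihood) + 2k, with k = p + q (+ 1 for an intercept)."""
    k = p + q + (1 if has_intercept else 0)
    return -2.0 * log_likelihood + 2 * k

# Hypothetical maximised log-likelihoods keyed by (p, q); for illustration only.
candidates = {
    (1, 0): -260.0,
    (2, 0): -255.5,
    (2, 1): -255.0,
}
scores = {order: aic(loglik, *order) for order, loglik in candidates.items()}
best_order = min(scores, key=scores.get)
print(scores)
print("order with minimum AIC:", best_order)
```

In this made-up example the largest model has the highest log-likelihood, yet the smaller (2, 0) model is preferred because the gain in fit does not offset the extra parameter.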
3. Discussion of Results
The time plot of the original series of quarterly rainfall in Ibadan is shown in the figure below.
Fitting AR models of order one to five, the results are as follows:
Fitting MA models of order one to five, the results are as follows:
The results of the analysis for the ARMA models are given below:
The AIC is a way of selecting a model from a set of models. The chosen model is the one that minimizes the Kullback-Leibler distance between the model and the truth. It is based on information theory, but a heuristic way to think about it is as a criterion that seeks a model that has a good fit to the truth but few parameters. It is defined as:
where k is the number of model parameters, N is the number of observations, and the remaining quantity in the formula is the maximum likelihood estimate of the residual variance.
The table below shows the R-square, log likelihood and AIC for each order of the AR, MA and ARMA models.
| Model/Order | R-Square | Log likelihood | AIC |
Note: The figures in bold signify the models with minimum AIC.
The best AR model is AR(4), with a minimum AIC value of 12.71325; for the MA models, MA(4) is best, with a minimum AIC of 13.02681; and for the ARMA models, ARMA(4,4) is best, with a minimum AIC value of 12.52327. Hence, among the three models, ARMA(4,4) is chosen as the best model since it has the minimum AIC, and it is therefore used to forecast the next five years, as shown in Table 3.
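The final comparison amounts to taking the minimum over the three reported AIC values. As a minimal sketch, using only the figures quoted in the text (the dictionary name is hypothetical):

```python
# Minimum AIC reported in the text for the best model within each class.
reported_aic = {
    "AR(4)": 12.71325,
    "MA(4)": 13.02681,
    "ARMA(4,4)": 12.52327,
}

best_model = min(reported_aic, key=reported_aic.get)
print("model with the least AIC:", best_model)
```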
4. Summary and Conclusion
The analysis of the data used for this study was carried out using the EViews statistical package. The original data show variation and fluctuation in the rainfall distribution over the years in Ibadan. The autocorrelation (AC) and partial autocorrelation (PAC) values in Table 1 take both positive and negative values (that is, even and odd function), which leads to the conclusion that the quarterly rainfall is not stable, owing to the occurrence of different variations. The autoregressive model of order four, AR(4), the moving average model of order four, MA(4), and the autoregressive moving average model of order (4,4), ARMA(4,4), have the minimum AIC values within their classes. Overall, ARMA(4,4) proved to be the best fit since it has the least AIC value, 12.52327, among the models; hence the model is used to predict the next five years in quarters.
From the findings, the forecast values generated by the ARMA model show that the distribution of rainfall in Ibadan is expected to increase gradually from the first quarter, reach its peak in the third quarter and decrease gradually in the fourth quarter.