A Genetic Programming Framework To Measure Complexity
A measure, γ, that may characterize the complexity of sequential
data is introduced. Here, a sequence is considered complex if it
is difficult to predict. Measuring this complexity is important.
Genetic programming (GP) is used to obtain γ. GP searches
for a best-fit model: it randomly assembles thousands of equations that mimic
regression models, evaluates their fitness, and reports the fittest one.
Because the equations are randomly assembled, they are accidental mathematical
fits that cannot be expected to be meaningful. The formulation of γ is based
on the assumption that GP can successfully reproduce the dynamics of simple
processes but fails for very complex ones (such as white noise).
TSGP, a GP program that fits time series data, is employed
to demonstrate how γ can be used. (TSGP is available at compumetrica.com.)
TSGP computes the mean square error (MSE) for each equation it assembles.
Hypothetically, if a sequence of pseudo-random data (considered complex)
is scrambled, MSE_observed ≈ MSE_scrambled, whereas MSE_observed < MSE_scrambled if
the sequence is deterministic. Heuristically, the ratio MSE_observed/MSE_scrambled
→ 1 for complex data, while MSE_observed/MSE_scrambled → 0 for totally predictable
data. Obtaining a large number of independent ratios MSE_observed/MSE_scrambled
for the same data set is useful in testing the hypothesis γ = 1, γ = 0,
or any 0 ≤ γ ≤ 1. Since the ratios are independent, the following
test statistic applies:
$$t = \frac{\frac{1}{n}\sum_{i=1}^{n} (g_i - \gamma_i)}{s_g \, n^{-1/2}},$$
with degrees of freedom = (n − 1), where i = 1, …, n, the g_i are sample estimates
of γ_i, and s_g is the standard deviation of the g_i.
Empirical testing of γ on simulated data with known underlying data-generating
processes suggested that it can provide reasonable hints about complexity.
The tested sequences were pseudo-random, nonlinear-chaotic,
and nonlinear-stochastic.
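The ratio and the t-test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions. It does not use TSGP or any GP search: a leave-one-out nearest-neighbour predictor stands in for the fittest GP equation. Because that stand-in is deterministic, the ratios here come from repeated scrambles of the same series rather than from independent GP runs, so they are not independent in the sense the abstract requires. The function names (one_step_mse, complexity_ratios, t_statistic), the series length, and the number of ratios are all hypothetical choices.

```python
import math
import random
import statistics


def one_step_mse(series):
    """Leave-one-out MSE of a nearest-neighbour one-step-ahead predictor.

    Assumption: this is a stand-in for the MSE of the fittest GP equation;
    it is NOT what TSGP computes.
    """
    pairs = list(zip(series[:-1], series[1:]))  # (x[t-1], x[t])
    total = 0.0
    for i, (prev, actual) in enumerate(pairs):
        # Predict x[t] by the successor of the closest *other* x[t-1].
        _, predicted = min(
            (p for j, p in enumerate(pairs) if j != i),
            key=lambda p: abs(p[0] - prev),
        )
        total += (actual - predicted) ** 2
    return total / len(pairs)


def complexity_ratios(series, n_ratios=20, seed=0):
    """Return n_ratios values of MSE_observed / MSE_scrambled."""
    rng = random.Random(seed)
    mse_observed = one_step_mse(series)
    ratios = []
    for _ in range(n_ratios):
        scrambled = list(series)
        rng.shuffle(scrambled)  # destroys the temporal ordering
        ratios.append(mse_observed / one_step_mse(scrambled))
    return ratios


def t_statistic(ratios, hypothesized):
    """(mean ratio - hypothesized value) / (s_g / sqrt(n)), with n - 1 d.f."""
    n = len(ratios)
    standard_error = statistics.stdev(ratios) / math.sqrt(n)
    return (statistics.mean(ratios) - hypothesized) / standard_error


# Two simulated sequences with known generating processes.
rng = random.Random(1)
noise = [rng.random() for _ in range(200)]  # pseudo-random
chaotic = [0.3]                             # logistic map, fully deterministic
for _ in range(199):
    chaotic.append(4.0 * chaotic[-1] * (1.0 - chaotic[-1]))

noise_ratios = complexity_ratios(noise)
chaos_ratios = complexity_ratios(chaotic)
print("mean ratio, noise:  ", statistics.mean(noise_ratios))
print("mean ratio, chaotic:", statistics.mean(chaos_ratios))
print("t for chaotic series against a hypothesized ratio of 1:",
      t_statistic(chaos_ratios, 1.0))
```

With a real GP engine, one_step_mse would be replaced by the MSE of the fittest evolved equation, and each ratio would come from a separate run.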
Using Genetic Programming to Forecast US Residential
Electrical Energy
An integrated statistical-genetic programming modeling
algorithm is proposed to forecast electrical energy used by the US residential
sector. Generally, the amount of electricity used over a given short period
of time (an hour) is referred to as “demand” and is measured in watts (W).
The amount of electricity used over a lengthy period (say, one month) is referred
to as “energy” and is typically measured in kilowatt-hours (1 kWh = 1,000
watt-hours). Accurate forecasting of electrical energy provides reliable
forecasts of demand, thus reducing the risk of brownouts. Regression models
are typically used in estimating energy equations. In such a model,
for example, annual electrical energy is assumed to be a function of
the price of electricity, per capita income, heating degree days (a variable
that measures the severity of winter cold), and cooling degree days (a variable
that measures the severity of summer heat). Such a model provides estimates
of the policy parameters that utilities and government regulatory agencies rely upon
in energy planning and policy formulation. Estimates of consumers’ responsiveness
to electricity price changes (price elasticity) and income changes (income
elasticity) are invaluable decision-making information that a regression provides.
However, accurate forecasts of the dependent variable (energy) are conditional
upon accurate forecasts of the explanatory variables. Here, genetic programming
(GP) is employed to forecast those variables. GP, a heuristic search
technique, is an iterative algorithm that randomly assembles regression-like
equations. These equations are used to breed fitter ones until an equation
is reached that somehow manages to produce a reasonable forecast. However,
such an equation may not be helpful in planning or policy making. Integrating the two techniques
then has the advantage of obtaining needed policy and planning parameters
while securing accurate forecasts. To evaluate the efficacy of the GP
forecasts, they are compared with forecasts obtained from ARIMA models.
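To illustrate how a regression yields the elasticities mentioned above, the following Python sketch fits a log-log energy equation by ordinary least squares. The log-log form is an assumption made here because its slope coefficients can be read directly as elasticities; the abstract does not state the functional form used. The data are synthetic and generated inside the script (they are not real US figures), and the names solve, ols, forecast_energy, and TRUE_BETA are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def forecast_energy(beta, price, income, hdd, cdd):
    """Energy forecast conditional on forecasts of the explanatory variables."""
    x = [1.0, math.log(price), math.log(income), math.log(hdd), math.log(cdd)]
    return math.exp(sum(b * v for b, v in zip(beta, x)))


# Synthetic annual observations (illustrative only; NOT real data).
# Coefficient order: intercept, price, income, heating and cooling degree days.
TRUE_BETA = [2.0, -0.3, 0.8, 0.2, 0.1]
rng = random.Random(42)
rows, log_energy = [], []
for _ in range(60):
    x = [
        1.0,
        math.log(rng.uniform(5, 15)),       # price of electricity
        math.log(rng.uniform(20, 50)),      # per capita income
        math.log(rng.uniform(3000, 6000)),  # heating degree days
        math.log(rng.uniform(800, 1800)),   # cooling degree days
    ]
    rows.append(x)
    signal = sum(b * v for b, v in zip(TRUE_BETA, x))
    log_energy.append(signal + rng.gauss(0.0, 0.01))

beta = ols(rows, log_energy)
price_elasticity, income_elasticity = beta[1], beta[2]
print("estimated price elasticity: ", price_elasticity)
print("estimated income elasticity:", income_elasticity)
print("conditional forecast:", forecast_energy(beta, 10.0, 30.0, 4000.0, 1000.0))
```

In the integrated approach, the arguments passed to forecast_energy would be the GP forecasts of the explanatory variables, which is why the accuracy of the energy forecast depends on them.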
M.A. Kaboudan
University of Redlands
School of Business
http://newton.uor.edu/facultyfolder/mahmoud_kaboudan
Mahmoud_Kaboudan @ Redlands.edu