A GENETIC ALGORITHM APPROACH TO
PRICING OPTIONS WITH FUTURES-STYLE MARGINING
Presented at the 1998 MFA Annual Meeting
March 19, 1998
by
A. Jay White
Murray State University
Department of Economics & Finance
Murray, KY 42071
(502) 762-4285
jay.white@murraystate.edu
Gay B. Hatfield
The University of Mississippi
Department of Economics & Finance
University, MS 38677
(601) 232-7751
Robert E. Dorsey
The University of Mississippi
Department of Economics & Finance
University, MS 38677
(601) 232-7076
This Version: March 4, 1998
An Abstract
Futures-style options are traded on the London International Financial Futures Exchange (LIFFE) but not on exchanges in the United States. Rather than have the option buyer pay the seller a premium, the option is marked-to-market in the same manner as a futures contract. Scant research has addressed the pricing of options with futures-style margining. This study develops an artificial neural network (genetic adaptive neural network) to approximate the option prices, enters economic data into the pricing model, and compares these results to those from an alternate option pricing model.
Since the creation of the Chicago Board Options Exchange (CBOE) in 1973, the options market has experienced tremendous growth. In addition to the CBOE, options are now traded on numerous exchanges worldwide, including: the Philadelphia Stock Exchange (PHLX), the American Stock Exchange (AMEX), the New York Stock Exchange (NYSE), the Chicago Mercantile Exchange (CME), and the London International Financial Futures Exchange (LIFFE).
Currently, futures-style options are traded on LIFFE but not on exchanges in the United States. The CME and the Chicago Board of Trade have proposed switching to a futures-style margining system for options on futures, but it has not been approved.
The payment of the option premium is treated differently for futures-style options than for traditional options. With a traditional options contract, the buyer pays the seller an amount called a premium, and the option seller must post margin (this premium is used by the seller as posted margin). The buyer pays the full amount of the premium, and unless the option is exercised, there are no further payment requirements. It is possible that the option seller will have to post additional funds if the option's value increases. This is similar to the short investor's position in a typical futures contract. For the option buyer to realize any gain, however, the option must be exercised or sold. In some cases (currency or cross-rate options), the margin requirement for the option writer can be satisfied with a letter of credit from an approved bank.
An alternate method of handling these contracts is to have both the buyer and seller post margin and then require marking to market as with futures contracts. With this type of system, the option buyer (long position) and the option seller (short position) would deposit funds (initial margin) in a margin account. At the end of each trading day, the option value would be marked to market, and the margin account would be adjusted to show the investor's gain or loss. If there is an increase in the option's price, the short investor's margin account would be reduced while the long investor's margin account would be increased. The reverse would occur if there is a decrease in the option's price. With this system the long position (option buyer) still has the right to exercise the option.
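The daily marking-to-market described above can be illustrated with a minimal sketch. The function name `variation_margin_flows` and the sample settlement prices are hypothetical; the multiplier of $2,500 per 1.00 of option price is an assumption inferred from the LIFFE tick value (0.01 equals $25) for the Eurodollar contract.

```python
def variation_margin_flows(option_prices, contract_multiplier=2500.0):
    """Return one (long_flow, short_flow) pair per day from daily settlement prices.

    A rise in the option price credits the long (buyer) margin account and
    debits the short (seller) account by the same amount; a fall does the reverse.
    """
    flows = []
    for previous, current in zip(option_prices, option_prices[1:]):
        change = (current - previous) * contract_multiplier
        flows.append((change, -change))
    return flows


# Hypothetical settlement prices over four trading days.
prices = [0.25, 0.30, 0.27, 0.35]
for day, (long_flow, short_flow) in enumerate(variation_margin_flows(prices), start=1):
    print(f"day {day}: long {long_flow:+.2f}, short {short_flow:+.2f}")
```

The flows to the two sides always sum to zero, which is what makes the position resemble a futures contract rather than a conventional option with an up-front premium.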
Although much research has been done in the area of option pricing, there still exist a number of problems related to estimating or predicting option prices. To date, few researchers have addressed the pricing of options with futures-style margining. None of the approximation techniques that have been developed applies uniformly to the many different types of options. Some of these problems include but are not limited to:
The purpose of this study is to utilize Genetic Adaptive Neural Networks (GANNs) to develop a method of pricing futures options with futures-style margining. The objectives of this study are to: (1) develop an artificial neural network that will accurately approximate the price of futures options with futures-style margining; (2) examine the effects of incorporating additional economic data into the pricing of futures options when using GANNs; and (3) compare the GANN's ability to price futures options with futures-style margining with that of a current option pricing approximation technique.
The rest of this paper is organized as follows: Section II reviews option models. The methodology of neural networks is described in Section III, and Section IV presents the data. Results are found in Section V, and the conclusions are discussed in Section VI.
Many of the option pricing models used today are extensions of, or were derived in a manner similar to, the model developed by Fischer Black and Myron Scholes (1973). Black and Scholes (hereafter referred to as B-S) developed the first closed-form solution to option pricing. Cox, Ross, and Rubinstein (1979) developed the binomial option pricing model (BOPM), which was also developed independently by Rendleman and Bartter (1979). The BOPM uses the same parameters to price options as does the B-S OPM (stock price, exercise price, time to expiration, interest rate, and volatility) and was derived using the same arbitrage principles. The main difference between the two models is that the B-S model is a continuous-time model, whereas in the BOPM the time to expiration is partitioned into discrete intervals of equal length and in each period the stock price can either increase or decrease (it follows a binomial process). As the number of time periods approaches infinity (alternatively, as the time interval becomes smaller), the BOPM converges to the B-S model.
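The convergence of the BOPM to the B-S model can be checked numerically. The sketch below assumes the standard Cox-Ross-Rubinstein parameterization (up factor exp(sigma * sqrt(dt)), down factor its reciprocal) for a European call on a non-dividend-paying stock; the function names and the sample inputs are illustrative only.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(S, X, T, r, sigma):
    """Closed-form B-S price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - X * math.exp(-r * T) * norm_cdf(d2)


def binomial_call(S, X, T, r, sigma, steps):
    """European call priced on a Cox-Ross-Rubinstein binomial tree."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))   # up move per period
    d = 1.0 / u                           # down move per period
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral probability of an up move
    disc = math.exp(-r * dt)
    # Payoffs at expiration for j up moves out of `steps` periods.
    values = [max(S * u ** j * d ** (steps - j) - X, 0.0) for j in range(steps + 1)]
    # Roll back through the tree one period at a time.
    for _ in range(steps):
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]


bs = black_scholes_call(100.0, 100.0, 1.0, 0.05, 0.2)
for n in (10, 100, 500):
    print(n, binomial_call(100.0, 100.0, 1.0, 0.05, 0.2, n) - bs)
```

The printed differences shrink as the number of periods grows, which is the convergence property described above.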
As with the B-S model, the BOPM is appropriate only for European-style options on stocks that do not pay dividends. Adjustments to the model must be made to account for early exercise and dividends. When a stock is expected to pay one known dividend during the life of the option, the compound option pricing model can be used to compute an exact price for the option. For other American options, such as options on an underlying instrument that pays a continuous dividend, there is no closed-form solution. Therefore, some approximation technique must be used.
Lieu (1990) was the first to apply the techniques utilized in the Black-Scholes (1973) analysis to derive put and call option pricing formulas for futures-style options. He noted that the traditional B-S formula is not appropriate for futures-style options, even if they are European options, because of the marking to market. Lieu purports that his model has intuitive appeal as it is no longer necessary to consider borrowing to pay for an option premium, or investing premiums from short options; however, he concludes that his model is not applicable to the options contracts traded at LIFFE because LIFFE trades options on foreign-currency cash, not on futures as required by his model. In his model, the interest rate factor drops out, and the early-exercise feature of American futures options no longer matters for options with futures-style margining. Accordingly, the values of European and American futures options converge. As futures-style margining turns an option on a futures contract into a futures contract on a stock-style option, a futures-style option can be considered more of a futures contract than an option contract.
Kuo (1991) derives a valuation model for a futures-style option contract on futures. His assumption as to the value of his hedge portfolio at time t differs from that of Lieu. Lieu (1990) sets that value equal to zero by assuming that the amount of margin that has to be posted by each side is trivial. Kuo does not adhere to that assumption and states that the marking to market required for futures-style options means that day-to-day contract gains/losses will have to be invested (or borrowed) at uncertain future interest rates.
Chen and Scott (1993) show that Lieu’s (1990) results on futures options with futures-style margining also hold in a general equilibrium model with stochastic interest rates. This implies that Lieu’s model can be applied to European and American futures-style options traded on LIFFE. Chen and Scott extend Lieu’s results to interest rate futures options. The authors also modify several existing models for interest rate futures options to allow for futures-style margining.
Chen and Scott first modify Black’s (1976) model for pricing Eurodollar futures options. The authors state that:
The final settlement price for ED futures is determined by taking an average of the London Interbank Offer Rate (LIBOR) on the delivery day and subtracting the rate from 100. Prior to delivery, traders calculate the futures rate, which is 100 minus the futures price. Conventional ED futures options are frequently priced by applying Black’s model: the futures rate is assumed to have a lognormal distribution and the short term interest rate is assumed to be fixed.
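Therefore, the authors derive the value for an ED futures call as
$$C = e^{-r(T-t)}\left[(100 - K)\,N(-d_2) - R(t)\,N(-d_1)\right], \tag{1}$$
where
$$d_1 = \frac{\ln\left[R(t)/(100 - K)\right] + \tfrac{1}{2}\sigma^2 (T-t)}{\sigma\sqrt{T-t}} \tag{1a}$$
and
$$d_2 = d_1 - \sigma\sqrt{T-t}, \tag{1b}$$
and where R(t) is the futures rate which is equal to 100 - f(t), f(t) is the futures price, K is the strike price, r is the fixed short-term interest rate, and N(.) is the cumulative standard normal distribution function. $\sigma$ is the volatility for the futures rate. Because there is an inverse relationship between interest rates and the prices of interest bearing securities, the above futures call option is equivalent to a put option on the futures rate.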
In order to derive an option valuation model for ED futures options with futures-style margining, the authors assume that the futures rate is lognormally distributed. Further, they assume that R is a geometric Brownian motion process such that $R = e^{x}$ and x is determined by the process $dx = \mu(x)\,dt + \sigma\,dw$. Similar to Black-Scholes (1973), Black (1976), and Lieu (1990), the authors form a riskless hedge portfolio. This portfolio consists of a position in the futures contract and an offsetting position in the futures option. This portfolio requires a zero investment; and, in equilibrium with no risk, it should provide a zero return. The resulting p.d.e. is given as:
$$\tfrac{1}{2}\sigma^2 R^2\,\frac{\partial^2 C}{\partial R^2} + \frac{\partial C}{\partial t} = 0. \tag{2}$$
The solution of the above p.d.e. subject to the boundary condition that C = Max(0,(100 - R) - K) is
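$$C = (100 - K)\,N(-d_2) - R(t)\,N(-d_1), \tag{3}$$
where
$$d_1 = \frac{\ln\left[R(t)/(100 - K)\right] + \tfrac{1}{2}\sigma^2 (T-t)}{\sigma\sqrt{T-t}} \tag{3a}$$
and
$$d_2 = d_1 - \sigma\sqrt{T-t}. \tag{3b}$$
This is similar to the model in equation 1 except the term $e^{-r(T-t)}$ has dropped out.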
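A small numerical sketch may help show the effect of dropping the discount term. It assumes the standard Black (1976) put formula applied to the futures rate R with strike rate 100 - K, as the text describes; the function name `ed_futures_call`, its `futures_style` flag, and the sample inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ed_futures_call(R, K, sigma, tau, r=0.0, futures_style=True):
    """Call on an ED futures price, valued as a put on the futures rate R.

    R: futures rate (100 minus the futures price), K: strike price,
    sigma: volatility of the futures rate, tau: time to expiration in years,
    r: short-term interest rate (used only when futures_style is False).
    """
    strike_rate = 100.0 - K
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(R / strike_rate) + 0.5 * vol ** 2) / vol
    d2 = d1 - vol
    undiscounted = strike_rate * norm_cdf(-d2) - R * norm_cdf(-d1)
    if futures_style:
        return undiscounted                    # no discounting with futures-style margining
    return math.exp(-r * tau) * undiscounted   # conventional premium-paid option


# Hypothetical inputs: futures price 95.00, strike 95.25, 20% volatility, 3 months.
conventional = ed_futures_call(5.0, 95.25, 0.20, 0.25, r=0.05, futures_style=False)
futures_margined = ed_futures_call(5.0, 95.25, 0.20, 0.25)
print(conventional, futures_margined)
```

Under these assumptions the two prices differ only by the discount factor, so the futures-style price is the larger of the two whenever the interest rate is positive.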
Chen and Scott conclude that futures options with futures-style margining should not be exercised early because their prices should exceed the intrinsic value prior to expiration. This leads to the conclusion that American futures options will have the same prices as comparable European futures options, and one can price the American futures options with a European pricing model.
The above analysis has two shortcomings. First, all of the above adjustments ignore the impact of marking-to-market on traders’ cash flows. Second, these models are applicable only for futures-style options on non-coupon bearing securities. While this second short-coming is not a problem for options on T-bills, Eurodollars, and other non-coupon bearing securities, it does limit the model’s scope.
A. Overview
Neural network applications for finance include assessing the risk of mortgage loans [Collins, Ghosh, and Scofield (1988)], rating the quality of corporate bonds [Dutta and Shekhar (1988)], predicting financial distress [Salchenberger, Cinar and Lash (1992), Coats and Fant (1993) and Altman et al. (1994)], predicting bond re-ratings [Hatfield and White (1996)], and predicting fluctuations of stock price movements [Bower (1988)]. According to Shandle (1993), companies such as General Electric, American Express, and Chase Manhattan Bank are using neural networks to screen credit applications, spot stolen credit cards, detect patterns which may indicate fraud, and predict commodity and stock prices, bond ratings, and currency trading trends.
A neural network (NN) imitates neural biological functions in learning relationships between independent and dependent variables; a NN is therefore a simplified model of the human brain which is capable of learning and generalization. NNs are made up of processing elements (often called neurons, nodes, or cells) and connections which are organized in layers. Generally there is an input layer, one or more hidden layers, and an output layer. A feedforward neural network has two or more layers, each of which receives input from the preceding layer and sends its output to the following layer. The NN takes a set of inputs and maps them to some set of outputs, with each connection between neurons carrying a weight that influences an output cell. These weights are "learned" by the network through a training process in which a training sample is presented to the network.
For this study a three layer (i.e. one middle or hidden layer) feedforward neural network will be developed to approximate the process by which call and put option prices for options with futures-style margining are determined. This is one of numerous option pricing problems for which no closed-form solution exists. The NN's ability to forecast actual option prices will be tested and compared to an existing option pricing methodology. The inputs for the NN will be those that are used in the Chen & Scott model discussed above. Figure 1 shows a possible network structure utilizing the Chen & Scott inputs to produce call and put option prices.
Figure 1: A Multilayer Feedforward Neural Network
Following White (1989), for a three layer network, any middle layer node will receive a weighted sum of all the input nodes plus a bias and produces some output signal
$$h_j = F\left(\sum_{i=0}^{n} w_{ij}\,x_i\right), \quad j = 1, \ldots, k, \quad i = 0, \ldots, n, \tag{4}$$
where F is the transfer function, $x_i$ is the ith input signal, $w_{ij}$ is the strength of the connection from the ith input node to the jth middle layer node and $h_j$ is the middle or hidden layer node. The transfer function, F, is applied to each neuron’s activation value to generate each neuron’s output. Many different types of functions can serve as the transfer function. A typical transfer function is the non-linear, continuously differentiable sigmoid transfer function
$$F(z) = \frac{1}{1 + e^{-z}} \tag{5}$$
which produces an S-shaped curve with values assigned between 0 and 1.

Figure 2: Sigmoid Transfer Function
The signals for the hidden nodes are then sent to the output nodes in a similar fashion to Equation 4 above and produce a signal
$$o_k = F\left(\sum_{j=0}^{q} \beta_j\,h_j\right), \tag{6}$$
where there are q hidden nodes, $\beta_j$ is an output weight, $o_k$ is the kth output node, and $h_0$ is always one so that $\beta_0$ provides a bias. By substituting equation (4) into (6) we have
$$y = F\left(\beta_0 + \sum_{j=1}^{q} \beta_j\,F\left(\sum_{i=0}^{n} w_{ij}\,x_i\right)\right) = f(x, \theta), \tag{7}$$
where output (y) is shown as a function of the input vectors (x) and weights ($\theta$). White (1989) has shown that the output function
(8)
can provide an accurate approximation to any function of x provided q (the number of hidden nodes) is large enough. White further states that due to this property, "hidden-layer feedforward networks are useful for applications in pattern recognition, classification, forecasting, process control, and image compression and enhancement."
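The forward pass through a three layer network of this kind can be sketched in a few lines. The sketch assumes that the input vector is augmented with a constant one so that the first weight of each hidden node acts as its bias, and that the sigmoid is applied at the output node as well; the function names and the layout of the weight lists are illustrative, not taken from the paper.

```python
import math


def sigmoid(z):
    """Sigmoid transfer function; maps any real activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, output_weights):
    """Output of a one-hidden-layer feedforward network.

    hidden_weights: one list per hidden node, of length len(inputs) + 1
                    (first entry is that node's bias weight).
    output_weights: list of length len(hidden_weights) + 1
                    (first entry is the output bias weight).
    """
    x = [1.0] + list(inputs)      # constant one supplies the hidden-layer bias
    hidden = [1.0]                # constant one supplies the output bias
    for w_j in hidden_weights:
        hidden.append(sigmoid(sum(w * x_i for w, x_i in zip(w_j, x))))
    return sigmoid(sum(b * h for b, h in zip(output_weights, hidden)))


# Four inputs and 18 hidden nodes, with all weights set to zero for illustration.
n_inputs, n_hidden = 4, 18
zero_hidden = [[0.0] * (n_inputs + 1) for _ in range(n_hidden)]
zero_output = [0.0] * (n_hidden + 1)
print(forward([5.0, 4.75, 0.2, 0.25], zero_hidden, zero_output))
```

Because a sigmoid output is confined to the interval (0, 1), a network used for pricing would need either scaled targets or a linear output node; which of these the study used is not stated in the text.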
In order to learn, a NN needs to find weights ($\theta$) that will approximate the underlying function. Many studies have utilized Back-propagation (BP) of Rumelhart et al. (1986) to accomplish this task [Salchenberger, Cinar and Lash (1992), Coats and Fant (1993), and Altman et al. (1994) to name a few]. Learning is accomplished through a Back-propagation Neural Network (BPNN) by taking the network’s errors (BPNN output - actual values) and updating the connection weights using
$$\theta_t = \theta_{t-1} + \lambda\,\nabla f(x_t, \theta_{t-1})\left[Y_t - f(x_t, \theta_{t-1})\right], \quad t = 1, 2, \ldots, \tag{9}$$
where $\lambda$ is a learning rate, $\nabla f$ is the gradient (vector of partial derivatives of f with respect to the weights, $\theta$) and $Y_t$ is the target outcome. The BPNN is therefore a point-to-point gradient-search technique. When a correct observation is encountered, weights are strengthened; and when errors are encountered, weights are weakened. Although BPNNs are widely used, they have a tendency to become stuck at local, rather than global, optimal solutions. Altman et al. (1994) found "illogical weightings of the indicators" in their corporate distress prediction study. In one case, they found that if the level of a firm’s liquidity deteriorated (usually a bad sign), the NN showed an improvement in output. In some instances the improvement was enough to move what was an unsound firm into the category of healthy firms. This is clearly counter-intuitive. Although the NNs in Altman’s study outperformed other prediction methods in many instances, he found the illogical weightings to be "unacceptable." It is possible that the illogical weightings could have been the result of the NN arriving at a local solution, as opposed to a global one. Another problem related to BPNNs is "overtraining". When overtraining occurs, the NN is not able to generalize (and therefore not able to predict well out-of-sample).
Because of the problems associated with BPNNs, the Genetic Algorithm (GA) optimization technique of Dorsey and Mayer (1994) will be utilized for network learning in this study. A GA uses evolutionary concepts in the optimization process. Because the GA is an intelligent global search technique, the problem of arriving at local optima is addressed. According to Kean (1995)
GAs exceed other optimization procedures in robustness. Their advantage lies in a more thorough searching of a global solution space through avoidance of getting stuck at local optimums.
The GA was found to perform well when optimizing NNs by Dorsey, Johnson, and Mayer (1995). Furthermore, Sexton, Johnson, and Dorsey (1995) found the GA optimized NN to outperform the BPNN when testing out-of-sample, thereby addressing the problem of "overtraining".
A genetically optimized NN is trained by starting with a number of different sets of randomly selected weights (as opposed to one set with a BPNN). In keeping with the biological terminology, each of these sets may be thought of as a "chromosome" and the individual weights in a particular chromosome may be thought of as "genes". These "chromosomes" each represent a possible solution to the problem being analyzed. The NN is then trained with each set of weights (chromosomes) after which, the fitness of each chromosome in the initial population is evaluated (based on which sets of weights best minimized the objective function). Chromosomes with high levels of fitness are chosen as "parents" for the "reproduction" stage where a new generation of chromosomes is created to present to the network. The new generation is created through "crossover" and "mutation". Crossover is accomplished by combining the genes of two parent chromosomes (combining the weights of two parent weight vectors) in one or more pre-specified places. In this case, crossover involves exchanging the first half set of weights of an odd-number new chromosome with the second half set of weights of the following even-numbered new chromosome. Mutation is accomplished by assigning a random value to a randomly selected gene. The mutation process allows for increased robustness of the process. The new generation is then presented to the network, and the process continues until convergence.
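The training loop described above can be sketched as follows. This is a simplified illustration and not the Dorsey and Mayer (1994) algorithm itself: the selection rule (probability proportional to how far a chromosome's SSE lies below the worst SSE), the mutation probability, the weight range, the linear output node, and all function names are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x, n_hidden):
    """One-hidden-layer network; `weights` is a flat chromosome of genes."""
    n_in = len(x) + 1
    xs = [1.0] + list(x)
    hidden = [1.0]
    for j in range(n_hidden):
        w_j = weights[j * n_in:(j + 1) * n_in]
        hidden.append(sigmoid(sum(w * v for w, v in zip(w_j, xs))))
    betas = weights[n_hidden * n_in:]
    return sum(b * h for b, h in zip(betas, hidden))  # linear output (assumption)


def sse(weights, data, n_hidden):
    """Objective function: sum of squared errors over the training sample."""
    return sum((predict(weights, x, n_hidden) - y) ** 2 for x, y in data)


def train_ga(data, n_inputs, n_hidden, pop_size=20, generations=200,
             mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    n_weights = n_hidden * (n_inputs + 1) + n_hidden + 1
    half = n_weights // 2
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    best, best_err = None, float("inf")
    for _ in range(generations):
        errors = [sse(c, data, n_hidden) for c in population]
        for chromosome, err in zip(population, errors):
            if err < best_err:                     # remember the best set of weights seen
                best, best_err = list(chromosome), err
        worst = max(errors)
        fitness = [worst - e + 1e-12 for e in errors]   # lower SSE, higher fitness
        parents = rng.choices(population, weights=fitness, k=pop_size)
        children = [list(p) for p in parents]
        # Crossover: swap the first half of one child with the second half of the next.
        for a in range(0, pop_size - 1, 2):
            first, second = children[a][:half], children[a + 1][-half:]
            children[a][:half], children[a + 1][-half:] = second, first
        # Mutation: occasionally assign a random value to a randomly selected gene.
        for child in children:
            if rng.random() < mutation_rate:
                child[rng.randrange(n_weights)] = rng.uniform(-1.0, 1.0)
        population = children
    return best, best_err


# Toy training sample: learn y = x1 * x2 on a small grid.
toy = [((a / 3.0, b / 3.0), (a / 3.0) * (b / 3.0)) for a in range(4) for b in range(4)]
weights, error = train_ga(toy, n_inputs=2, n_hidden=3)
print(error)
```

Keeping a copy of the best chromosome found so far is a convenience added here so that the returned error can never be worse than that of the initial population; the paper does not say whether its GA retains the best solution in this way.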
With most predictive models the developer must specify the functional form of the relationship between the variables involved. For example, when applying Black’s (1976) OPM to option pricing, the assumption that futures prices are lognormally distributed is required. With a neural network, knowledge of the true functional form of the underlying relationship is not necessary; and a neural network can be used to approximate any continuous function to any desired degree of accuracy, i.e. multilayer feedforward networks are universal approximators [Hornik, Stinchcombe, and White (1989)]. Basically, the NN is developing an internal representation of the relationship between the independent and dependent variables; and therefore, no a priori assumptions about the underlying distributions are necessary [Salchenberger, Cinar and Lash (1992)]. This characteristic can lead to reduced development time because the researcher does not have to spend time and resources trying to model the underlying relationships.
A further characteristic of NNs is generalization. Because a neural network can generalize, it can handle variations in inputs (such as partial or imperfect data) and still produce a correct output. Furthermore, according to Altman et al. (1994), NNs are able to handle imprecise variables and changes in model relationships over time. Therefore, the NN is able to adapt gradually to the appearance of new cases which may contain valuable information about possible changes in the relationships among variables.
Neural networks do have their disadvantages. One of the criticisms of neural networks is that the internal structure of the model is unknown. According to Eliot (1995)
One of the traditional disadvantages of neural networks is the incredible complexity of the neural interconnections’ internal nuances and their logical meaning in terms of the problem to be solved. In most instances, users can only inspect and understand the output of the neural network, while the internal guts are treated as a mysterious black box that magically (and hopefully) generates the right kinds of outputs for the inputs provided.
Thus, most reliability tests of NNs have been limited to examining the NN's output. Other objections to neural networks include the large computational resources necessary and the lack of a formal theory for determining the optimal network topology [Salchenberger, Cinar and Lash (1992)].
This study develops a Genetic Adaptive Neural Network (GANN) to price put and call options on 3-month Eurodollar futures. The inputs are the futures rate F(t), the strike rate (100 - K), the annualized volatility of the underlying futures contract ($\nu_t$), and the time to maturity ($\tau$). A number of network topologies were tested. All of the networks tested have one input layer consisting of four (4) input nodes, one hidden layer, and one output layer consisting of one node. Based on minimizing MSE, it was determined that the neural networks which provided superior performance for this particular data set were those with 18 hidden layer nodes. The networks were trained until no further improvement in SSE was obtained; they were then trained for an additional 30,000 generations in a fine-tuning process. After training, the GANN's results are compared with the actual option prices and with the Chen and Scott option pricing model (CS OPM) approximation.
The data for this study are the 3-month Eurodollar futures contracts and the options on 3-month Eurodollar futures, which were provided by LIFFE. The underlying security on the 3-month Eurodollar interest rate futures option is one 3-month Eurodollar futures contract, with a contract size of $1 million. Delivery months for the futures are March, June, September, and December; and the delivery day is the first business day after the last trading day. In turn, the last trading day is two business days prior to the third Wednesday of the delivery month. Cash settlement is based on the Exchange Delivery Settlement Price (EDSP). The EDSP is based on the British Bankers’ Association Interest Settlement Rate (BBAISR) for 3-month Eurodollar deposits at 11:00 am on the last trading day. The settlement price is 100 minus the BBAISR. The minimum price movement is 0.01, which equates to $25.00.
The futures option is American style and can be exercised on any business day prior to 5:00 p.m. Delivery must be made on the first business day after the exercise day. Expiration occurs at 12:30 p.m. on the last trading day. The last trading day for the futures option is the last trading day of the 3-Month Eurodollar futures contract. The minimum price movement for the futures option is .01 ($25) and the exercise price intervals are .25 (0.25 percent).
When a futures option on LIFFE is purchased, the buyer does not pay the option premium but posts an initial margin. According to LIFFE’s Summary of Futures and Options Contracts
Option positions are marked-to-market daily. This marking-to-market generates positive or negative variation margin flows. If an option is exercised by the buyer, the buyer is required to pay the original contract price to the Clearing House and the Clearing House will pay the original option price to the seller on the following business day. Such payments are netted against the variation margin balances of the buyer and seller by the Clearing House.
The margin account has a maintenance margin which is the level at which the trader must infuse more funds (called variation margin) into his margin account if losses ensue. Any amount in the margin account in excess of the initial margin may be withdrawn by the trader.
The data cover the period from September, 1990, through July 1994. Due to the size of this data set (143,636 observations) it was decided to examine the period covering January 1994 through July 1994. The first date for which trading information is available is January 4, 1994, and the last day is July 29, 1994. There are 10,231 observations for this period. Descriptive statistics for this data set are provided in Table 1.
R(t) is defined as the futures rate and is calculated as 100 - f(t) where f(t) is the futures price. $\sigma$ is the implied volatility as calculated by Black’s (1976) futures option pricing model. Implied volatility is that volatility which forces the option pricing model’s calculated price to equal the observed market option price. $\upsilon$ is used to denote the annualized volatility (standard deviation) of the futures rate and is calculated using:
(10a)
then
(10b)
and
(10c)
where $PR_t$ is the price relative on day t, $F_t$ is the futures rate on day t, $F_{t-1}$ is the futures rate on day t - 1, and
, is the variance of the price relative. For this study, the previous 60 trading days were used (T = 60) to compute the daily variance. Finally, the daily variance is used to compute the annualized volatility (standard deviation) as
, (10d)
where it is assumed that there are 250 trading days in a year. The remaining variables, $\tau$, K, C, and P, represent the option’s maturity stated as a fraction of a year, the option’s strike (exercise) price, market call price and market put price, respectively. Frequency distributions for the futures rate, annualized volatility, call option price, and put option price are provided in Figures 3 - 6 (See attachments).
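A historical volatility calculation of the kind described above might look like the sketch below. It rests on assumptions that the text does not confirm: the price relative is taken in logarithmic form, ln(F_t / F_{t-1}), and the daily variance is the sample variance with a divisor of T - 1. Only the 60-day window and the 250-day annualization come from the text; the function name and the sample series are hypothetical.

```python
import math


def annualized_volatility(futures_rates, window=60, trading_days=250):
    """Annualized standard deviation of the futures rate from daily observations.

    Uses the most recent `window` daily log price relatives (an assumption;
    the text does not say whether logs are taken) and the sample variance.
    """
    recent = futures_rates[-(window + 1):]          # window + 1 rates give `window` relatives
    log_relatives = [math.log(curr / prev) for prev, curr in zip(recent, recent[1:])]
    n = len(log_relatives)
    mean = sum(log_relatives) / n
    daily_variance = sum((x - mean) ** 2 for x in log_relatives) / (n - 1)
    return math.sqrt(trading_days * daily_variance)


# Hypothetical futures rates that alternate by about one percent each day.
rates = [5.0 * math.exp(0.01 * (i % 2)) for i in range(61)]
print(annualized_volatility(rates))
```

Swapping in simple (non-logarithmic) price relatives or a divisor of T would change the result only slightly for daily data, but the choice should match whatever definition the original equations used.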
A training data set and five different validation (holdout) sets were drawn from the data. For the training set (TRAIN1), 2,000 values were randomly selected without replacement from the 8,887 observations in the Real Data set over the period January 4, 1994, through June 9, 1994.
The remaining observations over this time period were segregated for a validation sample (HOLDOUT1). The HOLDOUT1 sample was drawn from the same data set as the training data set, TRAIN1, to test the trained GANN's ability to interpolate. To test the trained (optimized) GANN's ability to price options outside the training set (extrapolation), four additional validation samples were constructed. HOLDOUT2 consists of all observations over the period July 1 through July 8 and contains 384 observations. HOLDOUT3 consists of all observations over the period July 11 through July 15 and contains 320 observations. HOLDOUT4 and HOLDOUT5 cover the periods July 18 through July 22 and July 25 through July 29 respectively, and each contains 320 observations (all data sets are from 1994). Descriptive statistics for these samples are provided in Table 2.
V. Results
The results of the neural networks’ ability to price futures-style interest rate options will be presented in this section. Call and put option results will be presented separately.
The results of the GANN's ability to approximate call option prices are presented in Table 3.
MSEs and MAEs are reported for both the GANN approximation and the CS OPM approximation. Error terms were calculated as the neural network predicted call price ($C_{GANN}$) minus the target price (the market price, C) or as the CS predicted price ($C_{CS}$) minus the target price. The data were divided into a number of sub-samples based on the degree of moneyness. A call option was determined to be deep in-the-money or deep out-of-the-money if M > 1 or M < -1, respectively. The variable M is defined as the strike rate minus the futures rate for call options.
The results for the entire training sample (2000 observations) indicate a MSE and MAE of 0.00047 and 0.01589 for the GANN and 0.02012 and 0.07743 for the CS OPM. Recall that the minimum size tick-move for the 3-month Eurodollar futures option is .01. This means that the average absolute pricing error from the GANN is roughly 1.6 times the minimum price move allowed by LIFFE, whereas the average absolute error for the CS OPM is more than seven (7) times the minimum tick move.
Both models performed best when approximating out-of-the-money (M < -.01) and deep out-of-the-money (M < -1) options and worst when approximating the at-the-money category (-.01 < M < .01). This is true for both the training sample and the holdout sample (6887 observations). The errors are smaller for the GANN than for the CS OPM for all the sub-samples. For most of the sub-samples, the CS MAE is larger than the GANN MAE by a factor of about 5; that is, the average absolute error for the CS OPM is about five times that of the GANN.
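The error measures and moneyness sub-samples discussed above can be computed with a short routine such as the following sketch. The thresholds are the ones quoted in the text; note that, as quoted, the categories overlap (a deep out-of-the-money option is also out-of-the-money). The function names and the sample records are hypothetical.

```python
def mse_mae(errors):
    """Mean squared error and mean absolute error of a list of pricing errors."""
    n = len(errors)
    return sum(e * e for e in errors) / n, sum(abs(e) for e in errors) / n


# Moneyness categories for calls, with M = strike rate minus futures rate.
BUCKETS = {
    "deep in-the-money": lambda m: m > 1,
    "in-the-money": lambda m: m > 0.01,
    "at-the-money": lambda m: -0.01 < m < 0.01,
    "out-of-the-money": lambda m: m < -0.01,
    "deep out-of-the-money": lambda m: m < -1,
}


def errors_by_moneyness(records):
    """records: iterable of (M, model price, market price) tuples."""
    records = list(records)
    results = {}
    for name, belongs in BUCKETS.items():
        errs = [model - market for m, model, market in records if belongs(m)]
        if errs:                                   # skip empty sub-samples
            results[name] = mse_mae(errs)
    return results


# Hypothetical observations: (M, model price, market price).
sample = [(1.5, 1.00, 0.90), (0.0, 0.20, 0.30), (-2.0, 0.05, 0.05)]
for name, (mse, mae) in errors_by_moneyness(sample).items():
    print(f"{name}: MSE={mse:.5f} MAE={mae:.5f}")
```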
Both paired t-tests and Wilcoxon signed-ranks tests were employed to determine if the observed differences were significant. The hypotheses tested were:
HA0: The GANN call option price is the same as the market call option price. Alternatively, $C_{GANN} - C = 0$.
HA1: The GANN call option price is not the same as the market call option price. Alternatively, $C_{GANN} - C \neq 0$.
and
HB0: The error produced by the GANN approximation is greater than or equal to the error produced by the CS OPM approximation. Alternatively, $C_{GANN} - C \geq C_{CS} - C$.
HB1: The error produced by the GANN approximation is less than the error produced by the CS OPM approximation. Alternatively, $C_{GANN} - C < C_{CS} - C$.
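For Hypothesis A, the paired-comparison t-statistic is the mean pricing error divided by its standard error. The sketch below computes that statistic and its degrees of freedom using only the standard library; the function name and the sample prices are hypothetical, and the comparison against a critical value is left to the reader.

```python
import math


def paired_t_statistic(predicted, actual):
    """t-statistic and degrees of freedom for H0: mean(predicted - actual) = 0."""
    diffs = [p - a for p, a in zip(predicted, actual)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    return mean / math.sqrt(variance / n), n - 1


# Hypothetical model prices and market prices for four options.
model_prices = [1.1, 2.1, 2.9, 4.2]
market_prices = [1.0, 2.0, 3.0, 4.0]
t_stat, dof = paired_t_statistic(model_prices, market_prices)
print(t_stat, dof)
```

The same function could in principle be applied to paired absolute errors from the two models when examining Hypothesis B, although the text does not state exactly how the errors were paired for that test.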
Results from the paired-comparison t-tests are presented in Table 4.
Hypothesis A cannot be rejected at the .01 level for any of the sub-samples in the training data set or in the holdout set. At the .05 level, the hypothesis of equivalence between the GANN approximation and the market price can only be rejected for the out-of-the-money category in the holdout sample. Hypothesis B is rejected at the .01 level for all but one category, at-the-money calls in the training sample. At the .05 level, the hypothesis that the GANN errors are greater than or equal to the CS OPM errors is rejected for every sub-sample in both the training and holdout data sets.
The Wilcoxon signed-ranks test results are presented in Table 5.
At the .05 level for the training data set, there is insufficient evidence to reject Hypothesis A for any sub-sample, and Hypothesis B is rejected for every sub-sample. This is true at the .01 level also, with the exception of the at-the-money category, in which Hypothesis B cannot be rejected. For the holdout sample, the hypothesis of equivalence between the GANN approximation and the market price is rejected for the deep out-of-the-money, out-of-the-money and complete sample categories. At the .05 level, Hypothesis A is rejected for the deep in-the-money and just out-of-the-money categories. Hypothesis B is strongly rejected for all sub-samples, which implies that the errors produced by the GANN approximation are smaller than the errors produced by the CS approximation.
To gain further insights into the performance of the models, OLS regressions are utilized to test for the existence of pricing biases. The regression results from these tests are presented in Table 6.
In examining the CS OPM, the existence of pricing biases with respect to degree of moneyness, time to maturity, and volatility is confirmed. The model with volatility (u) as the explanatory variable has the highest R², and the model with time to maturity (t) as the explanatory variable has the smallest R². For the GANN, the existence of pricing biases cannot be confirmed at the .01 level for either the training sample or the holdout sample.
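A bias regression of this kind can be sketched as a one-variable OLS fit of the pricing error on a single explanatory variable. The Python below is a minimal illustration with made-up volatility and error values; the function name and the data are assumptions, not the specification or results behind Table 6.

```python
import math
from statistics import mean

def simple_ols(x, y):
    """Regress y on x with an intercept; return (intercept, slope, r_squared, t_slope)."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    r_squared = 1 - sse / sst
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return intercept, slope, r_squared, slope / se_slope

# Hypothetical data, NOT study data: errors that grow with volatility.
volatility = [0.10, 0.12, 0.15, 0.18, 0.20, 0.25]
pricing_error = [0.031, 0.039, 0.056, 0.069, 0.082, 0.104]

intercept, slope, r2, t_slope = simple_ols(volatility, pricing_error)
print(f"slope = {slope:.3f}, R^2 = {r2:.3f}, t = {t_slope:.2f}")
```

A slope that is significantly different from zero indicates that the pricing error varies systematically with the explanatory variable, which is what a pricing bias means here.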
Altogether, the evidence for call options is straightforward. The pricing errors produced by the GANN are not significantly different from zero with the exception of the out-of-the-money, deep out-of-the-money and perhaps overall sub-samples. Further, the errors produced by the GANN are smaller (as measured by MSE and MAE) than those produced by the CS OPM. The evidence supports the hypothesis that the GANN errors are significantly smaller than the CS OPM errors. Finally, the CS OPM appears to produce pricing biases related to the degree of moneyness, the time to maturity and the volatility of the underlying futures contract. Such spurious relationships were not found with the GANN call approximations.
To test the GANN’s ability to approximate put prices on the 3-month Eurodollar futures option traded on LIFFE, a neural network consisting of four (4) input nodes, eighteen (18) hidden layer nodes and one output node was developed and trained on the 2000 observation training data set. As with call options, when this network’s training was completed, it was presented with the 6887 observation holdout data set (HOLDOUT1). Put price approximations were also calculated with the CS OPM for both data sets. Error terms were calculated as the approximation price minus the actual market price for both models (PGANN - P and PCS - P for the GANN and the CS OPM respectively). MSEs and MAEs are presented in Table 7.
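The two error measures can be written out directly. The following Python sketch defines the error as the approximation price minus the market price, as in the text, and computes the MSE and MAE; the three prices are hypothetical and serve only to exercise the functions.

```python
def mse(approx, market):
    """Mean squared error of approximation prices against market prices."""
    errors = [a - m for a, m in zip(approx, market)]
    return sum(e ** 2 for e in errors) / len(errors)

def mae(approx, market):
    """Mean absolute error of approximation prices against market prices."""
    errors = [a - m for a, m in zip(approx, market)]
    return sum(abs(e) for e in errors) / len(errors)

# Hypothetical prices, NOT study data.
model_prices = [0.10, 0.20, 0.30]
market_prices = [0.11, 0.18, 0.30]

print(f"MSE = {mse(model_prices, market_prices):.6f}")
print(f"MAE = {mae(model_prices, market_prices):.6f}")
```

MSE penalizes large individual errors more heavily than MAE does, which is why the two measures can rank sub-samples differently.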
For the entire training data set (2000 observations) the GANN’s MSE and MAE are 0.00044 and 0.01598 as compared to 0.02012 and 0.07743 for the CS OPM. These numbers are similar to those reported for call options. Both the GANN and the CS OPM produced the smallest errors (as measured by MAE) in the deep in-the-money (M > 1) category. The GANN also had small errors in the deep out-of-the-money (M < -1) and the in-the-money (M > .01) training sample categories. The CS OPM’s next smallest errors were recorded in the in-the-money category. The largest errors for both models were in the at-the-money sub-sample (-.01 < M < .01).
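The sub-sample definitions can be made concrete with a small classifier. The Python sketch below uses the thresholds quoted in the text (M > 1, M > .01, -.01 < M < .01, M < -1, .01 < M < 1). The out-of-the-money and just out-of-the-money bounds are assumed by symmetry, and the treatment of values that fall exactly on a boundary is also an assumption. Note that the categories overlap: every in-the-money option is also either just or deep in-the-money.

```python
def moneyness_categories(m):
    """Return the list of sub-samples an option with moneyness m belongs to."""
    cats = ["complete sample"]
    if m > 0.01:
        cats.append("in-the-money")
        # M == 1 is treated as "just" in-the-money (assumption).
        cats.append("deep in-the-money" if m > 1 else "just in-the-money")
    elif m < -0.01:
        # Out-of-the-money bounds mirror the in-the-money ones (assumption).
        cats.append("out-of-the-money")
        cats.append("deep out-of-the-money" if m < -1 else "just out-of-the-money")
    else:
        # Boundary values of +/-.01 fall here (assumption).
        cats.append("at-the-money")
    return cats

for m in (1.5, 0.5, 0.0, -0.5, -1.5):
    print(m, moneyness_categories(m))
```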
For the holdout sample, the largest error for both models was in the at-the-money sub-sample. The smallest errors for both models were in the in-the-money and deep in-the-money sub-samples. The errors are smaller for the GANN than for the CS OPM in all sub-samples and, as measured by MAE, are often smaller by a factor of 5 or more. The smallest difference in MAE for the two models occurred in the deep in-the-money category. For this sub-sample the MAE was 0.0157 and 0.01377 in the training data set and 0.01608 and 0.01328 in the holdout sample for the GANN and the CS OPM, respectively.
In order to gain further insights on the observed pricing errors, both paired comparison t-tests and Wilcoxon signed-ranks tests were employed to determine if the errors were significant. Once again, examination of the errors indicated the possibility of non-normality, necessitating the use of non-parametric tests. The hypotheses tested were:
HA0: The GANN put option price is the same as the market put price. Alternatively, PGANN - P = 0.
HA1: The GANN put option price is not the same as the market put price. Alternatively, PGANN - P ≠ 0.
and,
HB0: The error produced by the GANN approximation is greater than or equal to the error produced by the CS OPM approximation. Alternatively, PGANN - P ≥ PCS - P.
HB1: The error produced by the GANN approximation is smaller than the error produced by the CS OPM approximation. Alternatively, PGANN - P < PCS - P.
Hypothesis A is used to determine if the GANN approximation is significantly different from the market put price while Hypothesis B is used to determine if the GANN is a "better" (in terms of smallest error produced) approximator of market prices than the CS OPM.
Results from the paired-comparison t-tests are presented in Table 8.
At the .01 level, the hypothesis of equivalence between the GANN approximation and the market price (Hypothesis A) can be rejected only for the just in-the-money (.01 < M < 1) category for the training sample. For the holdout sample, Hypothesis A can be rejected at the .01 level for the just in-the-money and the deep in-the-money (M > 1) sub-samples only. Hypothesis A can be rejected for the deep in-the-money category at the .05 level in the training sample.
Hypothesis B is rejected at the .01 level for every sub-sample in the holdout set. In the training set, Hypothesis B is rejected at the .01 level for every sub-sample except the at-the-money (-.01 < M < .01) category, for which it is rejected at the .05 level. This is compelling evidence that the GANN is indeed a better approximator of put prices than the CS OPM.
The results for the Wilcoxon signed-ranks tests are presented in Table 9.
Hypothesis A can only be rejected at the .01 level for the just in-the-money (.01 < M < 1) training sample category. For the holdout sample, the hypothesis of equivalence can be rejected at the .01 level for the in-the-money (M > .01) and the just in-the-money categories only. At the .05 level, Hypothesis A can be rejected for the deep in-the-money (M > 1) holdout category only. This is further evidence that the GANN approximation is not significantly different from the actual market put price.
As can be seen from Table 9, the errors produced by the GANN are significantly smaller than the pricing errors produced by the CS OPM. For the holdout sample, Hypothesis B is rejected at the .01 level for every sub-sample. Furthermore, for the training sample, Hypothesis B is rejected for every sub-sample at the .01 level, except for the at-the-money category, which is rejected at the .05 level.
Tests for pricing biases were conducted for the two models tested, the results of which are presented in Table 10.
For the CS OPM, the existence of systematic pricing errors is confirmed for all of the parameters tested in both the training and holdout samples. The regression models with volatility as the explanatory variable produce the largest R² measures. The regression model with time to maturity as the explanatory variable produces the smallest R² in the holdout sample for the CS OPM. None of the explanatory variables is significant in any of the regression equations for the GANN. This is true for both the training and holdout data sets. Thus, the results imply that the moneyness bias, the maturity bias and the volatility bias are not present in the GANN but do exist and are confirmed in the CS OPM.
Collectively, the evidence for put options is similar to that found with call options. The pricing errors produced by the GANN were not found to be significantly different from zero for most of the sub-samples examined. The errors do appear significant for the in-the-money, just in-the-money and possibly the deep in-the-money categories. Further, the errors produced by the GANN are smaller (as measured by MSE and MAE) than those produced by the CS OPM. The evidence supports the hypothesis that the GANN errors are significantly smaller than the CS OPM errors. Finally, the CS OPM appears to produce pricing biases related to the degree of moneyness, the time to maturity and the volatility of the underlying futures contract. As with call options, such spurious relationships were not found with the GANN put approximations.
C. Additional Analysis of GANN Methodologies
One of the advantages of neural networks is their ability to incorporate additional inputs, if necessary, to improve predictive performance. The inputs in the above analysis were limited to those used by the CS OPM, as the goal was to test the GANN’s option pricing abilities using the same inputs as the CS OPM. As the superior performance of the GANN has been established, new models incorporating additional inputs can be tested.
To test the GANN’s ability to incorporate additional economic data, additional neural networks were developed that had six (6) input layer nodes. These inputs were the original four inputs utilized in the original GANNs plus two additional variables. These new variables included a proxy for the degree of moneyness (M), calculated as (100 - K) - F(t), and the 3-month Eurodollar (ED) interest rate (ED03). The 3-month ED rate was obtained from the Chicago Federal Reserve’s on-line market data, available through that bank’s World Wide Web page.
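To make the expanded input set concrete, the Python sketch below assembles a six-element input vector from the four original inputs plus the moneyness proxy M = (100 - K) - F(t) and the ED03 rate. The function name, the ordering of the inputs, and the sample values are assumptions made for illustration; the paper does not specify them here.

```python
def gann2_inputs(f_t, strike_price, time_to_maturity, volatility, ed03):
    """Build a six-input vector for a GANN2-style network (ordering is an assumption)."""
    strike_rate = 100.0 - strike_price  # strike rate = 100 - K
    moneyness = strike_rate - f_t       # M = (100 - K) - F(t)
    return [f_t, strike_rate, time_to_maturity, volatility, moneyness, ed03]

# Hypothetical observation, NOT study data.
x = gann2_inputs(f_t=5.75, strike_price=94.00, time_to_maturity=0.25,
                 volatility=0.15, ed03=5.80)
print(x)
```

Although M is a linear combination of two existing inputs, supplying it directly may make the relationship easier for the network to learn with a fixed number of hidden nodes.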
The MSEs and MAEs for the new models (denoted GANN2) are presented in Table 11.
To ease comparison of the new networks with the original networks, the original GANNs’ results are repeated in this table also (denoted GANN1). To strictly examine the effect of incorporating additional inputs, both GANNs have 18 hidden layer nodes.
As can be seen from Table 11, the errors produced by the GANN2 models are often half the size of the errors produced by the GANN1 models. In some sub-samples the MAE is less than the minimum sized price move for the 3-month ED futures options examined. For call options, the MAE is less than .01 (the minimum tick move allowed) for the deep in-the-money, out-of-the-money and deep out-of-the-money sub-samples. For put options, the MAE is less than .01 for in-the-money, deep in-the-money, and deep out-of-the-money options. Thus, the average error produced by the GANN2 models is less than the minimum price move for the above mentioned options. Also, the GANN2 errors (as measured by MSE and MAE) are smaller than the GANN1 errors for both calls and puts in all sub-samples.
Both parametric and non-parametric tests were conducted to determine if the observed error differences were statistically significant. The hypothesis tested was:
H0: The average absolute pricing error produced by the GANN2 models is greater than or equal to the average absolute pricing error produced by the GANN1 models. Alternatively, CGANN2 - C ≥ CGANN1 - C and PGANN2 - P ≥ PGANN1 - P.
H1: The average absolute pricing error produced by the GANN2 models is less than the average absolute pricing error produced by the GANN1 models. Alternatively, CGANN2 - C < CGANN1 - C and PGANN2 - P < PGANN1 - P.
Paired comparison t-tests and Wilcoxon signed-ranks test results are presented in Table 12.
In reviewing Table 12, it is clear that the observed differences are statistically significant for many of the sub-samples. Utilizing the paired-comparison t-test, at the .05 level, the hypothesis that the GANN2 errors are greater than or equal to the GANN1 errors is rejected for the complete sample (6887 observations) and the deep out-of-the-money categories for calls and for the in-the-money category for puts. At the .01 level, the hypothesis is also rejected for out-of-the-money and just out-of-the-money calls and just in-the-money and deep in-the-money put options.
Based on the Wilcoxon test, the hypothesis can be rejected at the .05 level for call options in every category except in-the-money, just out-of-the-money and at-the-money options. In addition, the hypothesis can be rejected at the .01 level for deep out-of-the-money calls. For put options, the hypothesis is rejected at the .05 level for every category except deep out-of-the-money and at-the-money options. In addition, the hypothesis can be rejected at the .01 level for the complete sample, in-the-money, just in-the-money and out-of-the-money categories. It is apparent from these results that the GANN has been able to reduce pricing errors by incorporating additional economic data as neural network inputs.
A final test of the GANNs developed in this study is based on their ability to generalize and handle variations in input. This is accomplished by presenting the trained GANNs with new information. Recall that four additional holdout data sets were constructed (HOLDOUT2 through HOLDOUT5). HOLDOUT2 consists of all observations over the period July 1 through July 8 and contains 384 observations. HOLDOUT3 through HOLDOUT5 each have 320 observations and cover the periods July 11 through July 15, July 18 through July 22, and July 25 through July 29. Each holdout sample progressively moves farther away from the training data in time.
Table 13 reports the pricing errors (as measured by MAE) for the different holdout samples. The errors increase through time, confirming a priori expectations.
Table 13
COMPARISON OF GANN MEAN ABSOLUTE ERRORS FOR VARIOUS HOLDOUT SAMPLES

                           Calls                  Puts
Holdout Sample     N       GANN1      GANN2      GANN1      GANN2
HOLDOUT2           384     0.02282    0.01483    0.02937    0.02459
HOLDOUT3           320     0.02736    0.01801    0.03450    0.02900
HOLDOUT4           320     0.03107    0.02607    0.03923    0.03405
HOLDOUT5           320     0.03542    0.02820    0.04141    0.03615

GANN1 has 4 input nodes and 18 hidden layer nodes while GANN2 has 6 input nodes and 18 hidden layer nodes.
Referring to Table 2 presented earlier, the distribution of the inputs for the above holdout samples appears to be significantly different from those for the training data set. Thus, it was expected that the pricing errors would be larger for the above holdout samples than they were for the HOLDOUT1 sample. The errors reported in Table 13 are also statistically different from zero at a .01 level of significance. It should be noted, however, that the errors produced by the GANN1 and GANN2 models are significantly smaller than those produced by the CS OPM. Consequently, although the errors do increase in size through time, the GANN still provides better call and put price approximations than the CS OPM.
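The pattern in Table 13 can be checked with simple arithmetic. The Python sketch below copies the MAE figures from the table and computes, for each holdout sample, the proportional reduction in MAE achieved by GANN2 relative to GANN1. The variable and function names are illustrative only.

```python
# MAE values copied from Table 13: (N, calls GANN1, calls GANN2, puts GANN1, puts GANN2)
table13 = {
    "HOLDOUT2": (384, 0.02282, 0.01483, 0.02937, 0.02459),
    "HOLDOUT3": (320, 0.02736, 0.01801, 0.03450, 0.02900),
    "HOLDOUT4": (320, 0.03107, 0.02607, 0.03923, 0.03405),
    "HOLDOUT5": (320, 0.03542, 0.02820, 0.04141, 0.03615),
}

def reduction(gann1_mae, gann2_mae):
    """Proportional reduction in MAE from GANN1 to GANN2."""
    return 1.0 - gann2_mae / gann1_mae

call_reductions = {}
put_reductions = {}
for name, (n, c1, c2, p1, p2) in table13.items():
    call_reductions[name] = reduction(c1, c2)
    put_reductions[name] = reduction(p1, p2)
    print(f"{name}: calls {call_reductions[name]:.1%}, puts {put_reductions[name]:.1%}")

# Check that each MAE column rises from one holdout sample to the next.
columns = list(zip(*[row[1:] for row in table13.values()]))
errors_increase = all(all(a < b for a, b in zip(col, col[1:])) for col in columns)
print("MAE increases through time in every column:", errors_increase)
```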
The above analysis indicates the need to update the neural network connection weights on a periodic basis, as new information becomes available. This could be done on a weekly or daily basis. The ability to incorporate new information easily is, in fact, one of the advantages of neural networks. With traditional option pricing models, a priori assumptions about the underlying distribution of the independent variables must be made. If it turns out these assumptions are invalid, an entirely new model must be derived. With a neural network, any changes in market forces that alter the relationship between the input variables and the output variables, or that change the distribution of the variables themselves, are easily captured by retraining the network to incorporate this new information.
This research examined the ability of a Genetic Adaptive Neural Network to accurately approximate prices for the 3-month Eurodollar futures option traded on the London International Financial Futures and Options Exchange (LIFFE). Neural networks were developed that use the futures rate, the strike rate (100 - the strike price), the time to maturity, and the historical volatility of the underlying futures contract as inputs. Collectively, the evidence confirms that the GANN was able to accurately approximate the real call and put values for the 3-month Eurodollar futures option traded on LIFFE, as the pricing errors produced by the GANN were not found to be significantly different from zero in many of the sub-samples. Although significant pricing errors were found for some of the sub-samples examined, the magnitudes of the errors were small.
The errors produced by the GANN were found to be smaller than those produced by the CS OPM. In many cases, the mean absolute error generated by the CS OPM was 5 times larger than the mean absolute error generated by the GANN. Also, the CS OPM appears to generate pricing biases with respect to the degree of moneyness, the time to maturity, and volatility. As an additional advantage of the GANN over the CS OPM, such spurious relationships were not found in the GANN models.
Additionally, the GANN’s ability to incorporate additional economic information was tested. Neural networks were developed that included two additional input variables, a proxy for the degree of moneyness and the 3-month Eurodollar interest rate. These networks generated pricing errors (as measured by MAE) that were often half the size of the pricing errors produced by the original networks. Furthermore, many of the errors were less than .01. The significance of this fact is that the minimum allowable price move for the 3-month ED futures option traded on LIFFE is .01. This means that, in those sub-samples, the average error generated by the GANN is less than the minimum tick move for the options being examined.
Finally, the neural networks were presented with new data to test their ability to generalize. As expected, the pricing errors increased as the input characteristics from the new data diverged from the original training data characteristics. It is important to note that, under these conditions, the GANN still generated errors that were smaller than the errors produced by the CS OPM. The results of this final analysis indicate the need to update the neural network connection weights on a periodic basis as new information becomes available.
The results of this study could be of great value to an investor (either a potential writer or buyer of 3-month Eurodollar futures options) as there are no closed form solutions for valuing these types of options. This requires the investor to utilize some type of approximation that is both accurate and fast. The problem is that the available approximation techniques are not very accurate for a variety of reasons. According to Hull (1993):
"Interest rate options are more difficult to value than stock options, currency options, index options, and most futures options. This is partly because we are dealing with a whole term structure – not a single variable. It is also partly because the behavior of interest rates is relatively complicated. . . . Many of the yield curve models that have been proposed have the disadvantage that they are not consistent with the term structure of interest rates at the time the model is built."
At the time of this writing, very few studies (Lieu [1990], Kuo [1991], and Chen & Scott [1993]) have addressed this type of option. The studies that have pertained to this issue disagree as to whether or not a risk-free rate should be included in the option pricing model. The consensus is that if both the option buyer and writer (seller) have to post equal margins, the risk-free rate should drop out.
If an investor were to utilize a neural network to approximate option prices, many of the above issues would be addressed. An investor could train his or her own network, periodically updating the weights. The connection weights could then be used to approximate call and put option prices. Although there are computational costs involved in training a network, the cost of evaluating a trained network is negligible. In fact, the connection weights and input variables could be used in conjunction with a spreadsheet to produce an instantaneous price approximation. The GANNs developed in this study meet both of the investor’s criteria: the approximations are both fast and accurate.
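To illustrate why evaluating a trained network is cheap, the Python sketch below runs a forward pass through a network with four inputs, eighteen hidden nodes and one output, matching the architecture described in the text. The sigmoid hidden activation, the linear output node, the bias terms, and the randomly generated placeholder weights are all assumptions; the weights are not those estimated in this study, so the printed value is not a meaningful price.

```python
import math
import random

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Evaluate a one-hidden-layer network: sigmoid hidden nodes, linear output (assumed)."""
    hidden = []
    for weights, bias in zip(w_hidden, b_hidden):
        z = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(1.0 / (1.0 + math.exp(-z)))
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

N_INPUTS, N_HIDDEN = 4, 18

# Placeholder weights drawn at random; a trained network would supply these values.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-1, 1)

# Hypothetical inputs: futures rate, strike rate, time to maturity, historical volatility.
observation = [5.75, 6.00, 0.25, 0.15]
price = forward(observation, w_hidden, b_hidden, w_out, b_out)
print(f"network output: {price:.5f}")
```

The whole evaluation is a few dozen multiplications and additions plus eighteen exponentials, which is consistent with the point that the same calculation could be laid out in a spreadsheet.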
Altman, E., M. Giancarlo, and F. Varetto, "Corporate Distress Diagnosis: Comparisons Using Linear Discriminant Analysis and Neural Networks (the Italian experience)," Journal of Banking and Finance, 18 (1994), 505-29.
Black, F., "The Pricing of Commodity Contracts," Journal of Financial Economics, 3 (March 1976), 167-79.
Black, F., and M. Scholes, "The Pricing of Options and Corporate Liabilities," Journal of Political Economy, 1973, V81, 637-654.
Chen, Ren-Raw, and Louis Scott, "Pricing Interest Rate Futures Options with Futures-Style Margining," Journal of Futures Markets, 1993, v13(1), 15-22.
Coats, Pamela K., and L. Franklin Fant, "Recognizing Financial Distress Patterns Using a Neural Network Tool," Financial Management, 1993, v22(3), 142-155.
Collins, E., S. Ghosh and C. Scofield, "An Application of a Multiple Neural Network Learning System to Emulation of Mortgage Underwriting Judgements," Proceedings of the IEEE International Conference on Neural Networks, 1988, 2, 459-466.
Cox, J., S. Ross and M. Rubinstein, "Option Pricing: A Simplified Approach," Journal of Financial Economics, 1979, V7, 229-263.
Dorsey, R.E., R.O. Edmister and J.D. Johnson, "Financial Distress Prediction: A Neural Net Model for Large Corporations," Working Paper, Department of Economics and Finance, University of Mississippi, 1993.
Dorsey, R.E., J.D. Johnson and W.J. Mayer, "A Genetic Algorithm for the Training of Feedforward Neural Networks," Advances in Artificial Intelligence in Economics, Finance, and Management, (J.D. Johnson and A.B. Whinston, eds., pp. 93-111), Vol. I, 1994, Greenwich, CT: JAI Press Inc.
Dorsey, R.E. and W.J. Mayer, "Genetic Algorithms for Estimation Problems with Multiple Optima, Non-Differentiability, and Other Irregular Features," Journal of Business and Economic Statistics, Vol. 13, No. 1, pp. 53-66.
________________, "Optimization Using Genetic Algorithms," Advances in Artificial Intelligence in Economics, Finance, and Management, (J.D. Johnson and A.B. Whinston, eds.), Vol. I, Greenwich, CT: JAI Press Inc., pp. 69-91.
Dutta, S., and S. Shekhar, "Bond Rating: A Non-conservative Application of Neural Networks," Proceedings of the IEEE International Conference on Neural Networks, 1988, 2, 443-450.
Hatfield, Gay B., and A. Jay White, "Bond Rating Changes: Neural Net Estimates for Bank Holding Companies," 1996, Working Paper.
Hornik, K., M. Stinchcombe and H. White, "Multilayer Feedforward Networks are Universal Approximators," Technical Report 88-45R, 1989, University of California at San Diego.
_____________, "Universal Approximation of an Unknown Mapping and its Derivatives Using Multilayer Feedforward Networks," Technical Report 89-36R, 1990, University of California at San Diego.
Hull, John, Options, Futures, and Other Derivative Securities, 2nd Ed., Prentice-Hall, 1993, 190-244, 329-411, and 434-452.
Hull, J. C., and Alan White, "The Pricing of Options on Assets with Stochastic Volatilities," Journal of Finance, 42 (June 1987), 281-300.
_____________, "An Analysis of the Bias in Option Pricing caused by a Stochastic Volatility," Advances in Futures and Options Research, 3 (1988), 27-61.
_____________, "Pricing Interest Rate Derivative Securities," Review of Financial Studies, 1990, 3(4), 573-92.
Keen, John, "Genetic Algorithms for Market Trading," AI in Finance, Winter 1995, 25-9.
Lieu, Derming, "Option Pricing With Futures-Style Margining," Journal of Futures Markets, 1990, v10(4), 327-338.
Rendleman, R., Jr., and B. Bartter, "Two-State Option Pricing," Journal of Finance, December 1979, 1093-1110.
Rumelhart, D. E., G. E. Hinton and R. J. Williams, "Learning Internal Representations by Error Propagation," Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. I, D.E. Rumelhart and J.L. McClelland (Eds.), MIT Press: Mass., 1986, 318-62.
Salchenberger, L. M., E. M. Cinar and N. A. Lash, "Neural Networks: A New Tool for Predicting Thrift Failures," Decision Sciences, 23, 1992, 899-916.
Sexton, R., J. Johnson and R. Dorsey, "Obtaining a Global Optimum for Neural Networks," Working Paper, Department of Economics and Finance, University of Mississippi, 1995.
Shandle, J., "Neural Networks are Ready for Prime Time," Electronic Design, February 18, 1993, 51-58.
White, H., "An Additional Hidden Unit Test for Neglected Nonlinearity in Multi-layer Feedforward Networks," Proceedings of the International Joint Conference on Neural Networks, 1989, Washington D.C.
_____________, "Neural-Network," AI Expert, December 1989, pp. 48-52.