ASSOCIATIVE NETWORKS FOR THE DAILY PREDICTION OF OPENING AND CLOSING VALUES OF STOCK INDICES
Simón Pedro J. Mejía Uribe 1, Henry Laniado Roda 2, Paula María Almonacid Hurtado 3 and Jimmy Saravia Matus 4
Received: 1/5/2023 - Accepted: 28/12/2023
ABSTRACT
This paper presents an associative neural network based on Long Short-Term Memory (LSTM) networks to predict the opening, minimum, maximum and closing prices of the Shanghai composite index, PetroChina Company Limited (PetroChina), and Zhongxing Telecommunications Equipment Corporation (ZTE). The data are transformed with time series techniques to make them stationary. Once good results are obtained in terms of the mean absolute percentage error (MAPE), the model is tested with the American Nasdaq Composite Index (IXIC). Similar works have been carried out, such as that of Ding & Qin (2020), who predict the opening, minimum and maximum prices of an asset. This study goes a step further and also predicts the closing value, following the proposed associative network methodology. With both the opening price and the closing price, it is possible to make investments that generate profitability based on the daily net change in the value of the asset.
KEYWORDS: associative neural network, LSTM, time series, index, prices
INTRODUCTION
Predicting the prices of stocks and financial indices is considered a complex problem because of the amount of noise in the system: prices show small or large fluctuations that cannot be explained with the available data. These fluctuations can be caused by multiple factors that affect the value of financial assets, such as news, a change in the laws of a country or a comment on social networks, or they can simply be white noise that cannot be tracked. Recent works take different approaches to this problem. Some authors seek to build more complex models, such as Ding & Qin (2020) with their associative network, which predicts the opening, minimum and maximum values of an asset for the following day. Others seek to solve the problem by obtaining data that better explain the system, such as Mehta et al. (2021), who integrated data from public opinion, sentiment analysis and historical data to train multiple models. Still others prefer to attack the noise problem directly, such as Wu et al. (2021) or Bao et al. (2017), who used wavelet transformations to reduce noise in the system.
The ability to correctly predict the behavior of the value of assets leads to profits for investors while reducing risk. Therefore, this work designs a technique that predicts the following day's opening, minimum, maximum and closing values of a stock or financial index simultaneously, using associative LSTM neural networks, which take advantage of the relationship that exists between asset prices. Similar works have been carried out, such as that of Ding & Qin (2020), where the opening, minimum and maximum values of an asset are predicted. This study goes a step further by predicting the closing value using the proposed associative network methodology. With the predictions of opening and closing values for the following day, it is possible to make investments that seek to profit from the daily net change in the value of the asset.
This study begins testing by using data from the Shanghai composite index, PetroChina and ZTE. These assets were selected in order to replicate the results obtained by Ding & Qin (2020). The characteristics used are money traded, net asset value change and the opening, minimum, maximum and closing values for each asset. The data is collected with a daily frequency.
As can be expected, the opening, minimum, maximum and closing values are found to be related to each other; therefore, the LSTM associative network technique is an interesting approach to this problem. This document validates the efficiency of LSTM associative networks in predicting related historical values and extends this good performance to indices from other market sectors.
The rest of this paper is organized as follows. In the second section we present the state of the art and our theoretical framework, where we discuss related works and the different techniques used in the literature. In the third section, we introduce the methodology, present the data and explain the analytical models used to construct the network. In the fourth section we present our results. Finally, in the last section we present our conclusions.
LITERATURE REVIEW
There are many related works that use different techniques and approaches to predict the future value of stocks. A general 3 by 3 forecasting model has been used to classify stocks into 3 categories: overvalued stocks, undervalued stocks, and reasonably priced stocks (Harel & Harpaz, 2021). Using this technique, investors can build strategies around the current state of stocks. Price behavior for the following day has also been classified into “High” or “Low” categories, corresponding to a rise or fall in price, using a decision tree trained on articles published worldwide in order to identify the words that most affect the market. The model was tested following this strategy and proved to have superior results to other recently studied techniques (Carta et al., 2021). The efficiency of different models when trained with continuous or discrete data has also been compared; better results were obtained with the RNN and LSTM models when trained with continuous data (Nabipour et al., 2020).
Using data from social networks where COVID-19 infections and deaths are reported, measurements of sentiment and events that affect the stock market have been extracted. This information has been used as input for SVM and Random Forest classification models to predict market movements for the next day, as shown in Almehmadi (2021). The latter study managed to obtain superior results by including data from the pandemic. Evans et al. (2021) combine financial stock data with data extracted from Twitter and test different combinations of these; the random forest model, despite requiring 37 characteristics, offered better results than other models such as K-Nearest Neighbors, which uses 9 characteristics to give its best result.
It has been shown that it is possible to make more abstract representations of the data using deep learning tools, deep canonical correlation analysis, and deep canonically correlated autoencoders. Results show that these models surpass traditional statistical and machine learning techniques (Huang et al., 2021). In another approach, the relevance of the variables is identified by combining the results of Logistic Regression, SVM and Random Forest. The variables identified as relevant are used as input to a deep generative model, which is shown to surpass other state-of-the-art approaches (Haq et al., 2021).
Wu et al. (2021) show that combining the discrete wavelet transform (DWT) for noise removal with an Extreme Learning Machine as the learning algorithm yields better results than other machine learning methods. Bao et al. (2017) combine wavelet transformations with stacked autoencoders and LSTM to build a model that reduces data noise and improves prediction accuracy and gains when put into practice.
Mehta et al. (2021) use SVM, MNB classifier, linear regression, Naive Bayes and LSTM network models to analyze data from public opinion, sentiment analysis and historical stock data to make predictions. The results show that considering these sources improves predictions. Nti et al. (2021) extract data from multiple sources and randomly select characteristics to train a hybrid CNN-LSTM network model, and present high accuracy results that validate the efficiency of taking data from multiple sources. Demi̇rel et al. (2020) compare Multilayer Perceptron (MLP), SVM and LSTM models as to their ability to predict daily opening and closing prices; their results show that the MLP and LSTM models are superior to the SVM model. Ding & Qin (2020) build an associative LSTM network which simultaneously predicts the opening, maximum and minimum values of an asset with an accuracy better than 95%.
Javed Awan et al. (2021) use big data techniques and structured and unstructured data from social networks to compare linear regression, random forest and decision tree models. Using large volumes of data, they identify that decision trees underperform compared to other models.
Zhang & Bai (2019) build and compare several models to identify stocks that have the highest probability of rising in price. Random Forest obtained the best results in prediction accuracy and in generating higher annual returns. Khattak et al. (2019) use KNN to classify stock trends and in this way identify future price increases or decreases. Their results show that their proposed model outperforms other simpler models.
Wang & Bai (2018) generate weak ANN models and demonstrate that it is possible to improve their predictions by creating ensembles of these models using boosting techniques. Wu et al. (2018) show that it is possible to effectively filter the noise in market characteristics by combining weak classifiers with AdaBoost, thereby obtaining better prediction results. Kohli et al. (2019) integrate historical stock data along with historical prices of commodities such as gold, oil and silver, among others, and show that an AdaBoost ensemble model provides better results than the other techniques used.
Nti et al. (2020) compare multiple ensemble techniques for different models in order to predict share prices. Their results show that Stacking and Blending ensembles demonstrated superior performance compared to Bagging and Boosting ensembles.
After reviewing related works in the literature, we decided to use a deep learning approach. Deep learning is a subset of artificial intelligence which uses neural networks that learn from large volumes of data. For this study, we use LSTM associative neural networks which, due to their ability to store information over time, are capable of relating events and finding trends that other types of networks cannot.
As a basis for the development of this paper we use the article "Study on the prediction of stock price based on the associated network model of LSTM" by Ding and Qin (2020). In their paper Ding and Qin build an associative network of great predictive performance, and use it to predict the opening, minimum and maximum prices of different stocks and financial indices.
METHODOLOGY
Data
We collect data for the Shanghai composite index (code 000001.SS), PetroChina (601857.SS) and ZTE (000063.SZ). The codes correspond to the identification numbers used on the stock exchanges where the assets are listed. We construct one dataset for each asset. The datasets contain daily data for the following variables:
• Open: Opening price.
• Low: Minimum price.
• High: Maximum price.
• Close: Closing price.
• Change: Change in the price of the asset.
• Money: Money traded during the day.
Table 1 shows an example of the data for the Shanghai composite index.
Table 1 - Small sample of the Shanghai composite index
The data are stationarized by subtracting from each value the value of the previous day, thus removing the trend and ensuring that the model learns deeper patterns in the data.
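The differencing step described above can be sketched as follows. This is a minimal illustration with NumPy, not the authors' code: the function names `difference` and `undifference` and the sample closing values are hypothetical. The inverse transform is included because predictions made on differenced data must be added back to the previous day's level to recover prices.

```python
import numpy as np


def difference(series):
    """First differences: d[t] = x[t] - x[t-1] (one element shorter than the input)."""
    x = np.asarray(series, dtype=float)
    return x[1:] - x[:-1]


def undifference(first_value, diffs):
    """Rebuild the original levels from the first value and the differences."""
    diffs = np.asarray(diffs, dtype=float)
    return np.concatenate(([first_value], first_value + np.cumsum(diffs)))


# Hypothetical closing values, used only to illustrate the transform.
closes = [3100.0, 3112.5, 3105.0, 3120.0]
diffs = difference(closes)                  # day-over-day changes
restored = undifference(closes[0], diffs)   # back to price levels
```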
For the purpose of parameter selection, we divide the data in two parts. One part is used for training and the other for testing the model. This allows us to verify that the model is generalizing and works correctly for data that was not used during training.
To evaluate the performance of the model as an investment system, we remove the data corresponding to the last 20 days and then use the model to predict these values. For each day, we evaluate the model's prediction of the direction of the asset price change. A positive movement in the price of an asset is predicted when the model indicates that the daily closing price will be higher than the opening price; otherwise, we interpret that a negative movement in the price of the asset is predicted. We then compare the real direction with that predicted by the model. If the directions are the same, this represents a profit whose magnitude is given by the real change in the price of the asset; otherwise, the result is a loss. Finally, we add the gains and losses for the last 20 days to obtain a 20-day return, which is equivalent to a monthly return.
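The evaluation rule can be illustrated with a short sketch. Two points are assumptions rather than details given in the text: the daily change is measured relative to the real opening price, and a day with no real change counts as a negative movement. The function name `directional_return` and the sample prices are hypothetical.

```python
import numpy as np


def directional_return(pred_open, pred_close, real_open, real_close):
    """Sum of daily gains and losses from trading on the predicted direction.

    Assumption: the daily change is (close - open) / open on the real prices.
    """
    pred_open = np.asarray(pred_open, dtype=float)
    pred_close = np.asarray(pred_close, dtype=float)
    real_open = np.asarray(real_open, dtype=float)
    real_close = np.asarray(real_close, dtype=float)

    predicted_up = pred_close > pred_open            # predicted direction
    real_change = (real_close - real_open) / real_open
    real_up = real_change > 0                        # real direction
    magnitude = np.abs(real_change)

    # Matching directions give a gain; mismatched directions give a loss.
    daily = np.where(predicted_up == real_up, magnitude, -magnitude)
    return daily.sum()


# Three hypothetical days: correct (up), correct (down), wrong (predicted up, fell).
total = directional_return(
    pred_open=[100.0, 100.0, 100.0],
    pred_close=[101.0, 99.0, 102.0],
    real_open=[100.0, 100.0, 100.0],
    real_close=[102.0, 99.0, 98.0],
)
```

In this toy case the two correct days contribute 2% and 1%, and the incorrect day costs 2%, so the total is 1%.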
Model
Long Short-Term Memory (LSTM) networks are a type of recurrent neural network that can store information from past events, which makes them suitable for sequence prediction problems. Each neuron has 3 gates: an input gate, a forget gate and an output gate.
We use the notation introduced by Ding and Qin (2020) in the development of the formulas.
Equations 1 and 2 show the activation functions used to update the state of the neuron. Equation 1 is the sigmoid function and equation 2 the tanh function.
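For reference, the two activation functions have the following standard definitions, numbered here to match the text:

```latex
\begin{align}
\sigma(x) &= \frac{1}{1 + e^{-x}} \tag{1} \\
\tanh(x) &= \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2}
\end{align}
```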
The forget gate determines what information can be discarded. We use equation 3 to calculate the probability of forgetting.
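In the standard LSTM formulation, which is assumed here to match the notation of Ding and Qin (2020), the forget gate can be written as:

```latex
\begin{equation}
f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{3}
\end{equation}
```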
Where h_{t-1} represents the output of the previous neuron, x_t represents the input to the current neuron, and W_f and b_f represent the weights and the bias vector, respectively.
The input gate is responsible for updating the information stored in the neuron. It combines the new information with the previous information and is calculated with equations 4, 5 and 6.
Finally, the output gate takes the current state C_t and filters it to obtain the final result of the neuron as shown in equations 7 and 8.
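Under the same assumption of the standard LSTM formulation, the input gate i_t, the candidate state \tilde{C}_t and the updated cell state C_t can be written as follows, where * denotes element-wise multiplication and W_i, b_i, W_C and b_C are the corresponding weights and biases:

```latex
\begin{align}
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{4} \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{5} \\
C_t &= f_t * C_{t-1} + i_t * \tilde{C}_t \tag{6}
\end{align}
```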
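In the standard formulation, again assumed to match the notation of Ding and Qin (2020), the output gate o_t and the output of the neuron h_t are:

```latex
\begin{align}
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{7} \\
h_t &= o_t * \tanh(C_t) \tag{8}
\end{align}
```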
To improve the generalization ability of the model and reduce overfitting, the Dropout regularization technique is used. With this technique, in each training cycle there is a probability that each neuron will be disconnected from the network. In this way the neurons do not become codependent on each other, which reduces overfitting.
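The mechanism can be sketched in a few lines of NumPy. This is an illustration only: the rescaling of the surviving activations by 1 / (1 - rate), known as inverted dropout, is the convention used by common deep learning frameworks and is an assumption here, as are the function name `dropout` and the rate of 0.5.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Randomly zero each activation with probability `rate` during training.

    Assumption: survivors are rescaled by 1 / (1 - rate) (inverted dropout),
    so the expected activation is unchanged and nothing is done at test time.
    """
    if not training or rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob   # True = neuron kept
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
h = np.ones((2, 8))                                    # hypothetical layer output
h_train = dropout(h, rate=0.5, rng=rng)                # entries are 0.0 or 2.0
h_eval = dropout(h, rate=0.5, rng=rng, training=False) # unchanged at test time
```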
Since the opening, minimum, maximum and closing prices of a stock are associated, an associative network based on LSTM networks is built that predicts these values simultaneously. The associative network is built from the LSTM base structure where each branch of the network predicts one of the values of interest.
As shown in equation 9, the mean square error (MSE) is used as the loss function for each branch. The lower the MSE, the better the accuracy.
In equation 9, the loss in each of the network outputs is calculated, where y_i and y'_i correspond to the actual values and predicted values, respectively, and n is the number of records.
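With these symbols, the mean square error of a single branch takes its usual form. The subscript j, which indexes the branch, is introduced here only for illustration:

```latex
\begin{equation}
L_j = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - y'_i\right)^2 \tag{9}
\end{equation}
```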
Equation 10 shows that the average of the loss functions of each branch is used as the cost function for the complete model.
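Written out, and assuming one branch for each of the four predicted values (m = 4: opening, minimum, maximum and closing), the average described above would be:

```latex
\begin{equation}
L = \frac{1}{m} \sum_{j=1}^{m} L_j \tag{10}
\end{equation}
```

Here L_j denotes the mean square error of branch j; both m and the index j are notation assumed for this illustration.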
Using this methodology, we will estimate the opening, minimum, maximum and closing values for the 3 datasets proposed in the data section. Afterwards, we will use the methodology to predict these values for other markets.
Parameter selection
For LSTM networks there is a time steps parameter that determines the number of previous time periods the network considers to predict the next value. For the selection of this parameter and the number of epochs, the network is evaluated at 5, 10, and 20 timesteps for 500 epochs for the 3 assets.
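The role of the timesteps parameter can be illustrated with a sliding-window sketch: each training sample contains the previous `timesteps` days, and the target is the following day. The function name `make_windows` and the toy data are hypothetical, and all columns are used as targets here for simplicity.

```python
import numpy as np


def make_windows(data, timesteps):
    """Build (samples, timesteps, features) inputs and next-day targets."""
    data = np.asarray(data, dtype=float)
    n_samples = len(data) - timesteps
    if n_samples <= 0:
        raise ValueError("need more rows than timesteps")
    # Sample i covers days i .. i+timesteps-1; its target is day i+timesteps.
    X = np.stack([data[i:i + timesteps] for i in range(n_samples)])
    y = data[timesteps:]
    return X, y


# Toy dataset: 12 days with 2 features per day.
data = np.arange(24, dtype=float).reshape(12, 2)
X, y = make_windows(data, timesteps=5)
# X has shape (7, 5, 2) and y has shape (7, 2).
```

With 5 timesteps, 12 days of data yield 7 samples; larger values of the parameter give the network more history per sample but fewer samples overall.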
Table 2 - Cost functions
Table 2 shows that a network with generalized parameters for all assets cannot be considered. For 000001.SS and 000063.SZ the cost function has better results with 5 timesteps, while for 601857.SS better results are obtained with 10 timesteps. For this reason, we decided to continue the tests maintaining the 5, 10 and 20 timesteps.
Figures 1, 2 and 3 show the training of the model at 500 epochs for the different timesteps. We decided to train the model for 250 epochs, where we find a good balance between convergence and training time.
Figure 1. Five timesteps training
Figure 2. Ten timesteps training
Figure 3. Twenty timesteps training
RESULTS
To the previously evaluated assets, we now add the Nasdaq Composite (^IXIC), which is one of the most relevant financial indices in the American stock market. Using this index, we seek to evaluate the behavior of our network for the case of an American financial asset.
We use the Mean Absolute Percentage Error (MAPE) metric to obtain a value that represents the network error in percentage terms. Since multiple values are predicted (open, minimum, maximum and close), multiple MAPE values are simultaneously obtained, which are averaged to obtain the overall MAPE of the network.
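The averaging of the per-output MAPE values can be sketched as follows. The sketch assumes that predictions have already been converted back to price levels, since MAPE divides by the actual value; the function name `mape` and the sample prices are hypothetical.

```python
import numpy as np


def mape(actual, predicted):
    """MAPE in percent, one value per output column."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual), axis=0)


# Columns: open, minimum, maximum, close (hypothetical price levels).
actual = np.array([
    [100.0, 98.0, 103.0, 101.0],
    [200.0, 196.0, 206.0, 202.0],
])
predicted = actual * 1.01          # every prediction is 1% too high

per_output = mape(actual, predicted)   # one MAPE per predicted value
overall = per_output.mean()            # overall MAPE of the network
```

Since every prediction in this toy case is off by 1%, each per-output MAPE and the overall MAPE are approximately 1%.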
Table 3 - MAPE
Table 3 shows the overall MAPE of the network for the different assets. Again, we find that it is not possible to have a network with fixed parameters that generates the best results for all assets.
The MAPE represents how accurate the model's prediction was with respect to the real values for the asset. However, good MAPE results do not necessarily guarantee that a model is capable of generating positive returns. For this reason, we now evaluate the theoretical returns of the model following the procedure outlined in the methodology.
Table 4 - Theoretical returns
Table 5 - Real returns
Table 4 shows interesting positive and negative returns, such as 16.01% for asset 000063.SZ at 10 timesteps and -4.61% for 000001.SS at 10 and 20 timesteps. As a point of comparison, Table 5 shows the real profitability of these assets during the same period of time. It can be seen that in all cases it is more profitable to rely on the predictions obtained by the network than to follow the strategy of buying and holding the assets. Even for 000001.SS, where the network generates a negative return, we obtain a smaller loss than by holding the asset.
CONCLUSIONS
We find that it is possible to build a highly accurate model using the methodology proposed in this article. This is evidenced by percentage prediction errors below 0.5% for the values of interest for each asset. The results section also shows that no single model makes accurate predictions for all assets, so the parameters need to be readjusted for each asset.
In all the cases we evaluate, the model exceeds the baseline investment strategy of buying and holding the asset. Even in the case where the model generates losses, these are lower than those generated by the baseline strategy. Theoretical returns show that the model could generate returns of up to 16% in one month.
REFERENCES
• Almehmadi, A. (2021). COVID-19 pandemic data predict the stock market. Computer Systems Science and Engineering, 36(3), 451–460.
• Bao, W., Yue, J., & Rao, Y. (2017). A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PloS One, 12(7), e0180944.
• Carta, S. M., Consoli, S., Piras, L., Podda, A. S., & Recupero, D. R. (2021). Explainable machine learning exploiting news and domain-specific lexicon for stock market forecasting. IEEE Access: Practical Innovations, Open Solutions, 9, 30193–30205.
• Demi̇rel, U., Çam, H., & Ünlü, R. (2020). Predicting stock prices using machine learning methods and deep learning algorithms: The sample of the Istanbul Stock Exchange. Gazi University Journal of Science. https://doi.org/10.35378/gujs.679103
• Ding, G., & Qin, L. (2020). Study on the prediction of stock price based on the associated network model of LSTM. International Journal of Machine Learning and Cybernetics, 11(6), 1307–1317.
• Evans, L., Owda, M., Crockett, K., & Fernandez Vilas, A. (2021). Credibility assessment of financial stock tweets. Expert Systems with Applications, 168, 114351.
• Haq, A. U., Zeb, A., Lei, Z., & Zhang, D. (2021). Forecasting daily stock trend using multi-filter feature selection and deep learning. Expert Systems with Applications, 168, 114444.
• Harel, A., & Harpaz, G. (2021). Forecasting stock prices. International Review of Economics & Finance, 73, 249–256.
• Huang, S.-C., Wu, C.-F., Chiou, C.-C., & Lin, M.-C. (2021). Intelligent FinTech data mining by advanced deep learning approaches. Computational Economics. https://doi.org/10.1007/s10614-021-10118-5
• Javed Awan, M., Shafry Mohd Rahim, M., Nobanee, H., Munawar, A., Yasin, A., & Mohd Zain Azlanmz, A. (2021). Social media and stock market prediction: A big data approach. Computers, Materials & Continua, 67(2), 2569–2583.
• Khattak, A. M., Ullah, H., Khalid, H. A., Habib, A., Asghar, M. Z., & Kundi, F. M. (2019). Stock market trend prediction using supervised learning. Proceedings of the Tenth International Symposium on Information and Communication Technology - SoICT 2019, Hanoi, Ha Long Bay, Viet Nam. https://doi.org/10.1145/3368926.3369680
• Kohli, P. P. S., Zargar, S., Arora, S., & Gupta, P. (2019). Stock prediction using machine learning algorithms. In Advances in Intelligent Systems and Computing (pp. 405–414). Springer Singapore.
• Mehta, P., Pandya, S., & Kotecha, K. (2021). Harvesting social media sentiment analysis to enhance stock market prediction using deep learning. PeerJ. Computer Science, 7, e476.
• Nabipour, M., Nayyeri, P., Jabani, H., Shahab, & Mosavi, A. (2020). Predicting stock market trends using machine learning and deep learning algorithms via continuous and binary data; A comparative analysis. IEEE Access: Practical Innovations, Open Solutions, 8, 150199–150212.
• Nti, I. K., Adekoya, A. F., & Weyori, B. A. (2020). A comprehensive evaluation of ensemble learning for stock-market prediction. Journal of Big Data, 7(1). https://doi.org/10.1186/s40537-020-00299-5
• Nti, I. K., Adekoya, A. F., & Weyori, B. A. (2021). A novel multi-source information-fusion predictive framework based on deep neural networks for accuracy enhancement in stock market prediction. Journal of Big Data, 8(1). https://doi.org/10.1186/s40537-020-00400-y
• Wang, C., & Bai, X. (2018). Boosting learning algorithm for stock price forecasting. IOP Conference Series. Materials Science and Engineering, 322, 052053.
• Wu, D., Wang, X., & Wu, S. (2021). A Hybrid Method Based on Extreme Learning Machine and Wavelet Transform Denoising for Stock Prediction. Entropy, 23(4). https://doi.org/10.3390/e23040440
• Wu, Y.-P., Mao, J.-M., & Li, W.-F. (2018). Predication of Futures Market by Using Boosting Algorithm. In 2018 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET). https://doi.org/10.1109/wispnet.2018.8538586
• Zhang, C., & Bai, Y. (2019, December). Chinese A share stock ranking with machine learning approach. 2019 6th International Conference on Information Science and Control Engineering (ICISCE), Shanghai, China. https://doi.org/10.1109/icisce48695.2019.00047