Acta Univ. Agric. Silvic. Mendelianae Brun. 2018, 66(6), 1573-1580 | DOI: 10.11118/actaun201866061573
Text-Mining in Streams of Textual Data Using Time Series Applied to Stock Market
- Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 613 00 Brno, Czech Republic
Large volumes of text data are generated every day. These data come from various sources and may contain valuable information. In this article, we use text mining methods to examine whether news articles are connected with changes in the S&P 500 stock index. The index values and documents were divided into time windows according to the direction of the index value changes. We achieved a classification accuracy of 65-74%.
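The abstract does not spell out how the time windows are built, so the following is only a minimal sketch of one plausible reading: consecutive index moves in the same direction are merged into one window, and each document receives the label of the window in which it was published. The function names, the handling of flat days, the rule for boundary days, and all sample values are assumptions made for illustration, not the authors' procedure or data.

```python
from itertools import groupby


def direction_windows(series):
    """Merge consecutive same-direction moves into (start_day, end_day, label) windows.

    `series` is a list of (day, index_value) pairs sorted by day.
    Assumption: days with no change are skipped and do not break a window.
    """
    moves = []
    for (day0, value0), (day1, value1) in zip(series, series[1:]):
        if value1 == value0:
            continue  # flat day: ignored in this sketch
        moves.append((day0, day1, "up" if value1 > value0 else "down"))

    windows = []
    for label, run in groupby(moves, key=lambda move: move[2]):
        run = list(run)
        windows.append((run[0][0], run[-1][1], label))
    return windows


def label_documents(documents, windows):
    """Attach a window label to each (day, text) document.

    Assumption: a document belongs to the window with start <= day < end,
    so a boundary day is assigned to the window that begins on it.
    Documents outside every window are dropped.
    """
    labelled = []
    for day, text in documents:
        for start, end, label in windows:
            if start <= day < end:
                labelled.append((text, label))
                break
    return labelled


# Hypothetical index values and headlines, invented for this example only.
# ISO date strings are used because they sort chronologically as plain text.
index_series = [
    ("2018-01-02", 100.0),
    ("2018-01-03", 101.5),
    ("2018-01-04", 102.0),
    ("2018-01-05", 101.0),
    ("2018-01-08", 100.2),
    ("2018-01-09", 100.9),
]
news = [
    ("2018-01-03", "headline A"),
    ("2018-01-05", "headline B"),
    ("2018-01-08", "headline C"),
]

windows = direction_windows(index_series)
training_pairs = label_documents(news, windows)
print(windows)
print(training_pairs)
```

The resulting (text, label) pairs could then be passed to any text classifier. Which classifier and which text features the study actually used is not stated in the abstract.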
Keywords: machine learning, text mining, stock market, data stream
Grants and funding:
This research was supported by the Czech Science Foundation [grant No. 16-26353S "Sentiment and its Impact on Stock Markets"] and Internal Grant Agency of Mendel University [No. PEF_DP_2018002 "Knowledge mining in continuous textual sources with a changing concept"] and Internal Grant Agency of Mendel University [No. PEF_DP_2018016 "Text analysis by machine learning with a focus on the stock market"].
Published: December 19, 2018
This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0), which permits non-commercial use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.