Acta Univ. Agric. Silvic. Mendelianae Brun. 2018, 66(6), 1573-1580 | DOI: 10.11118/actaun201866061573

Text-Mining in Streams of Textual Data Using Time Series Applied to Stock Market

Pavel Netolický, Jonáš Petrovský, František Dařena
Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 613 00 Brno, Czech Republic

Large volumes of text data are generated every day. This data comes from various sources and may contain valuable information. In this article, we use text-mining methods to examine whether there is a connection between news articles and changes in the S&P 500 stock index. The index values and documents were divided into time windows according to the direction of the index value changes. We achieved a classification accuracy of 65-74%.
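
The windowing idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a time window is a run of consecutive days on which the index moved in the same direction, that days without a change are skipped, and that each document takes the label of the window containing its publication date. The function names, the index values, and the article texts are all hypothetical.

```python
from datetime import date
from itertools import groupby


def direction_windows(series):
    """Split a chronologically sorted [(day, value), ...] series into windows
    of consecutive days on which the value moved in the same direction."""
    # Label each day (starting with the second) by the sign of its change.
    moves = []
    for (_, prev), (day, value) in zip(series, series[1:]):
        if value != prev:  # assumption: days with no change are skipped
            moves.append((day, "up" if value > prev else "down"))
    # Merge consecutive days that share a direction into one window.
    windows = []
    for label, run in groupby(moves, key=lambda m: m[1]):
        days = [day for day, _ in run]
        windows.append((days[0], days[-1], label))
    return windows


def label_documents(documents, windows):
    """Give each (day, text) document the direction of the window it falls in.
    Documents outside every window are dropped."""
    labeled = []
    for day, text in documents:
        for start, end, label in windows:
            if start <= day <= end:
                labeled.append((text, label))
                break
    return labeled


# Made-up index values for illustration only (not real S&P 500 data).
index = [
    (date(2018, 1, 2), 100.0),
    (date(2018, 1, 3), 101.0),
    (date(2018, 1, 4), 103.0),
    (date(2018, 1, 5), 102.0),
    (date(2018, 1, 8), 99.0),
]
news = [
    (date(2018, 1, 3), "hypothetical article A"),
    (date(2018, 1, 8), "hypothetical article B"),
]

windows = direction_windows(index)
labeled = label_documents(news, windows)
```

The labeled pairs could then serve as training data for a text classifier that predicts the direction of the index change. The choice of classifier and text representation is left open here.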

Keywords: machine learning, text mining, stock market, data stream
Grants and funding:

This research was supported by the Czech Science Foundation [grant No. 16-26353S "Sentiment and its Impact on Stock Markets"] and by the Internal Grant Agency of Mendel University [No. PEF_DP_2018002 "Knowledge mining in continuous textual sources with a changing concept" and No. PEF_DP_2018016 "Text analysis by machine learning with a focus on the stock market"].

Published: December 19, 2018

Netolický, P., Petrovský, J., & Dařena, F. (2018). Text-Mining in Streams of Textual Data Using Time Series Applied to Stock Market. Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, 66(6), 1573-1580. doi: 10.11118/actaun201866061573


This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0), which permits non-commercial use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.