University of Konstanz
Graduiertenkolleg / PhD Program
Computer and Information Science

Graduation Talks


Visual Text and Time Series Analysis - Understanding the Impact of News on Financial Data


Franz Wanner, University of Konstanz
Konstanz, Germany

date & place

Wednesday, 15.06.2011, 16:15 h
Room C 252


Economic researchers have long been interested in the factors that influence stock trading, and they use a range of models to explain returns. Even after controlling for many risk factors, the unexplained variation in returns is substantial and many anomalies remain unexplained. Recent research shows that automatically computed sentiment scores of news messages can explain some residual returns of stocks.1 Sentiment analysis means "the computational treatment of opinion, sentiment, and subjectivity in text". The results of this research also show that the applied method is consistent with investor reactions.
Nevertheless, some asset movements still could not be explained. The sentiment analysis methods used are straightforward and are not adapted to the domain of business and financial news. The analysis done so far relied on the General Inquirer web service (software for content analysis of textual data), which applies English stemming procedures and integrates a sentiment dictionary. However, the General Inquirer was "developed for social-science content-analysis research applications" and is therefore not well suited to financial news sentiment analysis. This is also reported by other researchers.4 They state that the Harvard sentiment lexicon, which is an integral part of the General Inquirer, misclassifies words commonly used in financial texts. More sophisticated, domain-specific sentiment analysis methods might help to increase the explanatory power for market variables and to understand how investors react to textual news.
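The dictionary-based approach described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the word lists are invented toy lexicons (not the General Inquirer or Harvard lists), `crude_stem` is a rough stand-in for a real English stemmer, and `FINANCE_OVERRIDES` is a hypothetical domain adaptation that treats routine financial terms as neutral.

```python
import re

# Toy general-purpose lexicon, invented for illustration only.
GENERAL_LEXICON = {"gain": 1, "strong": 1, "loss": -1, "liability": -1, "tax": -1}

# Hypothetical domain adaptation: routine financial terms become neutral.
FINANCE_OVERRIDES = {"liability": 0, "tax": 0}
FINANCE_LEXICON = {**GENERAL_LEXICON, **FINANCE_OVERRIDES}


def crude_stem(token):
    """Very rough plural stripping; a stand-in for a real English stemmer."""
    if token.endswith("ss"):
        return token
    for suffix in ("ies", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            return stem + "y" if suffix == "ies" else stem
    return token


def sentiment_score(text, lexicon):
    """Sum of per-word polarities divided by the number of tokens."""
    tokens = [crude_stem(t) for t in re.findall(r"[a-z]+", text.lower())]
    if not tokens:
        return 0.0
    return sum(lexicon.get(t, 0) for t in tokens) / len(tokens)


headline = "Company reports deferred tax liabilities and strong gains"
general = sentiment_score(headline, GENERAL_LEXICON)
finance = sentiment_score(headline, FINANCE_LEXICON)
print(general, finance)
```

With the toy general lexicon, the accounting terms cancel out the positive words in the headline. With the overrides, only the positive words contribute. This is the kind of difference a domain-specific lexicon is meant to capture.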
"News stories consist of many dimensions. Language, content, tone, sentiment and grammar are just a few," wrote Graf. Besides improving automatic sentiment processing, it is worthwhile to take a closer look at the whole set, or subsets, of text features over time. One main advantage of using economic models here is that they can help isolate areas of interest within the time series, which makes it easier to focus on changes in the text feature space. Currently, we are able to extract more than 200 text features, and each feature can be represented as a time series. The hypothesis is that not only sentiment indicates abnormal asset movements, but that other text features and combinations of features also have an impact. An interesting question from a computer science perspective is how to visualize multiple feature time series and the patterns found in them.
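As a rough illustration of representing each text feature as a time series, the following sketch aggregates a few simple features per day and then restricts the series to a window of interest. The three features, the function names, and the sample headlines are hypothetical. The more than 200 features mentioned above are not reproduced here, and the window would in practice come from an economic model rather than being chosen by hand.

```python
import re
from collections import defaultdict
from datetime import date


def extract_features(text):
    """Compute a few simple, illustrative text features for one news item."""
    tokens = re.findall(r"[A-Za-z]+", text)
    n = len(tokens)
    return {
        "token_count": float(n),
        "avg_word_length": sum(len(t) for t in tokens) / n if n else 0.0,
        "exclamation_count": float(text.count("!")),
    }


def feature_time_series(news):
    """Map (date, text) pairs to {feature: [(date, daily mean), ...]}, sorted by date."""
    per_day = defaultdict(list)
    for day, text in news:
        per_day[day].append(extract_features(text))
    series = defaultdict(list)
    for day in sorted(per_day):
        rows = per_day[day]
        for name in rows[0]:
            series[name].append((day, sum(r[name] for r in rows) / len(rows)))
    return dict(series)


def restrict_to_window(series, start, end):
    """Keep only the points inside an area of interest (inclusive bounds)."""
    return {
        name: [(d, v) for d, v in points if start <= d <= end]
        for name, points in series.items()
    }


news = [
    (date(2011, 6, 14), "Profits rise"),
    (date(2011, 6, 14), "Shares fall sharply!"),
    (date(2011, 6, 15), "Markets calm"),
]
series = feature_time_series(news)
window = restrict_to_window(series, date(2011, 6, 15), date(2011, 6, 15))
```

Each entry of `series` is one feature's time series, so a visualization would have to display many such series side by side.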
Three main research goals have been identified so far: (1) the improvement of sentiment analysis in the financial domain; (2) the development of visual time series analysis methods to support pattern discovery over multiple time series; (3) the generation of new insights for economic researchers, which can be used to modify their models and to validate the usefulness of the suggested methods.
This thesis was developed as part of an interdisciplinary collaboration with F. Graf, Department of Economics, Chair of International Finance, Prof. Dr. Franke.