Christoph Freudenthaler
Doctoral student in the PhD program, 01.06.2012 – 31.12.2012. Currently working as algorithm and product developer at Robart GmbH, Linz, Austria.

project description
Factorization models, in particular Singular Value Decomposition, are an important and well-known technique for multivariate data analysis. The general principle is to represent data as multilinear combinations of a smaller set of latent variables. This latent variable representation can be used in various contexts: because the set of latent variables is small, it is used extensively for data compression, data visualization, and dimensionality reduction. It is also popular for improving the predictive performance of regression models. In my PhD thesis I have been studying factorization in various ways and was successful in understanding and extending existing factorization models in terms of model statement and inference algorithms.
During my stay with the Graduiertenkolleg, I worked on several follow-up research questions related to factorization models. The first was improving the prediction quality of factorization models when they are used to predict rankings, i.e. sorted lists, of items. To this end, the existing stochastic gradient learning algorithm, which samples training cases uniformly, was extended to non-uniform sampling. Together with Prof. Rendle, I proposed adapting the sampling distribution over training cases to the target criterion that the stochastic gradient algorithm aims to optimize. We were able to show that a general non-uniform training case sampling strategy has two advantages over uniform sampling: the learning of the factorization model's parameters converges faster, and the predictive performance improves. These findings are to be published and are currently under review for the Web Search and Data Mining Conference 2014.
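The difference between uniform and non-uniform sampling of training cases can be illustrated with a small sketch. It is not the algorithm under review: the pairwise (BPR-style) update, the softmax weighting over current scores, and the names `sample_negative` and `sgd_step` are assumptions chosen for illustration only.

```python
import math
import random


def dot(u, v):
    """Score of a user/item pair in a plain matrix factorization model."""
    return sum(a * b for a, b in zip(u, v))


def sample_negative(user_vec, item_vecs, positives, rng, uniform=True):
    """Pick the negative item for one pairwise SGD training case.

    uniform=True : every non-positive item is equally likely.
    uniform=False: items the current model scores highly are more likely
                   (softmax over scores). This weighting is an assumption
                   standing in for a distribution adapted to the ranking
                   criterion.
    """
    candidates = [j for j in range(len(item_vecs)) if j not in positives]
    if uniform:
        return rng.choice(candidates)
    scores = [dot(user_vec, item_vecs[j]) for j in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def sgd_step(user_vec, item_vecs, pos, neg, lr=0.05, reg=0.01):
    """One update that pushes score(pos) above score(neg), in place."""
    p, n = item_vecs[pos], item_vecs[neg]
    diff = dot(user_vec, p) - dot(user_vec, n)
    # sigmoid(-diff), written with tanh so it cannot overflow
    g = 0.5 * (1.0 - math.tanh(diff / 2.0))
    for f in range(len(user_vec)):
        u_f, p_f, n_f = user_vec[f], p[f], n[f]
        user_vec[f] += lr * (g * (p_f - n_f) - reg * u_f)
        p[f] += lr * (g * u_f - reg * p_f)
        n[f] += lr * (-g * u_f - reg * n_f)
```

With the non-uniform option, negative items that the model already ranks low are rarely drawn, so most updates are spent on pairs that still violate the desired ranking.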
A second topic we worked on was the fast generation of ranking predictions from an already estimated factorization model. Recommender systems, which try to suggest interesting items to users, face the problem that a tiny subset of really interesting items must be selected from a huge number of items and ranked according to the preferences of the requesting user. The brute-force approach of computing the complete ranking is infeasible in such scenarios if it has to be done for many different users simultaneously, as is the case in the common deployment of recommender systems in online shops. To overcome this scalability problem, we developed a faster selection of really interesting items by truncating the computation of the ranking scores generated by a factorization model.
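One possible reading of truncated score computation is sketched below. It is an assumption for illustration, not the method developed in the project: all items are first ranked by a partial dot product over the first few latent factors, and full scores are computed only for a short candidate list. The function names and the parameters `n_factors` and `shortlist` are hypothetical.

```python
import heapq


def full_score(user_vec, item_vec):
    """Complete ranking score: dot product over all latent factors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


def truncated_top_k(user_vec, item_vecs, k, n_factors, shortlist):
    """Approximate top-k selection.

    Step 1: score every item using only the first n_factors factors.
    Step 2: keep the `shortlist` best items under that partial score.
    Step 3: compute full scores for the shortlist and return the best k.
    The result can differ from the exact top-k when the dropped factors
    matter; larger n_factors or shortlist trades speed for accuracy.
    """
    def partial_score(j):
        return sum(user_vec[f] * item_vecs[j][f] for f in range(n_factors))

    candidates = heapq.nlargest(shortlist, range(len(item_vecs)), key=partial_score)
    return heapq.nlargest(
        k, candidates, key=lambda j: full_score(user_vec, item_vecs[j])
    )
```

Setting `shortlist` to the number of items reduces this to the brute-force ranking, which makes it easy to measure how much accuracy a given truncation costs.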
publications
The following list of publications covers only those that are or were published during participation in the Graduiertenkolleg / PhD program.