University of Konstanz
Graduiertenkolleg / PhD Program
Computer and Information Science

Graduation Talks

title

Visualizing Large Document Collections for Efficient Browsing and Recognition of Documents

speaker

Hendrik Strobelt, University of Konstanz
Konstanz, Germany

date & place

Wednesday, 14.07.2010, 14:45 h
Room C 252

abstract

Large document collections are essential resources for a wide variety of professionals, such as scientists, lawyers, and analysts. Traditionally, the tedious task of managing these (printed) document collections was handled by document curators (e.g. librarians). With increasingly diverse ways of accessing documents and a growing number of available documents, the curating task has shifted more and more to the document reader, who, for example, downloads documents from the WWW. An electronic document management system can assist the user in this task.
Research questions in creating a document management system range from the raw data level (searching, text mining, etc.) to human factors (perception, integration in human environments). In my thesis, I will investigate the visualization of large document collections at scopes ranging from a single document to collections of documents. In contrast to common approaches, I use a combination of textual and figure content as representatives for a single document and for subsets of collections (clusters). My research will cover information retrieval methods (text and image), layout methods (for different scopes), and interaction strategies (LOD strategies, Linkback, ...).