University of Konstanz
Graduiertenkolleg / PhD Program
Computer and Information Science

Graduation Talks

title

Processing and Visualizing Large XML Instances

speaker

Christian Grün, University of Konstanz
Konstanz, Germany

date & place

Wednesday, 09.07.2008, 16:15 h
Room C252

abstract

"XML Documents are small", "XML fits into main memory": these statements were valid 10 years ago when most documents were limited to single data or configuration files. Nowadays, very large XML instances exist, such as the Wikipedia corpus or biological and medical data, and it seems obvious to use databases to handle data of this order of magnitude. At the same time, XML is still emerging, and many new standards are coming into existence, such as the XQuery 1.0 language specification which has been finalized as recently as January 2008. Our research is centered around the following questions:
- How can we store the tree structure of XML documents that are several gigabytes in size?
- Can we efficiently perform structure- and content-based queries?
- How can large XML documents be visualized and interacted with in real time?
BaseX, an open-source database, incorporates our research efforts and was implemented to allow realistic performance comparisons with other databases.