University of Konstanz
Graduiertenkolleg / PhD Program
Computer and Information Science

Colloquium of the Department and the PhD Program

title

Peer-to-peer web search with Minerva

speaker

Prof. Dr. Gerhard Weikum, Max-Planck Institute for Informatics, Saarbrücken, Germany

date & place

Thursday, 01.02.2007, 16:15 h
Room G421

abstract

The peer-to-peer (P2P) computing paradigm is an intriguing alternative to Google-style search engines for querying and ranking Web content. In a network with many thousands or millions of peers, the storage and access load requirements per peer are much lighter than for a centralized server farm; thus more powerful techniques from information retrieval, statistical learning, computational linguistics, and ontological reasoning can be employed on each peer's local search engine to boost the quality of search results. In addition, peers can dynamically collaborate on advanced and particularly difficult queries. Moreover, a peer-to-peer setting is ideally suited to capturing local user behavior, such as query logs and click streams, and to disseminating and aggregating this information in the network, at the discretion of the corresponding user, in order to incorporate richer cognitive models.

On the other hand, P2P Web search also poses major challenges, one of them being the computation, dissemination, and efficient management of statistical measures that are crucial for good search strategies and ranking algorithms. Statistics (e.g., local and global document frequencies, overlap among peers' contents, PageRank-style authority) need to be acquired and maintained in a decentralized manner for scalability, need to be compact for efficient communication, and need to provide sufficiently accurate estimators of various measures of interest. This talk will give an overview of our ongoing research on P2P Web search, with emphasis on statistics-driven query routing, decentralized PageRank computation, and exploitation of user behavior. The developed methods have been implemented in the Minerva prototype system, an experimental testbed for P2P research.