ECML PKDD 2006 Workshop on Parallel Data Mining
Invited talk #2
Monitoring Distributed Data Streams
Prof. Assaf Schuster
Abstract: Monitoring data streams in a distributed system has been the focus of much research in recent years. A distributed monitoring task consists of accurately detecting, at each point in time, whether the data complies with a certain global criterion. One example of a distributed monitoring task is using agents installed on a set of routers to detect when traffic to a certain IP address rises above a predetermined threshold. Another example is detecting when the average temperature reading taken by sensors in a sensor network exceeds a predetermined threshold.
Most of the proposed monitoring schemes deal with monitoring simple aggregated values. More involved challenges, such as the important task of feature selection (e.g., by monitoring the information gain of various features), or monitoring the variance in the temperature readings taken by sensors in a sensor network, still incur very high communication overhead when handled with naive, centralized algorithms.
We present a novel geometric approach by which an arbitrary global monitoring task can be split into a set of constraints applied locally on each of the streams. The constraints are used to locally filter out data increments that do not affect the monitoring outcome, thus avoiding unnecessary communication. As a result, our approach enables monitoring of arbitrary threshold functions over distributed data streams in an efficient manner.
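As a rough illustration of what a locally checkable geometric constraint might look like, the Python sketch below tests whether a ball spanned by a node's last shared estimate and its current drifted vector lies entirely on one side of the threshold. The abstract does not spell out the construction, so the ball-based test, the choice of the squared norm as the monitored function, and all names (squared_norm_range_on_ball, local_constraint_holds, estimate, drift) are assumptions made for illustration, not necessarily the scheme presented in the talk.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def squared_norm_range_on_ball(center: Vector, radius: float) -> Tuple[float, float]:
    """Exact minimum and maximum of f(v) = ||v||^2 over a closed ball."""
    dist = math.sqrt(sum(c * c for c in center))
    lowest = max(0.0, dist - radius) ** 2   # closest point of the ball to the origin
    highest = (dist + radius) ** 2          # farthest point of the ball from the origin
    return lowest, highest


def local_constraint_holds(estimate: Vector, drift: Vector, threshold: float) -> bool:
    """Return True if f stays on one side of the threshold over the whole ball.

    The ball has the segment from `estimate` (the vector all nodes agreed on
    at the last synchronization) to `drift` (this node's view after applying
    its local changes) as its diameter. Both names are hypothetical.
    """
    center = [(e + d) / 2 for e, d in zip(estimate, drift)]
    radius = math.dist(estimate, drift) / 2
    lowest, highest = squared_norm_range_on_ball(center, radius)
    # Entirely at or below the threshold, or entirely above it.
    return highest <= threshold or lowest > threshold


if __name__ == "__main__":
    estimate = [1.0, 0.0]
    # A small local change: the node can stay silent.
    print(local_constraint_holds(estimate, [1.2, 0.1], threshold=4.0))
    # A large local change: the ball crosses the threshold, so the node must report.
    print(local_constraint_holds(estimate, [3.0, 0.0], threshold=4.0))
```

In this sketch a node communicates only when local_constraint_holds returns False. As long as every node's check passes, no messages are needed, which is the kind of filtering of irrelevant data increments described above.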
The talk will also present some follow-up results, as well as future directions enabled by this work.
Short Biography: Prof. Assaf Schuster is internationally recognized as an expert in the field of distributed computing, with over 140 publications to his name and vast experience with large industrial consortia, conferences and institutions.
In recent years, his research has increasingly focused on large-scale distributed data processing and data mining. Since 1991 he has been with the Computer Science Department at the Technion (the Israel Institute of Technology), where he established and heads the Distributed Systems Laboratory (http://dsl.cs.technion.ac.il).