Please note that the venue of the conference has changed. DS-2008 will take place in Budapest, Hungary.

Tutorials

Joao Gama
(University of Porto, Portugal)

Title: Mining from Data Streams: Issues and Challenges

Tutorial Summary

The Machine Learning community faces new challenges with the advent of sources that produce continuous flows of data. Examples of streaming data include sensor networks, customer click streams, telephone records, web logs, multimedia data, and sets of retail chain transactions. These data sources are characterized by a high-speed flow of huge amounts of data generated from non-stationary distributions. As a consequence, new learning techniques are needed to process streaming data in reasonable time and space. The goal of this tutorial is to present and discuss the research problems, issues and challenges in learning from data streams. We will present state-of-the-art techniques in change detection, clustering, classification, frequent patterns, and time series analysis from data streams. We will also discuss current trends, open issues and future directions in learning from data streams.
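As a rough illustration of what processing a stream in bounded time and space can look like, the sketch below implements a Page-Hinkley style test for detecting an upward shift in the mean of a stream. It is only one example of a change detection technique and is not taken from the tutorial material; the class name `PageHinkley`, the helper `first_change`, and the parameter values `delta` and `threshold` are assumptions chosen for illustration.

```python
class PageHinkley:
    """One-pass detector for an increase in the mean of a stream.

    Keeps a constant amount of state regardless of stream length.
    Parameter values are illustrative assumptions, not recommendations.
    """

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of fluctuation
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0                  # number of observations seen so far
        self.mean = 0.0             # running mean of the observations
        self.cum = 0.0              # cumulative deviation from the mean
        self.min_cum = 0.0          # smallest cumulative deviation so far

    def update(self, x):
        """Consume one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_change(stream, detector=None):
    """Return the index of the first signalled change, or None."""
    detector = detector or PageHinkley()
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream: the mean jumps from 0 to 5 at position 200.
    stream = [0.0] * 200 + [5.0] * 200
    print(first_change(stream))
```

Each observation is seen once and then discarded, so memory use does not grow with the length of the stream. Larger values of `threshold` reduce false alarms at the cost of a longer detection delay.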

Specific goals and objectives

  • Introducing the area of data stream mining
  • Giving a detailed explanation of the major techniques in the area
  • Emphasizing the open research issues and challenges

Biography

Joao Gama is a researcher at LIAAD-INESC Porto LA, the Laboratory of Artificial Intelligence and Decision Support of the University of Porto.

His main research interest is Learning from Data Streams. He has published several articles on change detection, learning decision trees from data streams, hierarchical clustering from streams, etc. He has edited special issues on Data Streams in Intelligent Data Analysis, J. Universal Computer Science, and New Generation Computing.
He was co-chair of ECML 2005 (Porto, Portugal) and of a series of workshops on Knowledge Discovery in Data Streams: ECML 2004 (Pisa, Italy), ECML 2005 (Porto, Portugal), ICML 2006 (Pittsburgh, US), ECML 2006 (Berlin, Germany), and SAC 2007 (Korea), as well as the ACM Workshop on Knowledge Discovery from Sensor Data, to be held in conjunction with ACM SIGKDD 2007.
Together with M. Gaber, he edited the book Learning from Data Streams: Processing Techniques in Sensor Networks, published by Springer.

 

Saso Dzeroski
(Jozef Stefan Institute, Slovenia)

Title: Constraint-Based Data Mining and Inductive Queries

Tutorial Summary

In its most general formulation, the task of data mining is to find patterns in data. As such, it is vastly underspecified. To make the task more precise, we first have to specify the type of patterns considered (where the word pattern is taken in a broader sense to include frequent patterns, predictive models or other regularities in the data, e.g., clusters). We then have to specify what conditions the patterns have to satisfy in order to be considered solutions to the data mining task at hand. In constraint-based data mining, the conditions that a pattern has to satisfy are called constraints; they are stated explicitly and are under the direct control of the user/data miner.

Constraints play an important role in the area of inductive databases and inductive queries, which takes a database perspective on knowledge discovery: knowledge discovery processes become query sessions. Inductive queries can be used to mine patterns from data, as well as to apply patterns to data, so that KDD becomes an extended querying process. Inductive queries consist of constraints that the patterns of interest have to satisfy, and are hence closely related to constraint-based data mining.

The tutorial will introduce the research areas of inductive databases/queries and constraint-based data mining. It will give an overview of the different types of constraints commonly considered as well as selected constraint-based data mining algorithms for different data mining tasks. In particular, constraint-based mining of frequent patterns, predictive models and clustering will be considered. We will also discuss current and future research directions and challenges in these areas.
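To make the idea of explicit, user-controlled constraints more concrete, here is a minimal sketch of constraint-based frequent itemset mining. It is not one of the algorithms covered in the tutorial: it is a simple level-wise search in which the pattern type is an itemset, one constraint is a minimum support threshold, and a second constraint is an arbitrary user-supplied predicate. The function names `support` and `mine_itemsets`, the `max_size` limit, and the toy transactions are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def mine_itemsets(transactions, min_support, constraint=lambda s: True,
                  max_size=3):
    """Return {itemset: support} for itemsets meeting both constraints.

    min_support is anti-monotone (a superset is never more frequent than
    its subsets), so only frequent itemsets are extended to the next level.
    The user constraint is applied as a filter on the reported patterns.
    """
    items = sorted(set().union(*transactions))
    results = {}
    frontier = [frozenset([i]) for i in items]
    size = 1
    while frontier and size <= max_size:
        frequent = []
        for cand in frontier:
            s = support(cand, transactions)
            if s >= min_support:
                frequent.append(cand)
                if constraint(cand):
                    results[cand] = s
        # Build candidates one item larger from pairs of frequent itemsets.
        size += 1
        frontier = list({a | b for a, b in combinations(frequent, 2)
                         if len(a | b) == size})
    return results


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    transactions = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    # Query: itemsets that occur at least twice and contain "bread".
    found = mine_itemsets(transactions, min_support=2,
                          constraint=lambda s: "bread" in s)
    for itemset, count in found.items():
        print(sorted(itemset), count)
```

The call to `mine_itemsets` can be read as a very small inductive query: the caller states which conditions the patterns must satisfy, and the search returns all patterns that meet them.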

Specific goals and objectives

  • Introduce the areas of inductive queries and constraint-based data mining
  • Give an overview of the different types of constraints considered in different data mining tasks and illustrative constraint-based data mining algorithms
  • Discuss current and future research challenges and directions

Biography

Saso Dzeroski is a scientific councillor at the Jozef Stefan Institute, Department of Knowledge Technologies, and an associate professor at the Jozef Stefan International Postgraduate School, both in Ljubljana, Slovenia.

His research interests are in the areas of Data Mining, Machine Learning, and Knowledge Discovery in Databases, and their applications. More specifically, on the methodology side they focus on Computational Scientific Discovery / Equation Discovery and Constraint-Based Data Mining / Inductive Queries. On the application side, his focus is on applications in Environmental Sciences (ecological modelling) and Life Sciences (bioinformatics and systems biology).

Besides research and publications related to the topics of this tutorial, he is the coordinator of the EU-funded project IQ (Inductive Queries for Mining Patterns and Models). He has co-organized two editions of the international workshop Knowledge Discovery in Inductive Databases, held at ECML/PKDD (KDID-03 and KDID-06). With Jan Struyf, he has edited a book on this topic, based on the KDID-06 workshop, published by Springer.

 
