Multi-Source Event Detection (MUSED)
MUSED is a research project fully funded by the Department of Computer and Information Science (IDI) at NTNU through the department's strategic research program.
The MUSED project addresses challenges in event detection and prediction over multi-source data streams in the context of Big Data, which is characterized by the unprecedented volume of data; the velocity of data generation; the variety of data structures; and the veracity of data, i.e., the variation in data quality. Despite current advances in technology and methods for handling Big Data, these characteristics still raise issues for both data processing and analytics methods. At the same time, the societal impact of Big Data keeps it an important research area.
Large-scale analytics, or Big Data analytics, is the very core of Big Data, and is performed to gain a deeper understanding of the vast amount of data available. Approaches to Big Data analytics can broadly be divided into those based on data mining/machine learning techniques, e.g., prediction, and those based on query processing, e.g., aggregation and ranking. A subarea of the former is event detection and prediction, which involves detecting and predicting (possibly new) events from large-scale multi-source data streams. Here, an event refers to a real-life occurrence, such as a "virus outbreak", "traffic jam", "live concert" or "riot", and is defined as something that happens at a specific time and a specific location.
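To make the definition of an event concrete, the following is a minimal sketch of how an event observation could be represented, assuming a label plus a timestamp and a coordinate pair. The `Event` class, its field names, the `same_event` helper and its thresholds are hypothetical and chosen for illustration only; they are not part of the MUSED prototype.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: something that happens at a time and a place."""
    label: str       # e.g. "traffic jam"
    time: datetime   # when the event was observed
    lat: float       # latitude in degrees
    lon: float       # longitude in degrees
    source: str      # which data stream reported it


def same_event(a: Event, b: Event,
               max_minutes: float = 60.0,
               max_degrees: float = 0.05) -> bool:
    """Crude check (illustrative thresholds): same label, close in time and space."""
    close_in_time = abs((a.time - b.time).total_seconds()) <= max_minutes * 60
    close_in_space = (abs(a.lat - b.lat) <= max_degrees
                      and abs(a.lon - b.lon) <= max_degrees)
    return a.label == b.label and close_in_time and close_in_space
```

Two reports of the same label from different sources would then be treated as one event only if they fall within the chosen time and distance thresholds. Handling differently worded labels (semantic mismatch) is deliberately left out of this sketch.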
The main challenge with Big Data is that much of the available data comes from heterogeneous sources in the form of data and information streams. These streams often have complex relationships among them and may implicitly or explicitly contain temporal and spatial dimensions.
In the project, we explore novel techniques that effectively exploit the inherent characteristics of Big Data, including techniques for:
- Handling heterogeneity of data sources
- Dealing with semantic mismatch of events
- Coping with non-explicitly mentioned events
- Spatio-temporal text analytics
Objectives and Outcomes
MUSED aims to develop the framework and techniques necessary for effective and efficient detection and prediction of events from multiple, possibly heterogeneous, streaming data sources. We expect to contribute to the following research topics:
- Efficient and effective information analysis techniques that can detect and predict events in real time when the data comes from multiple heterogeneous streams.
- Efficient algorithms and structures for indexing and storage to support highly scalable multi-source event detection and prediction.
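As a rough illustration of the two topics above, the sketch below merges several time-ordered streams and flags a candidate event when at least two distinct sources mention the same keyword within the same time window and grid cell. It is a toy baseline under stated assumptions, not the MUSED method: the record layout `(timestamp, source, lat, lon, keyword)`, the function names, and all thresholds are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def merge_streams(*streams):
    """Merge streams that are each already sorted by timestamp (first field)."""
    return heapq.merge(*streams, key=lambda rec: rec[0])


def detect_events(records, window_s=600, cell_deg=0.1, min_sources=2):
    """Return buckets (window, lat cell, lon cell, keyword) seen by >= min_sources."""
    sources_seen = defaultdict(set)  # bucket -> sources that mentioned it
    reported = set()                 # buckets already flagged
    detected = []
    for ts, source, lat, lon, keyword in records:
        # Spatio-temporal bucket: fixed time window plus a coarse lat/lon grid.
        bucket = (math.floor(ts / window_s),
                  math.floor(lat / cell_deg),
                  math.floor(lon / cell_deg),
                  keyword)
        sources_seen[bucket].add(source)
        if len(sources_seen[bucket]) >= min_sources and bucket not in reported:
            reported.add(bucket)
            detected.append(bucket)
    return detected


# Example input: two hypothetical sources, timestamps in seconds.
tweets = [(100, "tweets", 63.43, 10.39, "traffic jam"),
          (900, "tweets", 63.43, 10.39, "concert")]
sensors = [(250, "sensors", 63.44, 10.38, "traffic jam")]
candidates = detect_events(merge_streams(tweets, sensors))
```

Here only the "traffic jam" bucket is corroborated by both sources. The dictionary of buckets grows without bound in this sketch; a scalable version would need to expire old windows and use a proper spatio-temporal index, which is the kind of indexing and storage problem the second topic refers to.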
The contributions will be disseminated in international refereed conferences and journals, and the MUSED prototype will be made available as open source to the research community. In addition, one doctoral thesis and at least four master's theses related to MUSED will be completed.