BDA.19. New Data Models and Storage for Efficient Sequential Data Processing

Large volumes of data arrive at analytical systems in order. Typical applications that generate ordered data include workflow management systems, website monitors for click-stream analysis, network packet monitoring, RFID infrastructures, public transportation infrastructures, intelligent buildings, and various smart meters. This order implies the existence of patterns that, in turn, provide additional information of business value. Business users want to analyze sequential data in an OLAP-like style, organized as data cubes that store facts, measures, dimensions, and hierarchies. To this end, dedicated data models, storage, indexing, and query languages are needed.
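
To make the idea of OLAP-like analysis over ordered data more concrete, the following minimal Python sketch groups click-stream events into per-session sequences and counts, for each value of one dimension, the sequences that contain a given pattern. Everything in it is an assumption made for illustration: the event layout, the `region` dimension, and the helper names `build_sequences`, `contains_pattern`, and `count_by_dimension` are hypothetical and do not describe the model this project will develop.

```python
from collections import defaultdict

# Hypothetical click-stream events: (session_id, timestamp, page, region).
# The events are deliberately not sorted, to show that order is restored.
EVENTS = [
    ("s1", 3, "checkout", "EU"),
    ("s1", 1, "home", "EU"),
    ("s1", 2, "cart", "EU"),
    ("s2", 1, "home", "EU"),
    ("s2", 2, "checkout", "EU"),
    ("s3", 1, "cart", "US"),
    ("s3", 2, "checkout", "US"),
]


def build_sequences(events):
    """Group events by session and order each group by timestamp.

    Returns {session_id: (region, [page, ...])}. Assumes (for this sketch)
    that the region is constant within a session.
    """
    grouped = defaultdict(list)
    for session_id, ts, page, region in events:
        grouped[session_id].append((ts, page, region))
    sequences = {}
    for session_id, items in grouped.items():
        items.sort()  # orders by timestamp first
        region = items[0][2]
        sequences[session_id] = (region, [page for _, page, _ in items])
    return sequences


def contains_pattern(sequence, pattern):
    """Return True if `pattern` occurs as a contiguous run in `sequence`."""
    pattern = list(pattern)
    n = len(pattern)
    return any(sequence[i:i + n] == pattern
               for i in range(len(sequence) - n + 1))


def count_by_dimension(events, pattern):
    """Measure: number of sequences containing `pattern`, per region."""
    counts = defaultdict(int)
    for region, pages in build_sequences(events).values():
        if contains_pattern(pages, pattern):
            counts[region] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_by_dimension(EVENTS, ["cart", "checkout"]))
```

In this sketch the sequence (one session) plays the role of the fact, the pattern-match count is the measure, and `region` is a single flat dimension. A full scan like this one is what dedicated storage models and indexes would aim to avoid.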

Commercial systems, including Oracle, Teradata Aster, Hive, and Spark, offer query languages for analyzing sequences in an OLTP-like style, focusing on pattern search. They do not, however, support OLAP-like analysis of such data. Moreover, of the 22 research approaches we are aware of, none fully solves the problems related to analyzing sequential data. The existing proposals use one of four storage models: relational, array, NoSQL, or graph. To the best of our knowledge, few indexing techniques for sequential data have been proposed.

Some research has already addressed the optimization of pattern search, but, to the best of our knowledge, a comprehensive approach to optimizing queries on sequential data is still missing. This applies in particular to queries in sequential data warehouses.

For this reason, this project aims to: (1) develop a storage model suitable for efficient processing of sequential data, (2) develop index data structures that support pattern search and OLAP-like processing, (3) develop novel and efficient query optimization algorithms based on the proposed model and indexes, and (4) evaluate the performance of the proposed storage model, indexes, and algorithms both analytically and experimentally.

Main Advisor at Poznan University of Technology (PUT)
Co-advisor at Universitat Politècnica de Catalunya (UPC)