LSP.21. Data Warehouses over distributed storage and distributed processing frameworks

With the emergence of Big Data, many distributed storage and distributed processing frameworks have appeared, such as Hadoop and Spark. Multiple NoSQL storage systems have been built on top of such frameworks, e.g., HBase. Finally, relational database layers have been developed on top of such NoSQL stores, e.g., Apache Phoenix or Impala. There is a demand for analyzing data stored in these systems using a data warehouse architecture and OLAP-like processing. The PhD will provide foundations for assessing which technology suits which data characteristics and application requirements.

The research will be based on a theoretical analysis of existing solutions and on building cost models for assessing their applicability. The theoretical findings will then be validated by a series of experiments.

Main Advisor at Université Libre de Bruxelles (ULB)
Co-advisor at Poznan University of Technology (PUT)