MS.6. Modeling Data Warehouses with Multiversion and Temporal Functionality

A data warehouse (DW) integrates data stored in multiple distributed, heterogeneous, and autonomous data sources, deployed across an organisation. An inherent feature of data sources is that they evolve in time independently of a DW that integrates them. The evolution of data sources can be characterised by content changes (i.e., insert/update/delete data) and schema changes (i.e., add/modify/drop a data structure). The latter are more difficult to handle since they have an impact on multiple layers of a DW architecture, including the ETL process, the data warehouse itself, and analytical applications. In practice, both content and schema changes take place frequently and have to be propagated to a DW.

Temporal data warehouses and multiversion data warehouses are two approaches that have been proposed to handle DW evolution. However, neither of them provides a fully functional solution to the problem. Temporal data warehouses build on the research done in temporal databases and temporal query languages, including specialised index structures. They are relatively easy to implement, but they have difficulty handling schema changes. Multiversion data warehouses, on the other hand, support the management of both data and schema changes. However, their implementation is more complex, they possess limited capabilities for querying DW versions, and only a few index structures for multiversion data have been developed. For these reasons, a natural approach to solving the DW evolution problem is to combine the temporal and multiversion approaches into a single solution.
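To make the idea of combining the two approaches more concrete, the following is a minimal sketch and not part of the proposed model. It assumes that both schema versions and dimension rows carry a half-open validity period, so that a snapshot for a given day selects the schema version and the row versions valid on that day. All names (Period, SchemaVersion, DimensionRow, snapshot) and the sample data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class Period:
    """Half-open validity interval [start, end); end=None means still valid."""
    start: date
    end: Optional[date] = None

    def contains(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day < self.end)


@dataclass
class SchemaVersion:
    """One version of a dimension schema (hypothetical structure)."""
    version_id: int
    attributes: Tuple[str, ...]
    period: Period


@dataclass
class DimensionRow:
    """One temporal version of a dimension member (hypothetical structure)."""
    key: str
    values: Dict[str, str]
    period: Period


def snapshot(versions: List[SchemaVersion], rows: List[DimensionRow], day: date):
    """Return the schema version id and the rows valid on `day`.

    Rows are projected onto the attributes of the selected schema version;
    attributes a row does not have are filled with None.
    Raises StopIteration if no schema version covers `day`.
    """
    version = next(v for v in versions if v.period.contains(day))
    projected = [
        {"key": r.key, **{a: r.values.get(a) for a in version.attributes}}
        for r in rows
        if r.period.contains(day)
    ]
    return version.version_id, projected


# Illustrative data: a schema change (new attribute "segment" in version 2)
# and a content change (the customer's city changes over time).
VERSIONS = [
    SchemaVersion(1, ("name", "city"), Period(date(2020, 1, 1), date(2022, 1, 1))),
    SchemaVersion(2, ("name", "city", "segment"), Period(date(2022, 1, 1))),
]
ROWS = [
    DimensionRow("c1", {"name": "Acme", "city": "Poznan"},
                 Period(date(2020, 1, 1), date(2021, 6, 1))),
    DimensionRow("c1", {"name": "Acme", "city": "Brussels"},
                 Period(date(2021, 6, 1))),
]

if __name__ == "__main__":
    print(snapshot(VERSIONS, ROWS, date(2021, 1, 1)))
    print(snapshot(VERSIONS, ROWS, date(2023, 1, 1)))
```

In this sketch, a query for a day in 2021 is answered under schema version 1, while a query for a day in 2023 is answered under version 2, with the newer attribute left empty for rows that predate it. How such gaps should actually be handled is exactly the kind of question the consistency constraints and query language of the project would need to address.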

The aim of this topic is to: (1) develop a conceptual model that captures multiple versions of a DW schema and the temporal evolution of multidimensional data, (2) define constraints that preserve the consistency of a DW schema and its data, (3) provide a transformation from the conceptual model into a logical and a physical (implementation) model, (4) develop a query language, (5) implement a prototype system with a graphical user interface, and (6) test the applicability of the prototype.

Main Advisor at Université Libre de Bruxelles (ULB)
Co-advisor at Poznan University of Technology (PUT)