MS.10. Graph Data Warehouses

The data warehouse provides a subject-oriented and integrated view of data, which makes it a suitable backbone for common analysis techniques such as reporting, complex interactive querying, and data mining algorithms.

OLAP is a common analysis technique in data warehouses. It organizes data as cubes in a so-called multidimensional space and follows a fact/dimension dichotomy: business facts are quantified by a set of metrics and analyzed along the dimensions that describe them.
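As a minimal sketch of the fact/dimension dichotomy, the Python fragment below aggregates a toy fact table along different dimensions. The table, the dimension names (`city`, `year`), the measure (`amount`), and the helper `roll_up` are all hypothetical and chosen only for illustration; they do not come from any particular OLAP system.

```python
from collections import defaultdict

# Hypothetical fact table: each row is one business fact.
# Dimensions: city, year. Measure: amount.
facts = [
    {"city": "Brussels", "year": 2015, "amount": 100},
    {"city": "Brussels", "year": 2016, "amount": 150},
    {"city": "Barcelona", "year": 2015, "amount": 80},
    {"city": "Barcelona", "year": 2016, "amount": 120},
]


def roll_up(rows, dimensions, measure):
    """Sum `measure` over `rows`, grouped by the listed dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same facts viewed at three levels of aggregation.
by_city = roll_up(facts, ["city"], "amount")
by_year = roll_up(facts, ["year"], "amount")
grand_total = roll_up(facts, [], "amount")
```

Grouping by fewer dimensions yields a coarser view of the same facts, which is the roll-up operation in its simplest form.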

On the other hand, graphs have the benefit of revealing valuable insights from both the network structure and the data embedded within that structure. The great expressive power of graphs encourages their use in extremely diverse domains, especially for modeling structural relationships. Complex real-world problems, such as intelligent transportation and social and biological network analysis, can be abstracted and solved using graph structures and algorithms.

Many recent research projects have tackled the problem of multidimensional analysis of graph data, and many graph modeling techniques and database tools have been developed.

Real-world graphs are heterogeneous, i.e. they include different types of nodes and edges, and evolving, i.e. the values of their attributes vary across a discrete domain. However, the support for heterogeneous and dynamic graphs, as well as the design of a model for graph OLAP workloads, remains to be investigated.
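To make the notion of a heterogeneous graph and of an OLAP-style aggregation over it more concrete, the following Python sketch counts edges of one type between groups of nodes. The bibliographic example, the node and edge types, the attribute names, and the helper `aggregate_edges` are all assumptions made for illustration; the sketch does not model the evolving aspect and is not the model this topic aims to design.

```python
from collections import Counter

# Hypothetical heterogeneous graph: every node and edge carries a type.
nodes = {
    "a1": {"type": "Author", "country": "BE"},
    "a2": {"type": "Author", "country": "ES"},
    "a3": {"type": "Author", "country": "BE"},
    "p1": {"type": "Paper", "year": 2015},
    "p2": {"type": "Paper", "year": 2016},
}
edges = [
    ("a1", "p1", "writes"),
    ("a2", "p1", "writes"),
    ("a3", "p2", "writes"),
    ("a1", "p2", "writes"),
    ("p2", "p1", "cites"),
]


def aggregate_edges(edge_type, src_attr, dst_attr):
    """Count edges of `edge_type` between groups of nodes.

    Source nodes are grouped by `src_attr`, target nodes by `dst_attr`.
    """
    counts = Counter()
    for src, dst, etype in edges:
        if etype == edge_type:
            counts[(nodes[src][src_attr], nodes[dst][dst_attr])] += 1
    return dict(counts)


# Number of authorship edges per (author country, paper year).
writes_by_country_year = aggregate_edges("writes", "country", "year")
```

Here node attributes play the role of dimensions and the edge count plays the role of a measure; other choices, such as aggregating node attributes instead of edges, are equally possible.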

The aim of this topic is to design a framework for multidimensional analysis of graph data, which includes developing: (1) a conceptual model that supports the heterogeneity and captures the evolving aspect of the graph, (2) an MDX-like query language for querying graph cubes, and (3) a prototype system built on top of distributed graph processing and storage frameworks.

Main Advisor at Université Libre de Bruxelles (ULB)
Co-advisor at Universitat Politècnica de Catalunya (UPC)