Self-Tuning BI Systems

To realize BI for the masses, exploratory BI, self-service BI, and similar concepts, all users must be able to carry out the analytical and maintenance tasks they need without technical assistance. The user-centric nature of these systems aims to let non-technical users analyze data on demand. Next-generation BI systems should therefore provide flexible means for such users to create the reports and data analyses they want. This implies that the system should be self-configurable and react to day-to-day usage. For this reason, the system must be monitored continuously in order to overcome potential bottlenecks of any kind (such as performance, information, design, or quality bottlenecks).

In this thesis we propose to monitor the BI system, gather metadata relevant for assessing it, and develop self-tuning features based on past evidence. To fulfill this objective, several tasks must be carried out. First, the main storage alternatives must be characterized (including NoSQL trends). However, this classification should not only be model-based (e.g., relational, key-value, document stores, graph databases), as is usually done, but should also consider other decisions such as the system architecture (e.g., hash-based, clustered, in-memory, disk-based), design (e.g., fragmentation and replication capabilities, indexing), optimizations implemented by the query execution engine, etc.

Once a clear characterization is available and it is understood which factors are relevant for choosing between storage options given a certain workload (i.e., past evidence gathered in the system), the desired output would be a deterministic algorithm (probably cost-based) that enables self-tuning BI systems and, in turn, more user-friendly BI tools that bridge the gap between business needs and IT limitations.
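As an illustration only, a deterministic cost-based choice of this kind could be sketched as follows. Everything in the sketch is an assumption made for the example and not a result of the thesis: the class StorageOption, the operation kinds (point_lookup, scan_aggregate, write), the linear cost model, and all unit costs are hypothetical placeholders for what the characterization and the monitored workload would actually provide.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    """One characterized storage alternative (hypothetical structure)."""
    name: str
    model: str          # e.g. relational, key-value
    architecture: str   # e.g. disk-based, in-memory
    cost_per_op: dict   # operation kind -> assumed unit cost


def workload_cost(option, workload):
    """Estimated cost: sum over operation kinds of unit cost times frequency."""
    return sum(option.cost_per_op[op] * freq for op, freq in workload.items())


def choose_storage(options, workload):
    """Return the cheapest option; ties are broken by name so the
    result is deterministic for a given workload."""
    return min(options, key=lambda o: (workload_cost(o, workload), o.name))


# Hypothetical unit costs, invented purely for illustration.
OPTIONS = [
    StorageOption("row_store", "relational", "disk-based",
                  {"point_lookup": 1.0, "scan_aggregate": 8.0, "write": 1.0}),
    StorageOption("column_store", "relational", "disk-based",
                  {"point_lookup": 3.0, "scan_aggregate": 1.0, "write": 4.0}),
    StorageOption("key_value", "key-value", "in-memory",
                  {"point_lookup": 0.5, "scan_aggregate": 20.0, "write": 0.5}),
]

# Workloads as operation frequencies, standing in for the past
# evidence gathered by monitoring the system.
analytical = {"point_lookup": 10, "scan_aggregate": 100, "write": 5}
lookup_heavy = {"point_lookup": 1000, "scan_aggregate": 1, "write": 200}

if __name__ == "__main__":
    print(choose_storage(OPTIONS, analytical).name)
    print(choose_storage(OPTIONS, lookup_heavy).name)
```

With these invented numbers the scan-dominated workload selects the column store and the lookup-dominated workload selects the key-value store. A realistic cost model would have to account for the architectural and design factors listed above (fragmentation, replication, indexing, engine optimizations) rather than a single unit cost per operation.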

Main Advisor at Universitat Politècnica de Catalunya (UPC)
Co-advisor at Technische Universität Dresden (TUD)