LSP.10. Bitmap Indexing for Big Data

A range of novel platforms for Big Data have emerged in recent years, e.g., Cassandra, Hadoop, Hive, Lucene, Presto, Shark and Spark. Such platforms support complex data beyond relational data, such as text, graphs, social network updates and geospatial data. The platforms generally achieve their scalability through massive use of parallelism and main memory, rather than through the advanced index structures found, e.g., in classical DBMSes. Bitmap indices provide very fast performance for queries that search on varying combinations of attributes and values in high-dimensional data. However, bitmap indices are not well supported by Big Data platforms.
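
To illustrate why bitmap indices suit queries over varying combinations of attributes and values, the following is a minimal sketch in Python, not the framework this project will develop. It assumes a small in-memory table and uses uncompressed Python integers as bitmaps; the names build_bitmap_index and matching_rows and the sample rows are hypothetical and chosen only for illustration. Each (attribute, value) pair gets one bitmap in which bit i is set if row i holds that value, so a conjunctive query reduces to a bitwise AND.

```python
from collections import defaultdict


def build_bitmap_index(rows, attributes):
    """Return {attribute: {value: bitmap}}; bit i of a bitmap is set if row i has that value."""
    index = {attr: defaultdict(int) for attr in attributes}
    for i, row in enumerate(rows):
        for attr in attributes:
            index[attr][row[attr]] |= 1 << i
    return index


def matching_rows(bitmap):
    """Decode a bitmap into the sorted list of row positions whose bit is set."""
    positions = []
    i = 0
    while bitmap:
        if bitmap & 1:
            positions.append(i)
        bitmap >>= 1
        i += 1
    return positions


# Hypothetical sample data, for illustration only.
rows = [
    {"country": "DK", "device": "mobile"},
    {"country": "ES", "device": "desktop"},
    {"country": "DK", "device": "desktop"},
    {"country": "DK", "device": "mobile"},
]
index = build_bitmap_index(rows, ["country", "device"])

# Query: country = DK AND device = mobile, answered with one bitwise AND.
result = index["country"]["DK"] & index["device"]["mobile"]
print(matching_rows(result))  # [0, 3]
```

Any other combination of predicates is answered the same way by combining the per-value bitmaps with AND, OR and NOT, without building a separate index per attribute combination. Practical bitmap indices typically also compress the bitmaps, which this sketch omits.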

This project will develop a bitmap indexing framework for Big Data and implement it in one or more selected Big Data platforms. This entails several research challenges. First, the framework should support a wide variety of different types of Big Data. Second, the framework should be highly scalable. Third, the framework should utilize novel hardware architectures, e.g., novel CPU instructions or GPUs, while still being deployable on cloud computing platforms where the hardware is virtualized.

Main Advisor at Aalborg Universitet (AAU)
Co-advisor at Universitat Politècnica de Catalunya (UPC)