II.4. Programmable ETL

Nearly all current Extract‐Transform‐Load (ETL) tools are based on "visual programming", where boxes (representing steps or operations) are connected with lines (representing data flows). This gives a good overview and is, to some degree, self-documenting. Visual flow definitions can, however, be cumbersome to handle. For example, when data is loaded from many source systems that are very similar but not identical, many nearly identical flows must be created. This involves considerable manual work, and the ETL developer would benefit from easier ways to define the many similar flows. One way to achieve this is to program the ETL flows. In this topic, the vision is to create an ETL framework that enables the use of, e.g., inheritance and templates as known from traditional programming. The framework can be based on written code or on a combination of a GUI tool and code generation. It could allow both direct execution and translation into a flow definition for an existing ETL tool.
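
As an illustration of the idea only, the following minimal Python sketch shows how inheritance could let two nearly identical flows share one definition. All names (EtlFlow, StoreAFlow, StoreBFlow, the column names, and the in-memory "load") are hypothetical and chosen for this example; they do not describe an existing framework or the design the project would arrive at.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


class EtlFlow:
    """Hypothetical template for a family of near-identical ETL flows."""

    def extract(self) -> Iterable[Row]:
        # Each concrete flow supplies its own source.
        raise NotImplementedError

    def transform(self, row: Row) -> Row:
        # Shared default step: trim whitespace from every string value.
        return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}

    def load(self, rows: Iterable[Row]) -> List[Row]:
        # Stand-in for a warehouse write: collect the rows in memory.
        return list(rows)

    def run(self) -> List[Row]:
        # The flow skeleton is defined once and inherited by every source.
        return self.load(self.transform(r) for r in self.extract())


class StoreAFlow(EtlFlow):
    """Flow for a first (made-up) source system."""

    def extract(self) -> Iterable[Row]:
        return [{"customer": " Ann ", "amount": "10"}]

    def transform(self, row: Row) -> Row:
        row = super().transform(row)
        row["amount"] = int(row["amount"])  # cast the textual amount
        return row


class StoreBFlow(StoreAFlow):
    """Second source: same logic, but different source column names."""

    def extract(self) -> Iterable[Row]:
        return [{"cust_name": "Bob", "amt": "7"}]

    def transform(self, row: Row) -> Row:
        # Only the difference is stated: map the columns, then reuse the rest.
        renamed = {"customer": row["cust_name"], "amount": row["amt"]}
        return super().transform(renamed)


if __name__ == "__main__":
    for flow in (StoreAFlow(), StoreBFlow()):
        print(type(flow).__name__, flow.run())
```

In this sketch the second flow states only how its source differs from the first, whereas a visual tool would typically require a second, almost identical diagram.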

Main Advisor at Aalborg Universitet (AAU)
Co‐advisor at Poznan University of Technology (PUT)
Main Advisor at Aalborg Universitet (AAU)
Co‐advisor at Université Libre de Bruxelles (ULB)