LSP.11. Querying Semantic Data with Cardinality Assurance

Recent advances in the Semantic Web and its growing popularity have produced a wealth of semantic datasets in RDF that are freely available on the Web. Using RDF as a data format makes it convenient to store and publish data without enforcing restrictions on the schema. Instead, data can be published first and a suitable schema derived afterwards, which allows data and schema to evolve gradually together.

Embracing the Semantic Web principles, many semantic datasets are accessible via Web interfaces that support structured query languages, which ultimately enables processing analytical queries over Web data. However, due to the distributed and agile nature of RDF data and its sheer volume, users often do not possess deep knowledge of the data, which impedes the formulation of queries. As a consequence, queries over these datasets often lead to unexpected results, especially in terms of their cardinality: result sets might be empty, or contain too few or too many results. Users therefore have to go through a cumbersome trial-and-error process to retrieve an acceptable result for their query.

The aim of this topic is to develop an approach that enables users to formulate queries, and analytical queries in particular, that lead to desirable results. To this end, suitable techniques for query modification, such as relaxation and generalization, are to be developed and employed. Because executing many candidate queries can easily become expensive, the system needs to take performance into account and use techniques for result size estimation. Because a modified query might deviate from the user’s intent, the system also needs to ensure that the executed query remains similar to the original query defined by the user.
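
To make the idea of relaxation under a cardinality constraint concrete, the following is a minimal sketch and not the approach to be developed in this topic. It assumes a tiny in-memory set of triples (invented for illustration), models a query as a list of triple patterns, relaxes it by dropping patterns, and uses Jaccard similarity over the pattern sets as a stand-in for closeness to the user's intent. All names (TRIPLES, evaluate, relax, similarity) are hypothetical. For simplicity it counts results exactly by evaluating each candidate, whereas a real system would rely on result size estimation to avoid that cost.

```python
from itertools import combinations

# Toy triples, invented purely for illustration.
TRIPLES = {
    ("alice", "type", "Person"),
    ("alice", "livesIn", "Aalborg"),
    ("alice", "worksAt", "AAU"),
    ("bob", "type", "Person"),
    ("bob", "livesIn", "Dresden"),
    ("carol", "type", "Person"),
    ("carol", "livesIn", "Aalborg"),
}


def is_var(term):
    """Terms starting with '?' are variables, everything else is a constant."""
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:
                return None  # variable already bound to a different value
        elif p != t:
            return None  # constant does not match
    return new


def evaluate(patterns, triples=TRIPLES):
    """Return all variable bindings that satisfy every pattern (conjunctive query)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in triples
            if (extended := match(pattern, t, b)) is not None
        ]
    return bindings


def similarity(original, relaxed):
    """Jaccard similarity between the pattern sets of two queries."""
    a, b = set(original), set(relaxed)
    return len(a & b) / len(a | b)


def relax(patterns, min_results, max_results, triples=TRIPLES):
    """Try the original query first, then candidates with more patterns dropped.

    Returns (query, cardinality, similarity) for the first candidate whose
    result count lies within [min_results, max_results], or None.
    """
    for keep in range(len(patterns), 0, -1):
        for candidate in combinations(patterns, keep):
            n = len(evaluate(candidate, triples))
            if min_results <= n <= max_results:
                return list(candidate), n, similarity(patterns, candidate)
    return None


# The original query is too strict: nobody in the toy data works at "TUD".
query = [
    ("?x", "type", "Person"),
    ("?x", "livesIn", "Aalborg"),
    ("?x", "worksAt", "TUD"),
]
relaxed_query, cardinality, sim = relax(query, min_results=2, max_results=5)
```

Because candidates are tried in order of decreasing size, the first acceptable candidate is also among the most similar to the original under this measure. Dropping patterns is only one form of relaxation; generalization (for example, replacing a class by a superclass) would need schema knowledge that this sketch does not model.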

Main Advisor at Aalborg Universitet (AAU)
Co‐advisor at Technische Universität Dresden (TUD)