PhD on Visual Analytics of Data Errors

Applications are invited for a PhD fellowship/scholarship at the Graduate School of Natural Sciences, Aarhus University, Denmark, within the Computer Science programme. The position is available from August 2022 or later.


Visual Analytics of Data Errors

Research area and project description:
A core problem in data-driven science is poor data quality. The standard solution is to fight it, and even the academic literature does not shy away from calling data quality problems “the enemy”. The weapons in this fight are automated error detection and cleaning, which aim to rid the data of its errors.

This project takes a completely different approach: instead of fighting erroneous data, it aims to systematically mine and analyze them, so as to capture the meta-information they hold and to find the underlying problems that caused them. In doing so, it follows the hypothesis that the occurrence of errors, their type, their frequency, their distribution across a dataset, and their co-occurrence with each other, with other data characteristics, or with specific data processing steps are highly valuable meta-information that, if analyzed in its own right, can yield insights that the well-formed, valid data alone cannot. Hence, this research project aims to embrace erroneous data as a source of important information that should not be discarded, but instead analyzed and reported as a first-class data property.
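As a purely illustrative sketch of what treating errors as data could look like (not a method from the project itself), the Python snippet below profiles missing values in a handful of made-up records. The field names, the toy values, and the helper functions `error_profile` and `errors_by_group` are all hypothetical assumptions; missing values stand in for the much broader class of errors the project targets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records; None marks a missing (erroneous) value.
records = [
    {"mz": 101.2, "intensity": None, "batch": "A"},
    {"mz": None, "intensity": None, "batch": "A"},
    {"mz": 99.8, "intensity": 5.1, "batch": "B"},
    {"mz": 100.4, "intensity": None, "batch": "A"},
]

def error_profile(rows):
    """Count how often each field is missing and which fields are missing together."""
    frequency = Counter()
    co_occurrence = Counter()
    for row in rows:
        # Sort so that each pair of fields always appears in the same order.
        missing = sorted(key for key, value in row.items() if value is None)
        frequency.update(missing)
        co_occurrence.update(combinations(missing, 2))
    return frequency, co_occurrence

def errors_by_group(rows, group_key):
    """Count rows with at least one missing value, per value of group_key."""
    counts = Counter()
    for row in rows:
        if any(value is None for key, value in row.items() if key != group_key):
            counts[row[group_key]] += 1
    return counts

frequency, co_occurrence = error_profile(records)
by_batch = errors_by_group(records, "batch")
print(frequency)       # per-field error frequency
print(co_occurrence)   # pairs of fields that are missing in the same record
print(by_batch)        # error counts per (hypothetical) processing batch
```

In this toy data, all incomplete records come from a single batch. That is the kind of pattern that might point to an underlying problem in a processing step rather than to random noise.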

The challenge in this research is to cope with the unspecific nature of errors, which do not follow a given schema or definition and in most cases cannot be found through database queries or standard statistics. To address this challenge, this project follows the principle of mixed-initiative analysis, which combines the computational power of modern IT (e.g., automatic learning of data models) with the knowledge of domain experts (e.g., performing plausibility checks on the learned models). This allows the human user to gauge the erroneousness of data and to parametrize their inclusion in the analysis of errors. This mixed-initiative approach is heavily facilitated by data visualization, which provides the interactive interface between the computer and the analyst: computational results are added to the visualization by the computer, while the analyst uses the visualization to trigger, steer, and configure computations.

While all developed methods, algorithms, and software will be generally applicable, the project will be bootstrapped with concrete use cases from the food sciences. These use cases will focus mainly on low-quality mass spectrometry data that contain many missing values and suffer from a low signal-to-noise ratio.

Project description (½-4 pages). This document should describe your ideas and research plans for this specific project. If you wish, you can indicate a URL where further information can be found. Please note that we reserve the right to remove scientific papers, large reports, theses and the like.

Qualifications and specific competences:
The applicant must hold an MSc or a BSc and have completed at least 1 year of MSc in Computer Science or Data Science. Prior experience in at least one of the following areas is an advantage: data visualization, visual analytics, computer graphics, human-computer interaction, machine learning, or database technologies. Excellent computer programming skills and collaboration skills are also required.

Place of employment and place of work:
The place of employment is Aarhus University, and the place of work is the Department of Computer Science, Åbogade 34, DK-8200 Aarhus N., Denmark.

Applicants seeking further information for this project are invited to contact:
Associate Professor Hans-Jörg Schulz, e-mail:

How to apply:
For information about application requirements and mandatory attachments, please see the Application guide. Please read the Application guide thoroughly before applying and note the GSNS language skills requirement.

When ready to apply, go to (Note, the online application system opens 1 March 2022)

  1. Choose May 2022 Call with deadline 1 May 2022 at 23:59 CEST.
  2. You will be directed to the call and must choose the programme “Computer Science”.
  3. When filling in information about the project, please choose: “Visual Analytics of Data Errors (ViADaE)” in the dropdown menu in the box named “Study”.

Please note:

  • The programme committee may request further information or invite the applicant to attend an interview.
  • The project will only be initiated if final funding (from the graduate school/the faculty) is secured.

Aarhus University’s ambition is to be an attractive and inspiring workplace for all and to foster a culture in which each individual has opportunities to thrive, achieve and develop. We view equality and diversity as assets, and we welcome all applicants. All interested candidates are encouraged to apply, regardless of their personal background.
