Dr. rer. nat. André Bauer

I am a computer scientist and have been working as a postdoctoral scholar at Globus Labs, led by Prof. Ian Foster, in the Department of Computer Science at the University of Chicago since November 2022. I am also the founder and elected chair of the SPEC RG Predictive Data Analytics Working Group.

In a nutshell, my research aims to expand the potential of data science in scientific computing: it contributes to key aspects of developing intuitive, efficient, and sustainable data science solutions across disciplines and domains. My research is inherently interdisciplinary, and I apply a translational approach, developing, applying, and evaluating methods and techniques in various domains. In the long term, I expect my research to contribute to dynamic data science ecosystems that inherently accelerate scientific computing applications.

My primary research interests, which integrate experience and expertise from performance engineering and data science, include but are not limited to the following areas:

  • Data science: My focus is on data analytics, clustering, and imputation. I am also interested in benchmarking and developing data analytics methods, and in transferring them across disciplines and domains.
  • Data science clouds: I am interested in the development, autonomous management (e.g., automatic scaling of resources), and benchmarking of the building blocks of such clouds, as well as runtime prediction and scheduling of data analytics tasks.
  • Data management: The focus is on FAIR (findable, accessible, interoperable, and reusable) data management and the promotion of publicly available research data.
  • Data privacy: The idea here is to exchange data with third parties while preserving its privacy. I am interested in synthetic data generation and homomorphic encryption.
  • Sustainable data science: Specifically, this involves developing an energy efficiency benchmark for machine and deep learning.

Most Recent News

May 24, 2024 Our article, “Thinking in Categories: A Survey on Assessing the Quality for Time Series Synthesis”, has been accepted for publication in the ACM Journal of Data and Information Quality. As time series data are crucial yet often limited or confidential, this work focuses on time series synthesis, an important alternative. Despite the availability of numerous synthesis methods, evaluating their quality remains challenging. Our comprehensive survey defines what constitutes “good” synthesis and proposes a systematic evaluation procedure. This work aims to drive rigorous and reproducible research in the field of time series synthesis.
May 23, 2024 Our article, “Benchmarking of Secure Group Communication schemes with focus on IoT”, has been accepted for publication in the journal Discover Data. With the proliferation of IoT devices comes an increase in cybersecurity threats. Unlike standard 1-to-1 communication, IoT requires efficient n-to-n encryption, which is achieved through Secure Group Communication schemes. However, the abundance of available schemes makes selecting the right one daunting. Our paper addresses this challenge by presenting a benchmark specifically tailored for IoT, assisting developers in their scheme selection process. We define business problems, design a specification-based benchmark, and extend it to a hybrid benchmark suitable for real-world IoT environments.
May 7, 2024 Our article, “Evaluation is key: a survey on evaluation measures for synthetic time series,” accepted for publication in the Journal of Big Data, aims to clarify the evaluation of synthetic data generation for time series. Synthetic data generation models the distributions of real datasets to create new data, which is crucial in privacy-sensitive fields like healthcare. While image synthesis has been extensively studied, time series synthesis is equally vital for practical applications. Despite the availability of numerous models and measures, there is no consensus on defining or quantifying high-quality synthetic time series. Our comprehensive survey reviews various evaluation measures, provides clear definitions, organizes them into a taxonomy, and offers guidance on selecting appropriate measures.
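As a toy illustration of one family of such measures (invented here for exposition, not taken from the survey), one can compare the marginal value distributions of a real and a synthetic series; the function name `marginal_distance` and the histogram binning are my own assumptions:

```python
def histogram(series, bins, lo, hi):
    """Normalized histogram of a series over [lo, hi]."""
    counts = [0] * bins
    for x in series:
        # Clamp the top edge so x == hi falls into the last bin.
        i = min(int((x - lo) / (hi - lo) * bins), bins - 1)
        counts[i] += 1
    return [c / len(series) for c in counts]

def marginal_distance(real, synthetic, bins=10):
    """Total-variation distance between the marginal value distributions
    of a real and a synthetic series (0 = identical, 1 = disjoint)."""
    lo = min(min(real), min(synthetic))
    hi = max(max(real), max(synthetic))
    p = histogram(real, bins, lo, hi)
    q = histogram(synthetic, bins, lo, hi)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

real = [0.0, 0.5, 1.0, 0.5, 0.0]
print(marginal_distance(real, real))  # 0.0
```

Note that such marginal measures ignore temporal structure entirely, which is exactly why surveys like ours organize measures into a taxonomy rather than recommending a single one.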
Apr 16, 2024 I accepted an invitation to serve as a program committee member for the 20th IEEE International Conference on eScience (eScience).
Mar 5, 2024 Our research paper, “Unveiling Temporal Performance Deviation: Leveraging Clustering in Microservices Performance Analysis”, has been accepted for presentation at the 15th ACM/SPEC International Conference on Performance Engineering (ICPE). In the ever-expanding cloud computing landscape, performance is key. Our methodology tackles the challenge of identifying performance issues in microservices. By clustering containers based on their performance at different time intervals, we unveil temporal deviations. Applying the methodology to the Alibaba dataset, we uncovered stable and dynamic performance patterns, an approach for enhancing overall performance and reliability in modern application landscapes!

Selected Publications

  1. The Globus Compute Dataset: An Open Function-as-a-Service Dataset From the Edge to the Cloud
    André Bauer, Haochen Pan, Ryan Chard, Yadu Babuji, Josh Bryan, Devesh Tiwari, Ian Foster, and Kyle Chard
    Future Generation Computer Systems, Apr 2024
  2. An Empirical Study of Container Image Configurations and Their Impact on Start Times
    Martin Straesser, André Bauer, Robert Leppich, Nikolas Herbst, Kyle Chard, Ian Foster, and Samuel Kounev
    In Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), May 2023
  3. Methodological Principles for Reproducible Performance Evaluation in Cloud Computing
    Alessandro V. Papadopoulos, Laurens Versluis, André Bauer, Nikolas Herbst, Jóakim von Kistowski, Ahmed Ali-Eldin, Cristina Abad, J. Nelson Amaral, Petr Tuma, and Alexandru Iosup
    IEEE Transactions on Software Engineering (TSE), Aug 2021
  4. Libra: A Benchmark for Time Series Forecasting Methods
    André Bauer, Marwin Züfle, Simon Eismann, Johannes Grohmann, Nikolas Herbst, and Samuel Kounev
    In Proceedings of the 12th ACM/SPEC International Conference on Performance Engineering (ICPE), Apr 2021
  5. Time Series Forecasting for Self-Aware Systems
    André Bauer, Marwin Züfle, Nikolas Herbst, Albin Zehe, Andreas Hotho, and Samuel Kounev
    Proceedings of the IEEE, Jul 2020
  6. Telescope: An Automatic Feature Extraction and Transformation Approach for Time Series Forecasting on a Level-Playing Field
    André Bauer, Marwin Züfle, Nikolas Herbst, Samuel Kounev, and Valentin Curtef
    In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE), Apr 2020
A list of all my publications can be found here.