Welcome!

Dr. rer. nat. André Bauer

I am a computer scientist and, since November 2022, have been working as a postdoctoral scholar at Globus Labs, led by Prof. Ian Foster, in the Department of Computer Science at the University of Chicago. I am also the founder and elected chair of the SPEC RG Predictive Data Analytics Working Group.

In a nutshell, my research aims to expand the potential of data science in scientific computing. In other words, my research contributes to key aspects of developing intuitive, efficient, and sustainable data science solutions across disciplines and domains. In addition, my research is inherently interdisciplinary, and I apply a translational approach as I work to develop, apply, and evaluate methods and techniques in various domains. In the long term, I expect that my research will contribute to the creation of dynamic data science ecosystems that will inherently accelerate scientific computing applications.

My primary research interests integrate experience and expertise from performance engineering and data science. They include, but are not limited to, the following areas:

  • Data science: My focus is on data analytics, clustering, and imputation. I am also interested in benchmarking and developing data analytics methods, and I apply a translational approach to transfer methods and techniques across domains.
  • Data science clouds: I am interested in the development, autonomous management (i.e., autonomous scaling of resources), and benchmarking of the various building blocks of such clouds, as well as in runtime prediction and scheduling of data analytics tasks.
  • Data management: The focus is on FAIR (findable, accessible, interoperable, and reusable) data management and the promotion of publicly available research data.
  • Data privacy: The idea here is to exchange data with third parties while preserving the privacy of the data. I am interested in synthetic data generation and homomorphic encryption.
  • Sustainable data science: Specifically, this involves the development of an energy efficiency benchmark for deep learning and machine learning.

Most Recent News

Jul 18, 2024 Our research paper, “An Empirical Investigation of Container Building Strategies and Warm Times to Reduce Cold Starts in Scientific Computing Serverless Functions”, has been accepted for presentation at the 20th IEEE International Conference on e-Science (eScience). Serverless computing abstracts infrastructure, letting developers focus on code. Yet, “cold start” latency, the cost to deploy environments, can hinder scientific computing with its sporadic demands. Our study tackles this by pre-installing Python packages in container images. Analyzing data from Globus Compute and Binder, we evaluate four container strategies. Pre-installed packages reduce cold start time but need more storage, while dynamic installs save space but add delays. Our simulator shows moderate warm times can cut cold starts without heavy overhead.
Jul 2, 2024 I accepted the invitation to serve as a program committee member for the 15th Symposium on Software Performance (SSP).
Jun 15, 2024 I accepted the invitation to serve as chair of the data challenge at the 16th ACM/SPEC International Conference on Performance Engineering (ICPE).
May 24, 2024 Our article, “Thinking in Categories: A Survey on Assessing the Quality for Time Series Synthesis”, has been accepted for publication in the International Journal of Data and Information Quality. Because time series data are crucial yet often limited or confidential, this work focuses on time series synthesis as an important alternative. Despite the availability of numerous synthesis methods, evaluating their quality remains challenging. Our comprehensive survey defines what constitutes “good” synthesis and proposes a systematic evaluation procedure. This work aims to drive rigorous and reproducible research in the field of time series synthesis.
May 23, 2024 Our article, “Benchmarking of Secure Group Communication schemes with focus on IoT”, has been accepted for publication in the International Journal of Discover Data. With the proliferation of IoT devices comes an increase in cybersecurity threats. Unlike standard 1-to-1 communication, IoT requires efficient n-to-n encryption, which is achieved through Secure Group Communication schemes. However, the abundance of available schemes makes selecting the right one daunting. Our paper addresses this challenge by presenting a benchmark specifically tailored for IoT, assisting developers in their scheme selection process. We define business problems, design a specification-based benchmark, and extend it to a hybrid benchmark suitable for real-world IoT environments.

Selected Publications

  1. The Globus Compute Dataset: An Open Function-as-a-Service Dataset From the Edge to the Cloud
    André Bauer, Haochen Pan, Ryan Chard, Yadu Babuji, Josh Bryan, Devesh Tiwari, Ian Foster, and Kyle Chard
    Future Generation Computer Systems, Apr 2024
  2. An Empirical Study of Container Image Configurations and Their Impact on Start Times
    Martin Straesser, André Bauer, Robert Leppich, Nikolas Herbst, Kyle Chard, Ian Foster, and Samuel Kounev
    In Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), May 2023
  3. Methodological Principles for Reproducible Performance Evaluation in Cloud Computing
    Alessandro V. Papadopoulos, Laurens Versluis, André Bauer, Nikolas Herbst, Jóakim Kistowski, Ahmed Ali-Eldin, Cristina Abad, J. Nelson Amaral, Petr Tuma, and Alexandru Iosup
    IEEE Transactions on Software Engineering (TSE), Aug 2021
  4. Libra: A Benchmark for Time Series Forecasting Methods
    André Bauer, Marwin Züfle, Simon Eismann, Johannes Grohmann, Nikolas Herbst, and Samuel Kounev
    In Proceedings of the 12th ACM/SPEC International Conference on Performance Engineering (ICPE), Apr 2021
  5. Time Series Forecasting for Self-Aware Systems
    André Bauer, Marwin Züfle, Nikolas Herbst, Albin Zehe, Andreas Hotho, and Samuel Kounev
    Proceedings of the IEEE, Jul 2020
  6. Telescope: An Automatic Feature Extraction and Transformation Approach for Time Series Forecasting on a Level-Playing Field
    André Bauer, Marwin Züfle, Nikolas Herbst, Samuel Kounev, and Valentin Curtef
    In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE), Apr 2020
The list of all publications can be found here.