I am a computer scientist and, since November 2022, a postdoctoral scholar at Globus Labs, led by Prof. Ian Foster, in the Department of Computer Science at the University of Chicago. I am also the founder and elected chair of the SPEC RG Predictive Data Analytics Working Group.
My research aims to expand the potential of data science in scientific computing by contributing to key aspects of developing intuitive, efficient, and sustainable data science solutions across disciplines and domains. It is inherently interdisciplinary: I take a translational approach, developing, applying, and evaluating methods and techniques in various domains. In the long term, I expect my research to contribute to dynamic data science ecosystems that accelerate scientific computing applications.
My primary research interests integrate experience and expertise from performance engineering and data science. They include, but are not limited to, the following areas:
- Data science: My focus is on data analytics, clustering, and imputation. I am also interested in benchmarking and developing data analytics methods.
- Data science clouds: I am interested in the development, autonomous management (i.e., autonomous scaling of resources), and benchmarking of the various building blocks of such clouds, as well as in runtime prediction and scheduling of data analytics tasks.
- Data management: The focus is on FAIR (findable, accessible, interoperable, and reusable) data management and the promotion of publicly available research data.
- Data privacy: The goal is to exchange data with third parties while preserving its privacy. I am interested in synthetic data generation and homomorphic encryption.
- Sustainable data science: Specifically, this involves the development of an energy efficiency benchmark for deep learning and machine learning.
|Nov 18, 2023||I accepted the invitation to serve as a program committee member at the 13th International Conference on Data Science, Technology and Applications (DATA).|
|Nov 7, 2023||I accepted the invitation to serve as an artifact evaluation program committee member at the 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID).|
|Oct 30, 2023||Our paper, “Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision”, has been accepted for publication at the 2023 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT). Focusing on the transformative impact of deep learning, especially with Trillion Parameter Models such as Huawei’s PanGu-Σ, our paper presents a visionary ecosystem tailored to meet the evolving needs of the scientific community. Delving into the technical challenges associated with serving Trillion Parameter Models for groundbreaking discoveries, the paper outlines essential requirements for a robust software stack and flexible interfaces.|
|Oct 27, 2023||Our article, “De Bello Homomorphico: Investigation of the extensibility of the OpenFHE library with basic mathematical functions by means of common approaches using the example of the CKKS cryptosystem”, has been accepted for publication in the International Journal of Information Security. In the context of the growing number of IoT devices and the associated increased threat of cyber attacks, our study addresses the challenge of selecting the right Secure Group Communication (SGC) scheme. The comprehensive evaluation of 34 schemes, considering computational and communication costs along with security features, is presented. Our use of decision trees simplifies the selection process for centralized, distributed, and decentralized SGC schemes. Arm yourself with these insights for robust IoT security.|
|Oct 27, 2023||Our paper, “Tournament-Based Pretraining to Accelerate Federated Learning”, has been accepted for presentation at the 4th ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S). In this work, we introduce three variants of a serverless federated learning framework that address challenges associated with leveraging edge data. We also introduce tournament-based pretraining, which significantly enhances model performance. With these federated learning advancements, we aim to enable researchers to move beyond hurdles and focus on advancing scientific applications.|
An Empirical Study of Container Image Configurations and Their Impact on Start Times. In Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), May 2023
Methodological Principles for Reproducible Performance Evaluation in Cloud Computing. IEEE Transactions on Software Engineering (TSE), Aug 2021
Libra: A Benchmark for Time Series Forecasting Methods. In Proceedings of the 12th ACM/SPEC International Conference on Performance Engineering (ICPE), Apr 2021
Time Series Forecasting for Self-Aware Systems. Proceedings of the IEEE, Jul 2020
Telescope: An Automatic Feature Extraction and Transformation Approach for Time Series Forecasting on a Level-Playing Field. In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE), Apr 2020