Welcome!
I am an Assistant Professor in the Department of Computer Science at the Illinois Institute of Technology and the founder and elected chair of the SPEC RG Predictive Data Analytics Working Group.
The overarching goal of my research is to expand the potential of data science in scientific computing by designing robust, efficient, and sustainable system solutions tailored to the evolving needs of data-driven science. As scientific progress increasingly depends on the effective use of data science ecosystems, the diversity of hardware architectures, application demands, and usage patterns poses significant challenges. My work addresses these complexities through a focus on systems and performance engineering, leveraging interdisciplinary expertise to optimize and adapt scientific computing infrastructures for emerging data science applications.
In particular, I see the following key challenges that need to be addressed:
- Resource Optimization: Efficiently allocating and managing resources across diverse hardware platforms to accommodate changing, data-intensive workloads.
- Adaptive Systems: Developing systems that can dynamically adapt to evolving data and model requirements.
- Data Security and Privacy: Safeguarding sensitive data while enabling collaborative data science.
- System-Level Optimization: Optimizing the complex interplay of components within data science ecosystems for maximum performance.
- Sustainable Computing: Minimizing the environmental impact of data science practices.
Most Recent News
Apr 30, 2025 | I am honored to receive the Computer Science Teacher of the Year award! Grateful for the amazing students, colleagues, and community. |
Apr 21, 2025 | Our research paper, “TACO: A Lightweight Tree-based Approximate Compression Method for Time Series”, has been accepted for presentation at the 14th International Conference on Data Science, Technology and Applications (DATA 2025). As the volume of time series data grows, storage and transmission become major challenges. We present TACO, a fast, training-free, tree-based compression method. In comparison to the state of the art, it makes no strong assumptions and offers selective decompression. |
Apr 7, 2025 | I accepted the invitation to serve as a program committee member for the 16th Symposium on Software Performance (SSP). |
Mar 28, 2025 | I accepted the invitation to serve as a program committee member for the 6th International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS). |
Feb 25, 2025 | Our research paper, “Microservice Applications and Their Workloads on GitHub”, has been accepted for presentation at the 8th Workshop on Hot Topics in Cloud Computing Performance (HotCloudPerf). Many cloud applications use microservices for scalability and flexibility, but their distributed nature poses challenges. Performance engineers rely on representative applications for evaluation, yet existing benchmarks are limited and their industry relevance is debated. To explore alternatives, we mine GitHub for microservice applications and workloads, creating two datasets: 553 applications and 8 workload repositories. Our analysis offers a foundation for future research in microservice benchmarking. |
Selected Publications
- The Globus Compute Dataset: An Open Function-as-a-Service Dataset From the Edge to the Cloud. Future Generation Computer Systems, Apr 2024
- An Empirical Study of Container Image Configurations and Their Impact on Start Times. In Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), May 2023
- Methodological Principles for Reproducible Performance Evaluation in Cloud Computing. IEEE Transactions on Software Engineering (TSE), Aug 2021
- Libra: A Benchmark for Time Series Forecasting Methods. In Proceedings of the 12th ACM/SPEC International Conference on Performance Engineering (ICPE), Apr 2021
- Time Series Forecasting for Self-Aware Systems. Proceedings of the IEEE, Jul 2020
- Telescope: An Automatic Feature Extraction and Transformation Approach for Time Series Forecasting on a Level-Playing Field. In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE), Apr 2020