Paper submitted to ACM ICS 2026
PruneX: A Hierarchical Communication-Efficient System for Distributed CNN Training with Structured Pruning
Postdoctoral Researcher
Information Technology Department
Faculty of Science and Engineering
Åbo Akademi University, Vaasa, Finland
Recent activities, publications, and announcements.
Federated Learning With ℓ0 Constraint Via Probabilistic Gates For Sparsity
New courses at Åbo Akademi University with hands-on access to LUMI, Puhti, and Mahti supercomputers
Started as Postdoctoral Researcher in Information Technology Department
Postdoctoral researcher focused on machine learning systems (MLSys) for energy-efficient, large-scale AI. My work addresses communication, memory, and energy bottlenecks in distributed training and inference of foundation models.
I design and implement scalable ML systems on HPC platforms, bridging algorithmic sparsity with distributed execution models. My research vision: performance optimization is energy optimization—reducing communication, memory traffic, and redundant computation directly translates into sustainability gains for AI infrastructure.
Deep expertise in high-performance computing, distributed systems, and ML infrastructure for building scalable, energy-efficient AI systems.
Shiraz Univ. of Tech., Iran
UFSC, Brazil
Univ. of Bologna, Italy
Åbo Akademi, Finland
My research focuses on making AI systems faster, more efficient, and scalable—from training foundation models to deploying them in production.
Scalable training strategies for LLMs and vision transformers using parallelism techniques.
Reducing computational and memory costs for model deployment.
System-level optimizations for maximum hardware utilization.
Algorithms for decentralized and federated machine learning.
15 publications spanning machine learning systems, distributed optimization, and control theory. ORCID · Full list
Conference presentations, invited talks, and workshop lectures.
CSC Finland · Spring 2025
Hands-on parallel programming and performance optimization techniques for graduate students and researchers.
ACM ICS 2026 (Submitted)
Structured pruning co-designed with cluster topology for scalable multi-node GPU training.
IEEE CASE 2024
Sparse consensus optimization in multi-agent networked systems.
CoDIT 2023, Rome, Italy
Novel tracking-based approach for cardinality-constrained distributed optimization.
Åbo Akademi University · 2025+
Graduate course covering CUDA programming, memory optimization, and kernel development.
Responsible for designing, modernizing, and delivering courses at Åbo Akademi University, with integration of national CSC HPC infrastructures (Puhti, Mahti, LUMI).
Research software for distributed machine learning, sparse optimization, and HPC.
Hierarchical pruning-aware distributed training system. Co-designs structured pruning with cluster topology for up to 60% communication reduction.
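The co-design idea can be illustrated with a minimal NumPy sketch. Assumptions (illustrative only, not PruneX's actual API): channels are ranked by gradient L2 norm, and aggregation happens in two stages, intra-node then inter-node, so that only surviving channels cross the slower network link.

```python
import numpy as np

def channel_mask(grad, keep_ratio):
    """Keep the channels (rows) with the largest gradient L2 norm."""
    norms = np.linalg.norm(grad, axis=1)
    k = max(1, int(keep_ratio * grad.shape[0]))
    mask = np.zeros(grad.shape[0], dtype=bool)
    mask[np.argsort(norms)[-k:]] = True
    return mask

def hierarchical_allreduce(per_gpu_grads, gpus_per_node, mask):
    """Stage 1: sum gradients inside each node (fast intra-node link).
    Stage 2: sum the per-node partials; only the masked channel rows
    ever cross the slower inter-node network."""
    nodes = [per_gpu_grads[i:i + gpus_per_node]
             for i in range(0, len(per_gpu_grads), gpus_per_node)]
    node_partials = [sum(g[mask] for g in node) for node in nodes]
    return sum(node_partials)
```

Because only surviving channel rows are exchanged between nodes, the cross-node payload shrinks roughly in proportion to the pruning ratio, which is the kind of mechanism behind communication savings of this sort.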
GPU-accelerated Parallel Sparse Fitting Toolbox for distributed sparse machine learning with bilinear consensus ADMM.
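For orientation, here is a minimal consensus-ADMM sketch for a distributed lasso problem. This is the standard scaled-form consensus ADMM (soft-thresholding in the global z-update), not the toolbox's bilinear formulation, and all names are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def consensus_lasso_admm(As, bs, lam=0.1, rho=1.0, iters=100):
    """Minimize sum_i ||A_i x - b_i||^2 + lam*||x||_1 with one (A_i, b_i)
    block per worker; all workers agree on the consensus variable z."""
    n, N = As[0].shape[1], len(As)
    xs = [np.zeros(n) for _ in As]
    us = [np.zeros(n) for _ in As]   # scaled dual variables
    z = np.zeros(n)
    # pre-factor the local normal equations (A_i^T A_i + rho*I)
    facts = [np.linalg.inv(A.T @ A + rho * np.eye(n)) for A in As]
    for _ in range(iters):
        xs = [F @ (A.T @ b + rho * (z - u))
              for F, A, b, u in zip(facts, As, bs, us)]
        z = soft_threshold(np.mean(xs, axis=0) + np.mean(us, axis=0),
                           lam / (rho * N))
        us = [u + x - z for u, x in zip(us, xs)]
    return z
```

Only the n-dimensional local iterates and duals need to be averaged per round, which is what makes the consensus form attractive for distributed sparse fitting.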
Sparse Convex Optimization Toolkit—a mixed-integer framework for exact sparse convex optimization problems.
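Here "exact" means the ℓ0 constraint is enforced combinatorially rather than through a convex surrogate. A brute-force NumPy sketch makes the problem statement concrete; a mixed-integer solver explores the same support space with branch-and-bound instead of full enumeration (function names are illustrative):

```python
import numpy as np
from itertools import combinations

def exact_sparse_lstsq(A, b, k):
    """Minimize ||Ax - b||_2 subject to ||x||_0 <= k by enumerating
    all size-k supports and solving a least-squares problem on each."""
    n = A.shape[1]
    best_x, best_err = None, np.inf
    for support in combinations(range(n), k):
        cols = list(support)
        coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        err = np.linalg.norm(A[:, cols] @ coef - b)
        if err < best_err:
            x = np.zeros(n)
            x[cols] = coef
            best_x, best_err = x, err
    return best_x, best_err
```

Enumeration is exponential in n, which is why practical exact solvers rely on mixed-integer programming to prune the search tree.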
Citation metrics and academic footprint.
Last updated: March 2026 · View Google Scholar Profile
Open to collaboration on energy-efficient ML systems, distributed training, and HPC research.
Location
Vaasa, Finland
Affiliation
Åbo Akademi University
Information Technology Department