inproceedings MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns
IEEE International Conference on Cluster Computing (CLUSTER), 2024
article Analysis and prediction of performance variability in large-scale computing systems
The Journal of Supercomputing, 2024, ISSN 1573-0484
article Enabling performance portability on the LiGen drug discovery pipeline
Future Generation Computer Systems (FGCS), 2024, 158, 44–59
article Out of kernel tuning and optimizations for portable large-scale docking experiments on GPUs
The Journal of Supercomputing, 2024, 80, 11798–11815
inproceedings Algorithm Selection of MPI Collectives Considering System Utilization
Euro-Par 2023: Parallel Processing Workshops, 2024, 302–307
inproceedings SYCL-Bench 2020: Benchmarking SYCL 2020 on AMD, Intel, and NVIDIA GPUs
International Workshop on OpenCL and SYCL (IWOCL), 2024, 1:1–1:12
inproceedings Unlocking performance portability on LUMI-G supercomputer: A virtual screening case study
International Workshop on OpenCL and SYCL (IWOCL), 2024, 9:1–9:4
inproceedings Domain-Specific Energy Modeling for Drug Discovery and Magnetohydrodynamics Applications
International Workshop on the Environmental Sustainability of High-Performance Software (SHiPS), 2023, 1789–1800
inproceedings SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023, 69:1–69:13
inproceedings An Asynchronous Dataflow-Driven Execution Model For Distributed Accelerator Computing
23rd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2023, 82–93
inproceedings EMPI: Enhanced Message Passing Interface in Modern C++
23rd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2023, 141–153
inproceedings Towards a SYCL API for Approximate Computing
Proceedings of the 2023 International Workshop on OpenCL (IWOCL and SYCLcon), 2023, 17:1–17:2
inproceedings An Analysis of Long-Tailed Network Latency Distribution and Background Traffic on Dragonfly+
Benchmarking, Measuring, and Optimizing - Third BenchCouncil International Symposium (Bench), 2023, 13852, 123–142
Best paper award
inproceedings Celerity: How (Well) Does the SYCL API Translate to Distributed Clusters?
International Workshop on OpenCL (IWOCL), 2022, 7:1–7:2
inproceedings Towards a Portable Drug Discovery Pipeline with SYCL 2020
International Workshop on OpenCL (IWOCL), 2022, 5:1–5:2
inproceedings FLEXDP: flexible frequency scaling for energy-delay product optimization of GPU applications
19th ACM International Conference on Computing Frontiers (CF), 2022, 177–180
inproceedings An Analysis of Performance Variability on Dragonfly+ Topology
IEEE International Conference on Cluster Computing (CLUSTER), 2022, 500–501
inproceedings ALONA: Automatic Loop Nest Approximation with Reconstruction and Space Pruning
International European Conference on Parallel and Distributed Computing (Euro-Par), 2021, 12820, 3–18
inproceedings The Italian research on HPC key technologies across EuroHPC
Computing Frontiers Conference (CF), 2021, 178–184
article Easy and efficient agent-based simulations with the OpenABL language and compiler
Future Generation Computer Systems (FGCS), 2021, 116, 61–75
Impact Factor: 6.125
article Weight Pruning for Deep Neural Networks on GPUs
PARS-Mitteilungen, 2020, 35, 51–62
inproceedings SYCL-Bench: A Versatile Cross-Platform Benchmark Suite for Heterogeneous Computing
International European Conference on Parallel and Distributed Computing (Euro-Par), 2020, 629–644
Acceptance rate: 24.7% (39/158)
inproceedings SYCL-Bench: A Versatile Single-Source Benchmark Suite for Heterogeneous Computing
International Workshop on OpenCL (IWOCL), 2020, 10:1
article Vectorization Cost Modeling for NEON, AVX and SVE
Performance Evaluation, 2020, 102106
article Accurate Energy and Performance Prediction for Frequency-Scaled GPU Kernels
Computation, 2020, 8(2), 37
inproceedings Predictable GPUs Frequency Scaling for Energy and Performance
48th International Conference on Parallel Processing (ICPP), 2019, 52:1–52:10
Acceptance rate: 26.2% (106/405)
inproceedings Portable Cost Modeling for Auto-Vectorizers
IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2019, 359–369
Acceptance rate: 23.8% (29/122)
inproceedings Celerity: High-level C++ for Accelerator Clusters
International European Conference on Parallel and Distributed Computing (Euro-Par), 2019, 11725, 291–303
Acceptance rate: 26% (38/144)
inproceedings A Performance Analysis of Vector Length Agnostic Code
International Conference on High Performance Computing & Simulation (HPCS), Workshop APPMM, 2019, 159–164
inproceedings Approximating Memory-bound Applications on Mobile GPUs
International Conference on High Performance Computing & Simulation (HPCS), 2019, 329–335
Runner-up of Outstanding Paper Award
inproceedings Local Memory-Aware Kernel Perforation
International Symposium on Code Generation and Optimization (CGO), 2018, 278–287
Acceptance rate: 28.6%
inproceedings Control Flow Vectorization for ARM NEON
Proceedings of the 21st International Workshop on Software and Compilers for Embedded Systems (SCOPES), 2018, 66–75
Acceptance rate: 39%
inproceedings Accelerating the RICH Particle Detector Algorithm on Intel Xeon Phi
International Euromicro Conference on Parallel, Distributed and Network-based Processing (PDP), 2018, 368–375
Acceptance rate: 34%
inproceedings OpenABL: A Domain-Specific Language for Parallel and Distributed Agent-Based Simulations
International European Conference on Parallel and Distributed Computing (Euro-Par), 2018, 505–518
Acceptance rate: 29%
inproceedings Autotuning Stencil Computations with Structural Ordinal Regression Learning
IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2017, 287–296
Acceptance rate: 22% (116/516)
inproceedings Static Optimization in PHP 7
Proceedings of the 26th International Conference on Compiler Construction (CC), 2017, 65–75
Acceptance rate: 24.5% (13/53)
inproceedings Stencil Autotuning with Ordinal Regression: Extended Abstract
Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES), 2017, 72–75
Research presentation in proceedings
inproceedings An Evaluation of Current SIMD Programming Models for C++
Proceedings of the 3rd Workshop on Programming Models for SIMD/Vector Processing (WPMVP), 2016, 3:1–3:8
inproceedings Behavioral Spherical Harmonics for Long-Range Agents' Interaction
Workshop on Parallel and Distributed Agent-Based Simulations (PADABS), 2015, 392–404
inproceedings Automatic Data Layout Optimizations for GPUs
International European Conference on Parallel and Distributed Computing (Euro-Par), 2015, 263–274
Acceptance rate: 27%
inproceedings Point Distribution Tensor Computation on Heterogeneous Systems
Proceedings of the International Conference on Computational Science (ICCS), 2015, 160–169
Acceptance rate: 33%
article Spectral turning bands for efficient Gaussian random fields generation on GPUs and accelerators
Concurrency and Computation: Practice and Experience, 2015, 27, 4122–4136
article A Uniform Approach for Programming Distributed Heterogeneous Computing Systems
Journal of Parallel and Distributed Computing (JPDC), 2014, 74, 3228–3239
Impact Factor: 1.320
inproceedings Kd-Tree Based N-Body Simulations with Volume-Mass Heuristic on the GPU
Proceedings of the 2014 IEEE International Parallel and Distributed Processing Symposium Workshops, 2014, 1256–1265
inproceedings Random Fields Generation on the GPU with the Spectral Turning Bands Method
International European Conference on Parallel and Distributed Computing (Euro-Par), 2014, 656–667
Acceptance rate: 25.5%, best paper selection
article Ethylene glycol revisited: Molecular dynamics simulations and visualization of the liquid and its hydrogen-bond network
Journal of Molecular Liquids, 2014, 189, 20–29
Fluid phase associations
inproceedings Automatic problem size sensitive task partitioning on heterogeneous parallel systems
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), abstract in proceedings, 2013, 281–282
inproceedings libWater: Heterogeneous Distributed Computing Made Easy
Proceedings of the 27th ACM International Conference on Supercomputing (ICS), 2013, 161–172
Acceptance rate: 21%
inproceedings An Automatic Input-Sensitive Approach for Heterogeneous Task Partitioning
Proceedings of the 27th ACM International Conference on Supercomputing (ICS), 2013, 149–160
Acceptance rate: 21%
inproceedings GPU Cost Estimation for Load Balancing in Parallel Ray Tracing
International Conference on Computer Graphics Theory and Applications (GRAPP), 2013, 139–151
Acceptance rate: 25% (21/83), project web page
inproceedings Visual Data Mining Using the Point Distribution Tensor
IARIA Workshop on Computer Vision and Computer Graphics (VisGra), 2012
inproceedings Distributed Load Balancing for Parallel Agent-Based Simulations
International Euromicro Conference on Parallel, Distributed and Network-based Processing (PDP), 2011, 62–69
Acceptance rate: 39% (29/74)
incollection Visualization Methods for Numerical Astrophysics
Chapter in Astrophysics. InTech - Open Access Publisher, 2011, 259–286
phdthesis Efficient Distributed Load Balancing for Parallel Algorithms
Università degli Studi di Salerno, Italy, 2011
techreport Synergy Effects of Hybrid CPU-GPU Architectures for Interactive Parallel Ray Tracing
Science and Supercomputing in Europe, Research Highlights 2009. HPC-Europa2 Technical Reports, 2009, 61
CINECA, ISBN 978-88-86037-23-5
article Experiences with Mesh-like computations using Prediction Binary Trees
Scalable Computing: Practice and Experience, Scientific international journal for parallel and distributed computing (SCPE), 2009, 10, 173–187
article SambVca: A Web Application for the Calculation of the Buried Volume of N-Heterocyclic Carbene Ligands
European Journal of Inorganic Chemistry, 2009, 1759–1766
techreport Evaluation of Adaptive Subdivision Schemas for Parallel Ray Tracing
HPC-Europa: Science and Supercomputing in Europe, Technical Reports 2008, 2008, 206–216
CINECA, ISBN 978-88-86037-22-8
inproceedings Load Balancing in Mesh-like Computations using Prediction Binary Trees
7th International Symposium on Parallel and Distributed Computing (ISPDC), 2008, 139–146
inproceedings On Estimating the Effectiveness of Temporal and Spatial Coherence in Parallel Ray Tracing
Eurographics Italian Chapter Conference (EGITA), 2008, 97–104
inproceedings A Survey on Exploiting Grids for Ray Tracing
Eurographics Italian Chapter Conference (EGITA), 2008, 89–96