SBIR-STTR Award

SMART-PETSc: Smart Middleware for Accelerating PETSc
Award last edited on: 9/5/22

Sponsored Program
SBIR
Awarding Agency
DOE
Total Award Amount
$256,500
Award Phase
1
Solicitation Topic Code
C53-02a
Principal Investigator
Donglai Dai

Company Information

X-ScaleSolutions LLC

750 Deer Run Drive
Columbus, OH 43230
(614) 316-4209
contactus@x-scalesolutions.com
www.x-scalesolutions.com
Location: Single
Congr. District: 03
County: Franklin

Phase I

Contract Number: DE-SC0022423
Start Date: 2/14/22    Completed: 2/13/23
Phase I year
2022
Phase I Amount
$256,500
The efficient parallelization of algebraic solvers for partial differential equations, as exemplified in PETSc, requires nearest-neighbor ghost-point communication (for parallel function evaluations, sparse matrix-vector products, and preconditioner applications) and global reductions (e.g., inner products). The ghost-point communication involves non-contiguous memory access and hence hardware- or software-based packing and unpacking of vector entries. For scalability, the global reductions must be non-blocking to allow the use of pipelined Krylov methods. The major challenge in utilizing GPU systems in an MPI environment for algebraic solvers is simultaneously using all available hardware resources, specifically instruction scheduling (i.e., kernel launches), computation, and communication. Achieving this requires MPI support for streams (to overlap kernel launches with communication and computation), efficient nearest-neighbor collectives over non-contiguous memory, and GPU-efficient non-blocking global reductions (minimal sketches of these communication patterns follow the abstract).

This project will be led by X-ScaleSolutions in collaboration with Dr. Todd Munson (Argonne National Laboratory and leader of the PETSc team), Dr. Victor Eijkhout (Texas Advanced Computing Center and a member of the PETSc team), and Dr. Sameer Shende and Dr. Allen Malony (ParaTools, Inc. and leaders of the TAU team). To address the challenges described above, X-ScaleSolutions proposes to design and develop "SMART-PETSc", a smart middleware for accelerating PETSc on modern high-performance computing hardware. We will work along the following directions:

1) developing optimized datatype-processing techniques for intra-/inter-node transfers using kernel fusion and other novel capabilities of emerging hardware;
2) designing optimized methods to overlap kernel-launch and kernel-execution time, enabling simultaneous use of all available hardware resources;
3) co-designing PETSc and MVAPICH using the MPI_T interface to support user-specified GPU streams (see the MPI_T sketch below);
4) supporting in-network collective communication for high-performance, high-overlap non-blocking CPU-/GPU-based collectives;
5) supporting in-network collective communication for high-performance neighborhood collectives (see the neighborhood-collective sketch below);
6) enabling full-stack observability and explainability through tight integration with the TAU Performance System via the MPI_T interface;
7) measuring the impact of the proposed designs on PETSc end applications; and
8) conducting systematic testing and evaluation to ensure proper integration.

Tasks 1 and 2, parts of Tasks 3 and 6, and the relevant portion of Task 8 will be carried out as part of the Phase I activities.

The transformative impact of the proposed SMART-PETSc product will be to enable the many HPC and DL frameworks/applications that use PETSc APIs to take advantage of novel features and capabilities of emerging hardware technologies and to reap the additive benefits of co-design and tighter integration between the PETSc and MVAPICH libraries. We expect that the solutions SMART-PETSc provides can reduce communication overhead by up to 5x, yielding a significant boost in the performance and scalability of many HPC and DL frameworks/applications that use PETSc APIs. The proposed collaboration will not only enable DOE laboratories to use the SMART-PETSc middleware on upcoming exascale systems, but also provide a commercial version of this middleware for use by supercomputing centers and cloud providers worldwide.
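
The ghost-point exchange described in the abstract is commonly expressed with MPI derived datatypes, which let the MPI library perform the packing and unpacking of non-contiguous vector entries in hardware or software. The following is a minimal sketch under simplified assumptions (a strided layout where every STRIDE-th entry is a boundary value, exchanged in a ring); it illustrates the pattern only and is not PETSc's actual VecScatter implementation.

    /* Sketch: exchanging non-contiguous (strided) ghost entries with an
     * MPI derived datatype, so the MPI library does the pack/unpack.
     * Illustrative only; PETSc's VecScatter machinery is more general. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        enum { N = 64, STRIDE = 8, COUNT = N / STRIDE };
        double local[N], ghosts[COUNT];
        for (int i = 0; i < N; i++) local[i] = rank + 0.001 * i;

        /* Describe every STRIDE-th entry as one datatype: COUNT blocks
         * of length 1, separated by STRIDE doubles. */
        MPI_Datatype strided;
        MPI_Type_vector(COUNT, 1, STRIDE, MPI_DOUBLE, &strided);
        MPI_Type_commit(&strided);

        /* Exchange boundary entries with the next/previous rank in a
         * ring; received values land contiguously in ghosts[]. */
        int next = (rank + 1) % size, prev = (rank + size - 1) % size;
        MPI_Sendrecv(local, 1, strided, next, 0,
                     ghosts, COUNT, MPI_DOUBLE, prev, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        MPI_Type_free(&strided);
        MPI_Finalize();
        return 0;
    }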
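
Pipelined Krylov methods hide the latency of global reductions by starting the inner product as a non-blocking collective and overlapping it with local work. Below is a minimal sketch of that overlap pattern using MPI_Iallreduce; the local_spmv helper is a hypothetical stand-in for real local computation, and the sketch is not PETSc's pipelined solvers (e.g., KSPPIPECG) themselves.

    /* Sketch: overlapping a global dot product with local computation,
     * the core pattern behind pipelined Krylov methods. */
    #include <mpi.h>

    enum { N = 1024 };

    /* Placeholder for local work (e.g., a sparse matrix-vector product)
     * that can proceed while the reduction is in flight. */
    static void local_spmv(const double *x, double *y)
    {
        for (int i = 0; i < N; i++) y[i] = 2.0 * x[i];
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        double x[N], y[N], local_dot = 0.0, global_dot;
        for (int i = 0; i < N; i++) x[i] = 1.0;
        for (int i = 0; i < N; i++) local_dot += x[i] * x[i];

        /* Start the global reduction without blocking ... */
        MPI_Request req;
        MPI_Iallreduce(&local_dot, &global_dot, 1, MPI_DOUBLE, MPI_SUM,
                       MPI_COMM_WORLD, &req);

        /* ... and overlap it with local computation. */
        local_spmv(x, y);

        /* The reduction result is only needed from here on. */
        MPI_Wait(&req, MPI_STATUS_IGNORE);

        MPI_Finalize();
        return 0;
    }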
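
The neighborhood collectives of direction 5 are standard MPI-3: once the process grid is expressed as an MPI topology, a single call such as MPI_Neighbor_alltoallv performs the whole halo exchange, which is the hook that in-network offload can exploit. A minimal sketch on a 1-D periodic Cartesian topology, assuming one boundary value per neighbor:

    /* Sketch: a halo exchange with an MPI-3 neighborhood collective on
     * a 1-D periodic Cartesian topology (two neighbors per rank). */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int size;
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Periodic 1-D process grid: each rank has 2 neighbors. */
        int dims[1] = { size }, periods[1] = { 1 };
        MPI_Comm cart;
        MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, 1, &cart);

        /* One boundary value for each of the two neighbors. */
        double sendbuf[2] = { 1.0, 2.0 }, recvbuf[2];
        int counts[2] = { 1, 1 }, displs[2] = { 0, 1 };

        /* One call exchanges with all topology neighbors at once. */
        MPI_Neighbor_alltoallv(sendbuf, counts, displs, MPI_DOUBLE,
                               recvbuf, counts, displs, MPI_DOUBLE, cart);

        MPI_Comm_free(&cart);
        MPI_Finalize();
        return 0;
    }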
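
Directions 3 and 6 both rely on the MPI_T tools interface, the standard mechanism through which a library such as PETSc or a tool such as TAU can discover and tune an MPI implementation's control variables without recompiling it. The sketch below only enumerates the control variables an implementation exposes; the specific MVAPICH variable for user-specified GPU streams is a design goal of this project, not an existing name, so none is assumed here.

    /* Sketch: listing the control variables (cvars) an MPI
     * implementation exposes through the MPI_T tools interface.
     * A co-designed GPU-stream cvar would appear here once added. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
        MPI_Init(&argc, &argv);

        int num_cvars;
        MPI_T_cvar_get_num(&num_cvars);

        for (int i = 0; i < num_cvars; i++) {
            char name[256], desc[256];
            int name_len = sizeof name, desc_len = sizeof desc;
            int verbosity, bind, scope;
            MPI_Datatype dtype;
            MPI_T_enum enumtype;
            if (MPI_T_cvar_get_info(i, name, &name_len, &verbosity,
                                    &dtype, &enumtype, desc, &desc_len,
                                    &bind, &scope) == MPI_SUCCESS)
                printf("cvar %d: %s\n", i, name);
        }

        MPI_Finalize();
        MPI_T_finalize();
        return 0;
    }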

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----