SBIR-STTR Award

E4S: Extreme-Scale Scientific Software Stack for Commercial Clouds
Award last edited on: 9/5/22

Sponsored Program
SBIR
Awarding Agency
DOE
Total Award Amount
$250,000
Award Phase
1
Solicitation Topic Code
C53-02b
Principal Investigator
Nicholas Chaimov

Company Information

ParaTools Inc

1900 Millrace Drive Suite 104 Mailbox #1
Eugene, OR 97405
   (541) 913-8797
   info@paratools.com
   www.paratools.com
Location: Single
Congr. District: 04
County: Lane

Phase I

Contract Number: DE-SC0022502
Start Date: 2/14/22    Completed: 2/13/23
Phase I year
2022
Phase I Amount
$250,000
The software used in High Performance Computing (HPC) and Artificial Intelligence/Machine Learning (AI/ML) workloads is increasingly complex to maintain, install, and optimize. More problematic is the poor performance portability of applications between platforms, which forces site-specific re-engineering of codes. Existing solutions for deploying AI/ML workflows on commercial cloud environments are platform-specific, preventing migration from one cloud provider to another. This project proposes to address the problem by combining E4S, which provides multi-platform container images, with MVAPICH2, a high-performance and performance-portable MPI library for fast inter- and intra-node communication on AWS and other commercial cloud platforms. Phase I will evaluate the feasibility of this solution and build prototypes for evaluation. We will evaluate the use of MVAPICH2 to provide high-performance deployments of MPI applications on cloud platforms; build high-performance versions of commonly used Deep Learning frameworks for cloud deployment; make use of high-speed network adapters and GPUs within the cloud environments; and evaluate the creation of a web interface for one-click deployment of highly performant Deep Learning applications. The success of our Phase I project will deliver a productive platform for transitioning important HPC applications (many developed in DOE national laboratories) to more accessible cloud-based HPC platforms in a portable manner while retaining high performance. It will benefit practically all scalable HPC applications, ranging from modeling and simulation to AI/ML, where advanced message communication hardware and access to accelerator technologies are increasingly supported in commercial cloud systems. In particular, data analytics and deep learning are areas of high growth and of benefit to a broad range of industries.
High performance is critical for these codes: a poorly performing code wastes compute resources, prevents purchased hardware from being used for other purposes, increases a business’s costs for cloud computing resources, and increases time to solution. This project will especially benefit the deep learning market by making deployment of applications on cloud platforms easier, facilitating portability between cloud platforms while maintaining performance, and reducing training time for deep learning models. Efficient use of pay-per-core-hour resources like public clouds reduces costs to users along with energy consumpt

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----