SBIR-STTR Award

Productive Large Scale Personal Computing: Fast Multipole Methods on GPU/CPU Systems
Award last edited on: 1/14/2016

Sponsored Program
SBIR
Awarding Agency
NASA : ARC
Total Award Amount
$70,000
Award Phase
1
Solicitation Topic Code
S8.02
Principal Investigator
Nail A Gumerov

Company Information

Fantalgo LLC

7496 Merrymaker Way
Elkridge, MD 21075
   (301) 332-2507
   ramani.d@gmail.com
   www.fantalgo.com
Location: Single
Congr. District: 02
County: Howard

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2007
Phase I Amount
$70,000
To be used naturally in design optimization and parametric studies, and to achieve a short total time-to-solution, simulation must be personally available to the scientist or engineer, as readily as email or word processing. Environments such as Matlab and IDL are easy to use, but unless simulations run extremely fast they cannot be used in this way. Many large-scale numerical calculations, including linear algebra operations such as solving dense linear systems and computing eigenvalues and eigenvectors, require storage and computation that grow as the square or cube of the number of variables. Fast algorithms such as the fast multipole method (FMM), coupled with iterative methods, allow many problems of interest to be solved in near-linear time and memory. We have taken a leadership role in applying and extending the FMM to problems in acoustics, fluid flow, electromagnetics, function fitting, and machine learning. Graphics processing units (GPUs), special-purpose processors for graphics, are now ubiquitous in game consoles, workstations, and other devices, and are predicted to soon reach performance in the hundreds of gigaflops for specialized calculations (much faster than COTS PCs) at low price points. It is now conceivable to equip personal workstations with several CPUs and GPUs and to solve problems with millions or billions of variables quickly using fast algorithms. We will take an important algorithm with wide applicability, the FMM, implement it on the widely available heterogeneous CPU/GPU architecture, and demonstrate the feasibility of accelerating it tremendously. Our approach is based on a fundamental reconsideration of the algorithm that maps each piece onto the appropriate part of the architecture. The developed software will be tested and benchmark problems solved. A software library that supports porting the FMM and other scientific computing codes to the CPU/GPU architecture will also be developed.
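
For context, the quadratic growth mentioned in the abstract can be illustrated with a minimal sketch of the brute-force baseline that the FMM is designed to accelerate. This is not the awardee's code: the 1/r kernel, the function names `direct_potentials` and `random_cloud`, and the problem sizes are assumptions chosen only for illustration. The sketch sums pairwise interactions directly and counts kernel evaluations, which come to N(N-1) for N points.

```python
import math
import random


def direct_potentials(points, charges):
    """Brute-force sum phi[i] = sum over j != i of q_j / |x_i - x_j|.

    Illustrative baseline only (assumed 1/r kernel). Returns the potentials
    and the number of kernel evaluations, which is n * (n - 1).
    """
    n = len(points)
    phi = [0.0] * n
    evaluations = 0
    for i in range(n):
        xi, yi, zi = points[i]
        for j in range(n):
            if i == j:
                continue  # skip the singular self-interaction
            xj, yj, zj = points[j]
            r = math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)
            phi[i] += charges[j] / r
            evaluations += 1
    return phi, evaluations


def random_cloud(n, seed=0):
    """Hypothetical test data: n random points in the unit cube with random charges."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
    charges = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return points, charges


if __name__ == "__main__":
    # Doubling n roughly quadruples the work of the direct method.
    for n in (100, 200, 400):
        pts, q = random_cloud(n)
        _, evals = direct_potentials(pts, q)
        print(n, evals)
```

Doubling the number of points roughly quadruples the evaluation count here, whereas a method with near-linear cost would only about double it; that gap is what makes problems with millions of variables impractical for the direct approach.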

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----