SBIR-STTR Award

Refactor++: Automated Support for Program Enhancements
Award last edited on: 9/16/2013

Sponsored Program
SBIR
Awarding Agency
DOE
Total Award Amount
$1,149,869
Award Phase
2
Solicitation Topic Code
-----

Principal Investigator
Ira D Baxter

Company Information

Semantic Designs Inc

8101 Asmara Drive
Austin, TX 78750
   (512) 250-1018
   info@semanticdesigns.com
   www.semdesigns.com
Location: Single
Congr. District: 31
County: Williamson

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2012
Phase I Amount
$149,869
C++ is a key software technology for programming embedded systems and sophisticated applications, widely used for mathematical modeling codes fundamental to modern physics and engineering. Such codes are complex, often requiring high performance, and are built over long periods as scientists come and go. A significant problem that delays obtaining results for science applications is the scientist's task of understanding and modifying such applications. Often, functional changes to the code require structural modifications, commonly called refactorings. C++ is notoriously difficult to manipulate mechanically. Existing C++ refactoring tools are unreliable and have inappropriate functionality for use with C++ modeling codes. Thus they are unusable in practice for these applications. In this SBIR proposal, Semantic Designs (SD) will develop Refactor++, an effective C++ refactoring tool using SD's C++ front end and its DMS program transformation foundation. Refactor++ will operate on large, complex, real C++ modeling codes supporting all facilities of C++ including preprocessing, the proposed C++0x standard, OpenMP and MPI libraries. Analysis and transformation of multi-million line C++ code bases demand considerable random-access storage and are computationally expensive. SD will research and implement state-of-the-art global flow analysis algorithms relevant to C++0x and will develop parallelism constructs that are key to supporting accurate analysis and refactorings. SD will implement a number of traditional refactorings (Rename, Move) and some focused specifically on scientific computing (Abstract to Template, Parallelize Blocks, De-clone, Infer Pre/Post conditions). Because of the synergy of flow analysis to support refactoring with program analysis, SD will provide additional program analysis tools to the scientist, from "where is X used?" to "what computations feed into X?" (program slices) and "why can't I parallelize A and B?"
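As an illustration of the kind of change an "Abstract to Template" refactoring might make, the hand-written sketch below shows two near-duplicate functions and a single template that could replace them. It is not output from Refactor++; the names sum_double, sum_float and sum_of are hypothetical and chosen only for this example.

```cpp
#include <vector>

// Before: two clones that differ only in the element type.
double sum_double(const std::vector<double>& xs) {
    double total = 0;
    for (double x : xs) total += x;
    return total;
}

float sum_float(const std::vector<float>& xs) {
    float total = 0;
    for (float x : xs) total += x;
    return total;
}

// After: one template parameterized on the element type.
// Callers of sum_double/sum_float could be redirected to sum_of<T>.
template <typename T>
T sum_of(const std::vector<T>& xs) {
    T total = T();  // value-initialized: zero for arithmetic types
    for (const T& x : xs) total += x;
    return total;
}
```

The refactoring is only safe if the clones really do differ in nothing but the type, which is the sort of fact the abstract expects flow analysis to establish before the tool makes the change.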
SD will produce Refactor++ by integrating this analysis and refactoring capability with interactive editing under several popular editors/IDEs widely used for C++ development platforms in the physics world.

Commercial Applications and Other Benefits:
Scientists, physicists and software engineers using this tool will be able to modify and enhance their applications with less effort and higher reliability. Refactor++ will help them produce more understandable code, shorten their development cycles, and develop codes that are more readily enhanced and maintained over their long lifetimes. Scientists will be better equipped to acquire a deeper understanding of fundamental physics.

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
2012
Phase II Amount
$1,000,000
Science and engineering depend on evolving ever more complex software, typically coded in C++, to investigate phenomena or design sophisticated products. The rate of delivered value for such software is often limited by its organization, and scientists/engineers spend significant effort trying to understand the code organization and restructuring the code to enable the next interesting concept or experiment to be integrated in a trustworthy fashion. This project will build Refactor++, a tool to help engineers restructure their C++ code by applying highly automated interactive support. Reliably modifying computer software is hard. We leverage a commercial program transformation system, DMS, to handle the additional difficulties of C++: complex syntax and semantics (significantly extended by the recent C++11 standard), and DMS's underlying parallel language, PARLANSE, to handle the problem caused by the growing size (millions of lines) of C++ applications. We invent code parsing algorithms to handle the pervasive problem of capturing preprocessor directives. We encode significant knowledge about C++ into program analysis tools. We build specialized refactorings for common code restructuring tasks using the analysis tools to ensure that automated code changes made by Refactor++ are reliable, so users can apply them with impunity. Many-core parallelism of burgeoning workstations will be used to reduce response times for difficult analyses of large applications, by scaling PARLANSE from 32 to 64 bits and 64 cores. To enable widespread adoption and therefore benefit, we leverage existing IDEs as front-ends for Refactor++. We found a novel means to parse and retain C++ preprocessor conditionals, enabling analysis tools to process the code as the user sees it; a patent is being filed. We enhanced the existing tool's C++98 front end to parse all of C++11, as well as OpenMP directives used in scientific computing.
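To see why retaining preprocessor conditionals matters, consider the hand-written sketch below. USE_OPENMP is a hypothetical configuration macro and dot is a hypothetical function; neither comes from the project. A tool that runs the preprocessor before parsing sees only one of the two loops, so renaming acc would leave the other branch stale. A tool that keeps both branches can update every occurrence in the code as the user sees it.

```cpp
#include <cstddef>
#include <vector>

// Dot product with two configuration-dependent loop variants.
// USE_OPENMP is a hypothetical build flag, assumed for illustration only.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double acc = 0.0;
#ifdef USE_OPENMP
    // Parallel variant: a signed loop index keeps older OpenMP versions happy.
    #pragma omp parallel for reduction(+:acc)
    for (long i = 0; i < static_cast<long>(a.size()); ++i)
        acc += a[i] * b[i];
#else
    // Serial variant: compiled when USE_OPENMP is not defined.
    for (std::size_t i = 0; i < a.size(); ++i)
        acc += a[i] * b[i];
#endif
    return acc;
}
```

Both branches mention acc, a and b, so a reliable Rename has to treat the file as containing both loops at once, regardless of which configuration is currently being built.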
A control flow analyzer was built for C++98 to gauge the effort of building a full flow analyzer; that analyzer was harnessed to find certain types of inefficient C++ operations invisible to programmers. An initial robust Renaming refactoring was implemented for a wide variety of C++ entities. An architecture for a full generic client-server based Refactor++ tool was designed. A plan for engineering a 64-bit PARLANSE to run on both Windows and Linux systems was devised. Symbol table, control and dataflow analysis algorithms that account for preprocessor conditionals will be implemented for full C++11. A variety of useful analyzers (locate precise definition/use, find dead/duplicate code, expose potential parallelism) and refactoring actions (rename, move code entity, restructure #include nests, remove duplicated code) will be implemented on top of a structure editor embedded in a Refactoring server. A widely available client IDE (Eclipse) will be integrated with the Refactoring server. A 64-bit PARLANSE compiler ecosystem will be built using DMS's present transformation capability as the compiler foundation.

Commercial Applications and Other Benefits:
Modern science, ever more dependent on supercomputing, will produce results faster, improving US society and economic prowess. Already-ubiquitous embedded systems can provide more capability in shorter engineering windows. Parallel computing will become somewhat easier to accomplish due to engineers' insight into the operation of their software, and restructuring to avoid misunderstandings and coding flaws. The infrastructure supporting Refactor++ will be applicable to other computer languages, eventually benefiting the entire software development community.