SBIR-STTR Award

An Easy-To-Use and Powerful Tool for Improved Rosetta Comparative Modeling of Proteins
Award last edited on: 3/23/2017

Sponsored Program
SBIR
Awarding Agency
NIH : NIGMS
Total Award Amount
$224,908
Award Phase
1
Solicitation Topic Code
-----

Principal Investigator
Yifan Song

Company Information

Cyrus Biotechnology Inc (AKA: Cyrus Biotech)

500 Union Street Suite 320
Seattle, WA 98109
   (206) 258-6561
   info@cyrusbio.com
   www.cyrusbio.com/
Location: Single
Congr. District: 07
County: King

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2016
Phase I Amount
$224,908
In this project we aim to improve on the state-of-the-art homology modeling pipeline in the Rosetta software package and to strengthen its capability in modeling large proteins and in refining structures against near-atomic-resolution density data and sparse NMR data. We will also develop a graphical user interface (GUI) and an easy-to-use backend that will allow a user to easily set up modeling tasks and access large amounts of computing. The success of this project will facilitate drug design and other applications requiring accurate computational models. It will also provide accuracy estimates to inform the user's trust in the output model. The software developed here establishes a framework in which both academic and commercial users, who may lack an extensive computational background or the time to learn a complex new command-line tool, can interact with the Rosetta modeling software package via a GUI. The three overlapping areas to be investigated here are:
1. Improving homology modeling methods. We will further develop the broken chain kinematics system incorporated in RosettaCM and benchmark it against a large dataset collected from previous CASP and CAMEO experiments. A more hierarchical kinematics system will be developed for modeling with known contact information and tested using a dataset with sparse NMR data.
2. Graphical user interface (GUI). One of the challenges of using a modeling software package such as Rosetta is that it requires a large amount of prior training in computer science, including basic Linux skills, software compilation, simple scripting and tabular data manipulation. We will develop a powerful GUI so that interactions with the Rosetta software package become much easier, without sacrificing user control over key aspects of modeling.
3. Cloud computing. Large computing resources are necessary to achieve massive amounts of sampling during structure modeling, and this often leads to more accurate results. However, access to such computing resources is scarce, and they are expensive to deploy locally. By developing a deployment mechanism that can be delivered either on a local cluster or via cloud computing, we can ensure that tasks requiring large amounts of sampling are easily accessible to all scientific users.

Public Health Relevance Statement:
This project addresses several challenges in making the Rosetta protein modeling package more general and accessible. While Rosetta has shown its capability in many structural biology tasks, applying it flexibly to a broader range of tasks outside academic settings remains a major challenge. We will develop RosettaCM to be more interactive and improve the kinematics system used in RosettaCM. With the addition of a user interface and cloud computing capability, users will be able to easily add input data, adjust parameters and access large amounts of computational power. Building on this project, additional features will be added continuously so that both academic and commercial users can access other applications of the Rosetta modeling and design software package.

Project Terms:
Address; Algorithms; Antibodies; Area; base; Benchmarking; Cloud Computing; comparative; Complex; computer science; Computer Simulation; Computer software; computing resources; Data; Data Set; density; Development; Docking; Drug Design; drug discovery; Drug Targeting; electron density; Ensure; flexibility; Goals; graphical user interface; High Performance Computing; Homology Modeling; improved; kinematics; Knowledge; Lead; Learning; Linux; Methodology; Methods; model design; Modeling; Motion; Occupations; Output; Performance; Phase; Positioning Attribute; Process; protein structure; protein structure prediction; Proteins; public health relevance; Relative (related person); research study; Resolution; restraint; Sampling; Scheme; Scientist; Sequence Alignment; skills; small molecule; software development; structural biology; Structure; success; System; Testing; Time; tool; Training; Trust; Validation

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----