SBIR-STTR Award

The NamesforLife Semantic Index of Phenotypic and Genotypic Data for Systems Biology
Award last edited on: 6/10/2021

Sponsored Program
STTR
Awarding Agency
DOE
Total Award Amount
$2,084,833
Award Phase
2
Solicitation Topic Code
34 b
Principal Investigator
George M Garrity

Company Information

NamesForLife LLC

University Place Suite 202 333 Albert Avenue
East Lansing, MI 48823
(517) 214-8821
garrity@namesforlife.com
www.names4life.com

Research Institution

----------

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2011
Phase I Amount
$100,000
The DOE Systems Biology Knowledgebase (Kbase) was envisioned to provide a framework to support modeling of dynamic cellular processes of microorganisms, plants and metacommunities. The Kbase will provide the tools and data to permit rapid iteration of experiments that draw on a variety of data types and allow end-users to infer how cells and communities respond to natural or induced perturbations, and ultimately to predict outcomes. The Systems Biology Knowledgebase Implementation Plan defines the needs and priorities for this initiative, which include biofuel production, bioremediation and carbon sequestration. Ultimately, the Kbase will provide a platform for accelerated acquisition of basic and applied biological knowledge. Predictive models depend on high-quality input data. The authors of the Implementation Plan recognize that many different types of data are required to build such models. However, not all data are of similar quality, nor are all of the data amenable to computational analysis without extensive cleaning, interpretation and normalization. Key among those needed to make the Kbase fully operational are phenotypic data, which are more complex than sequence data, occur in a wide variety of forms, often use complex and non-uniform descriptors and are scattered about, principally in the scientific and technical literature or in specialized databases. Incorporating these data into the Kbase will require expertise in harvesting, modeling and interpreting the data. The NamesforLife Semantic Index of Phenotypic and Genotypic Data for Systems Biology seeks to address this problem by taking the first steps toward an ontology of phenotypes for Bacteria and Archaea, based on the taxonomic literature. Phase I of this project will create a draft vocabulary of phenotypic features that will enable integration of normalized phenotypic data into the Kbase in Phase II. This project builds on the N4L technology of NamesforLife, LLC.

Commercial Applications and Other Benefits:
The Company’s data and applications bring enhanced accuracy and clarity of meaning to the life sciences and provide new methods of searching, indexing and abstracting scientific and technical literature.

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
2012
(last award dollars: 2014)
Phase II Amount
$1,984,833

The DOE Systems Biology Knowledgebase (Kbase) was envisioned to provide a framework to support modeling of dynamic cellular processes of microorganisms, plants and metacommunities. The Kbase will provide the tools and data to permit rapid iteration of experiments that draw on a variety of data types and allow end-users to infer how cells and communities respond to natural or induced perturbations, and ultimately to predict outcomes. The Systems Biology Knowledgebase Implementation Plan defines the needs and priorities for this initiative, which include biofuel production, bioremediation and carbon sequestration. Ultimately, the Kbase will provide a platform for accelerated acquisition of basic and applied biological knowledge. Predictive models depend on high-quality input data. The authors of the Implementation Plan recognize that many different types of data are required to build such models. However, not all data are of similar quality, nor are all of the data amenable to computational analysis without extensive cleaning, interpretation and normalization. Key among those needed to make the Kbase fully operational are phenotypic data, which are more complex than sequence data, occur in a wide variety of forms, often use complex and non-uniform descriptors and are scattered about, principally in the scientific and technical literature or in specialized databases. Incorporating these data into the Kbase requires expertise in harvesting, modeling and interpreting the data. The NamesforLife Semantic Index of Phenotypic and Genotypic Data for Systems Biology seeks to address this problem by taking the first steps toward an ontology of phenotypes for Bacteria and Archaea, based on the existing taxonomic literature. In the Phase I project the Company developed software that was subsequently used to extract a list of over 40,000 candidate terms from the taxonomic literature that was used to describe 5,750 type strains of Bacteria and Archaea.
The Company is currently developing a reduced subset of terms that will serve as a draft vocabulary of phenotypic features and enable integration of normalized phenotypic data into the Kbase in Phase II. The work done during the Phase I project significantly extends the core technology of NamesforLife, LLC and allows the Company to work with terminologies other than biological names. In the Phase II study, we propose to deliver a set of normalized terms that can be used to describe phenotypic features in a more consistent and accurate manner and provide direct access to existing resources (e.g., PubChem) where relevant information is available, but not in a readily accessible form. We also propose to apply our proprietary document annotation and semiotic indexing technology to produce a rich Open Access resource of phenotypic information that is accessible to both humans and machines in a variety of forms.

Commercial Applications and Other Benefits:
The Company’s data and applications bring enhanced accuracy and clarity of meaning to the life sciences and provide new methods of searching, indexing and abstracting scientific and technical literature.