SBIR-STTR Award

NamesforLife Semantic Resolution Services for the Life Sciences (N4L-SRS)
Award last edited on: 12/5/2008

Sponsored Program
STTR
Awarding Agency
DOE
Total Award Amount
$849,904
Award Phase
2
Solicitation Topic Code
-----

Principal Investigator
George M Garrity

Company Information

NamesForLife LLC

University Place, Suite 202, 333 Albert Avenue
East Lansing, MI 48823
   (517) 214-8821
   garrity@namesforlife.com
   www.names4life.com

Research Institution

----------

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2007
Phase I Amount
$99,904
Within the Genomes-to-Life Roadmap, there is a lack of standardized semantics to accurately describe data objects and persistently express knowledge change over time. As research methods and biological concepts evolve, certainty about the correct interpretation of prior data and published results decreases, because both become overloaded with synonymous and polysemous terms. NamesforLife (N4L) is a novel technology designed to solve this problem. The core of the technology is an ontology, an XML schema, and an expertly managed vocabulary coupled with Digital Object Identifiers (DOIs), which form a transparent semantic resolution service. The service disambiguates terminologies, makes them actionable, and presents them to end-users in the correct context. A working model of the N4L technology was built to validate concepts and gain new insights into the complexities of dynamic vocabularies. In this project, the working model will be reduced to a service that can automatically annotate occurrences of names in the scientific literature and databases. The approach will: (1) transfer the current model into a more suitable environment, to simplify updating and on-the-fly generation of N4L information objects; (2) develop tagging rules to embed links from N4L information objects into on-line content; (3) enable multiple resolution through the server; (4) develop mini-monographs as an improved human interface to N4L; and (5) develop additional infrastructure to support on-the-fly translation of N4L tagged data in published content. The initial target will be the International Journal of Systematic and Evolutionary Microbiology, the publication of record for nomenclatural changes for bacteria and archaea.
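
As an illustration of the tagging step described in item (2) above, the following is a minimal sketch of how occurrences of managed names in text might be wrapped in links to a resolution service. It is not the N4L implementation. The vocabulary entries, the identifiers, the resolver URL, and the function names are all hypothetical placeholders chosen for this example.

```python
import re

# Hypothetical vocabulary: managed name -> placeholder identifier.
# These are NOT real organism names or real DOIs.
VOCABULARY = {
    "Exemplum primum": "10.0000/example.name.1",
    "Exemplum": "10.0000/example.name.2",
}


def build_pattern(names):
    """Compile one regex that matches any managed name as a whole word."""
    # Longest names first, so a two-word name wins over its first word alone.
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(" + alternatives + r")\b")


def annotate(text, vocabulary=VOCABULARY, resolver="https://resolver.example/"):
    """Wrap each occurrence of a managed name in a link to the resolver.

    The resolver URL is an assumed placeholder, not a real service.
    """
    if not vocabulary:
        return text
    pattern = build_pattern(vocabulary)

    def link(match):
        name = match.group(1)
        return f'<a href="{resolver}{vocabulary[name]}">{name}</a>'

    return pattern.sub(link, text)


if __name__ == "__main__":
    print(annotate("Strains of Exemplum primum were isolated."))
```

Matching the longest name first is one simple way to keep a two-word name from being tagged as its first word only. A production tagger would also need to handle abbreviations, markup already present in the content, and names that change meaning over time.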

Commercial Applications and Other Benefits as described by the awardee:
The N4L technology would enable end-users to spend substantially less time dealing with the ambiguity of biological names and strain identifiers and more time focused on gaining knowledge. In addition to the DOE, the technology should be useful to providers of diagnostic instrumentation and identification kits, service laboratories, and managers of commercial or public databases used for microbial identification and classification. Furthermore, the N4L model should find use in applications where a terminology and the associated concepts or objects defined by that terminology diverge over time, including medical informatics (for resolving procedural codes used in medical insurance and for tracking the chemical and trade names of pharmaceutical products) and manufacturing or warehousing (such as managing complex inventories of equivalent electronic and mechanical parts sourced from different manufacturers).

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
2008
Phase II Amount
$750,000
Within the Genomes-to-Life Roadmap, the DOE recognizes that a significant barrier to effective communication in the life sciences is a lack of standardized semantics that accurately describe data objects and persistently express knowledge change over time. As research methods and biological concepts evolve, certainty about the correct interpretation of prior data and published results decreases because both become overloaded with synonymous terms (multiple terms for a single concept) and polysemous terms (single terms with multiple meanings). Ambiguity in rapidly evolving terminology is a common and chronic problem in science and technology. NamesforLife (N4L) is a novel technology designed to solve this problem. The Phase I project was based on a prototype that demonstrated that names, concepts, and the objects to which names apply must be treated independently. As proof of principle, a preliminary data model and XML schema were developed and a simple semantic resolver was deployed. In Phase I, that model was substantially refined to address limitations of the prototype. The Phase II project will extend the scope of data curation and build a framework for distributing information services to users. N4L consolidates references to different kinds of data about the same organism, tracking the state of knowledge about it over time. For selected organisms, this aspect of the technology will be applied to a range of genomic and phenotypic data deposits. N4L’s information services are made available to users directly within the text they are reading. Phase II will achieve widespread deployment of these services, which are specific to the material being viewed by the user.
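
To make the idea of treating names and concepts independently more concrete, the following is a minimal sketch of a time-aware lookup. It is an assumption-laden illustration, not the N4L data model or XML schema. The class, the function names, and every record in USAGES are hypothetical, and the objects to which names apply (such as strains) are omitted for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class NameUsage:
    """One period during which a name referred to a concept (hypothetical)."""
    name: str
    concept_id: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the usage is still current


# Hypothetical records: "Alpha" changes meaning in 2000 (polysemy over time),
# and "Beta" becomes a second name for concept C1 (synonymy).
USAGES = [
    NameUsage("Alpha", "C1", date(1990, 1, 1), date(2000, 1, 1)),
    NameUsage("Alpha", "C2", date(2000, 1, 1)),
    NameUsage("Beta", "C1", date(2000, 1, 1)),
]


def resolve(usages: List[NameUsage], name: str, on: date) -> List[str]:
    """Return the concepts a name referred to on a given date."""
    return [
        u.concept_id
        for u in usages
        if u.name == name
        and u.valid_from <= on
        and (u.valid_to is None or on < u.valid_to)
    ]


def names_for(usages: List[NameUsage], concept_id: str) -> List[str]:
    """Return every name ever applied to a concept, sorted alphabetically."""
    return sorted({u.name for u in usages if u.concept_id == concept_id})


if __name__ == "__main__":
    print(resolve(USAGES, "Alpha", date(1995, 6, 1)))
    print(resolve(USAGES, "Alpha", date(2005, 6, 1)))
    print(names_for(USAGES, "C1"))
```

Because the name and the concept are stored as separate fields joined by a dated record, the same name can be resolved differently depending on when a document was written, which is the kind of context the abstract describes.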

Commercial Applications and Other Benefits as described by the awardee:
N4L’s proposed data and convenient applications will bring semantic accuracy to bioinformatics practice while simultaneously enabling new business models.