SBIR-STTR Award

A Cloud-Based WGS Platform for Establishing Phylogeny Within Epidemic Outbreaks At Hospitals
Award last edited on: 1/22/20

Sponsored Program
SBIR
Awarding Agency
NIH : NIAID
Total Award Amount
$279,815
Award Phase
1
Solicitation Topic Code
-----

Principal Investigator
Srini S Iyer

Company Information

IcBiome Inc

23292 Meadowvale Glen Court
Sterling, VA 20166
   (703) 283-7768
   N/A
   www.icbiome.com
Location: Single
Congr. District: 10
County: Loudoun

Phase I

Contract Number: 1R43AI143267-01
Start Date: 1/1/19    Completed: 6/30/20
Phase I year
2019
Phase I Amount
$279,815
Outbreaks have become a significant public health threat in hospitals, both endangering patient lives and creating a financial burden. With antibiotic resistance on the rise, new interventions are urgently needed to contain ongoing outbreaks. Recent studies have confirmed that whole genome sequencing (WGS) is able to identify unique mutations within each outbreak strain, which can then be used to establish transmission routes. However, significant bioinformatics challenges exist in using WGS for outbreak analyses. WGS reads are inherently noisy, and traditional read-mapping techniques require the careful selection of quality criteria to identify and remove artefactual single-nucleotide polymorphisms (SNPs). This is a challenge because often only a single SNP separates two outbreak isolates. The result is a high barrier to routine adoption of genomics-based interventions during outbreaks. In this proposal, we look to develop a fully automated cloud-based bioinformatics platform that can be rapidly leveraged in the event of an outbreak. Rather than relying on current read-mapping approaches to curate erroneous assemblies, our platform will adopt a different methodology that uses new Amazon Web Services (AWS) cloud computing components such as AWS Lambda and DynamoDB. In a preliminary evaluation, we performed a manual analysis of our approach by processing raw sequence data from two hospital outbreaks (Enterococcus faecium and Escherichia coli). In both cases, our results matched the published phylogeny that was derived using read-mapping. This manual evaluation strengthens the premise of our approach because, in the first outbreak, only a single non-synonymous SNP separated the three strains involved. Our cloud-based diagnostics framework will be implemented in the AWS cloud. It will be evaluated against public NCBI data from several major hospital outbreaks.
Our Phase I benchmark is to complete all analyses for each outbreak within one hour. Our Phase I aims are: 1) Develop an assembly module that assembles all the raw sequence data and then identifies artefactual SNPs within each outbreak assembly; 2) Develop a biomarker module that establishes unique biomarker sequences by removing both artefactual SNPs and low-quality SNPs; and 3) Develop a control module that combines the results of the assembly module and the biomarker module to establish phylogeny. During Phase II development, we plan to formally evaluate the platform by sequencing bacterial cultures from historical outbreaks and then uploading the sequence data to our cloud bioinformatics platform.
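
The abstract does not describe how the modules are implemented. As a purely illustrative sketch of the general idea behind the three aims (flag artefactual SNPs, drop artefactual and low-quality SNPs, then compare isolates), the Python below filters per-isolate SNP calls and computes pairwise SNP distances. The data layout, the function names (`filter_snps`, `snp_distance`, `distance_matrix`), the quality threshold of 30, and the toy isolates are all assumptions made for this example; none of them come from the award record, and no AWS services are used here.

```python
from itertools import combinations

# Hypothetical input format: for each isolate, a dict mapping a genome
# position to a (base, quality) tuple. This layout is an assumption.


def filter_snps(calls, artefact_positions, min_quality=30):
    """Drop SNPs that are flagged as artefactual or fall below min_quality."""
    return {
        pos: base
        for pos, (base, qual) in calls.items()
        if pos not in artefact_positions and qual >= min_quality
    }


def snp_distance(a, b):
    """Count positions where two isolates differ.

    A position absent from one isolate is treated as matching the reference,
    so it counts as a difference if the other isolate carries a SNP there.
    """
    return sum(1 for pos in set(a) | set(b) if a.get(pos) != b.get(pos))


def distance_matrix(isolates):
    """Pairwise SNP distances keyed by (name_1, name_2), names sorted."""
    return {
        (x, y): snp_distance(isolates[x], isolates[y])
        for x, y in combinations(sorted(isolates), 2)
    }


# Toy data, invented for illustration only.
raw = {
    "isolate_A": {101: ("T", 40), 250: ("G", 12)},  # 250 is low quality
    "isolate_B": {101: ("T", 38), 777: ("C", 45)},
    "isolate_C": {101: ("T", 41), 500: ("A", 50)},  # 500 is artefactual
}
artefacts = {500}

clean = {name: filter_snps(calls, artefacts) for name, calls in raw.items()}
matrix = distance_matrix(clean)

for pair, dist in matrix.items():
    print(pair, dist)
```

In this toy case, a single surviving SNP (position 777) separates isolate_B from the other two, which mirrors the abstract's point that one SNP can be the only signal available. A real pipeline would feed such a distance matrix into a tree-building method; that step is omitted here.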

Public Health Relevance Statement:
Hospital outbreaks are a significant public health burden. While whole genome sequencing shows promise in controlling outbreaks, significant bioinformatics challenges exist as only a few mutations separate outbreak strains. Current methodologies for establishing these biomarker mutations require a complex set of tools and approaches that often need manual validation. In this proposal, a different approach is outlined for curating erroneous assemblies and establishing outbreak phylogeny. Our bioinformatics platform will be cloud-based and can be rapidly leveraged by any hospital that has an ongoing outbreak. Our phylogeny results will allow infection control to identify the transmission routes between the outbreak strains and take corrective actions.

Project Terms:
Adopted; Adoption; Antibiotic Resistance; Award; base; Benchmarking; Bioinformatics; Biological Markers; Clinical; cloud based; Cloud Computing; cloud platform; Cloud Service; Complex; cost; Data; Data Analytics; data hub; data integration; Development; Diagnostic; Disease Outbreaks; Ensure; Enterococcus faecium; Epidemic; Escherichia coli; Evaluation; Event; Financial Hardship; genome sequencing; Genomic Segment; Genomics; Hospitals; Hour; indexing; Individual; Infection Control; innovation; Intervention; Manuals; Methodology; Microbiology; Mutation; National Institute of Allergy and Infectious Disease; parallel computer; parallel processing; pathogen genomics; pathogenic bacteria; Patients; Phase; Phylogeny; Plasmids; Play; preservation; Process; Public Health; Publishing; Raja; reference genome; Role; Route; sequencing platform; Series; Side; Single Nucleotide Polymorphism; Small Business Innovation Research Grant; System; Techniques; Time; tool; transmission process; United States National Institutes of Health; Validation; Visualization software; web services; whole genome

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----