DOE 2022 From Lab to Algorithm: Cloud-based Biological Data Preparation, Tracking, and Checking for AI-readiness

From Lab to Algorithm: Cloud-based Biological Data Preparation, Tracking, and Checking for AI-readiness
Award last edited on: 9/5/22

Awarding Agency

DOE

Total Award Amount

$249,878

Award Phase

Solicitation Topic Code

C53-01a

Principal Investigator

Anastasia Deckard

Geometric Data Analytics

636 Rock Creek Road
Chapel Hill, NC 27514

(919) 448-7871

N/A

www.geomdata.com

Location: Single
Congr. District: 04
County: Orange

Phase I

Contract Number: DE-SC0022400
Start Date: 2/14/22 Completed: 2/13/23

Phase I year

2022

Phase I Amount

$249,878

In large-scale analysis of biological systems, moving from data to analysis to interpretable results is generally very slow, and failures are not discovered until the end of the process. Complicated, evolving data is run through a variety of changing analysis scripts without standardization or provenance tracking, which causes reproducibility problems. These problems hurt the quality, speed, and cost of biological data analysis projects. We propose a software system that addresses them consistently, flexibly, and proactively, and that produces analysis-ready or AI/ML-ready biological data and metadata. To make analysis faster and easier, we propose a cloud-based microservice architecture that provides services and pipelines for users. To find and reduce data issues, the system will include services for data standardization, geometrical/statistical analysis, and data summarization. To increase the reproducibility of results, we propose tracking data provenance and algorithm versions and assigning data identifiers.
In Phase I, we propose to build the cloud-based system, in which microservices are executed by configurable pipelines and provenance and versioning are tracked. We plan to provide an initial set of services that standardize data, metadata, and QC/QA data; identify potential data issues; quantify performance issues; and locate the sources of issues. We will begin with two data types, transcriptomics and proteomics, while keeping the system flexible enough to add more types later.
This system would be used by medium to large corporations and government research labs that work in applied areas such as biomedical, pharmaceutical, or agbio research. Their researchers work on topics such as diagnostics, synthetic biology, and drug development, and process genomics, transcriptomics, and proteomics data. Decreasing the time from experiment to results creates an economic benefit for both companies and the public. Researchers spend less and get results faster, which saves them both money and time. These savings can also benefit the public, who receive new products more quickly. Detecting issues in the data increases the correctness of a company's results, improves the quality of the product for consumers, and improves consumers' confidence in the company.
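As an illustration of the idea of configurable pipelines that track provenance, algorithm versions, and data identifiers, the following is a minimal Python sketch. It is not the awardee's implementation: the names `data_id`, `Step`, `Pipeline`, `standardize`, and `flag_all_zero`, the version strings, and the toy input are all hypothetical, and the sketch assumes that a content hash can serve as a data identifier and that each step is a plain function.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable


def data_id(obj: Any) -> str:
    """Hypothetical data identifier: SHA-256 of the canonical JSON form."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


@dataclass
class Step:
    """One service in the pipeline, with an explicit algorithm version."""
    name: str
    version: str
    func: Callable[[Any], Any]


@dataclass
class Pipeline:
    """Runs steps in order and records one provenance entry per step."""
    steps: list
    provenance: list = field(default_factory=list)

    def run(self, data: Any) -> Any:
        for step in self.steps:
            input_id = data_id(data)
            data = step.func(data)
            self.provenance.append({
                "step": step.name,
                "version": step.version,
                "input_id": input_id,
                "output_id": data_id(data),
            })
        return data


def standardize(table: dict) -> dict:
    """Toy standardization: normalize feature names, coerce values to float."""
    return {name.strip().lower(): [float(v) for v in values]
            for name, values in table.items()}


def flag_all_zero(table: dict) -> dict:
    """Toy QC check: report features whose measurements are all zero."""
    issues = sorted(name for name, values in table.items() if not any(values))
    return {"table": table, "issues": issues}


# Toy input (made up for illustration): feature name -> measurements.
raw = {" GeneA ": ["1", "2.5"], "GeneB": [0, 0]}
pipeline = Pipeline([
    Step("standardize", "0.1.0", standardize),
    Step("flag_all_zero", "0.1.0", flag_all_zero),
])
result = pipeline.run(raw)
```

In this sketch, each provenance entry links an input identifier to an output identifier under a named step and version, so the output of one step can be matched to the input of the next and a result can be traced back to the raw data that produced it.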

Phase II

Contract Number: ----------
Start Date: 00/00/00 Completed: 00/00/00

Phase II year

----

Phase II Amount

----

SBIR-STTR Award
