This project will develop and test a publicly available web interface for executing and monitoring genomic data analysis pipelines described in the Workflow Description Language (WDL). The interface will be hosted on Truwl™ (https://truwl.com), xD Bio™'s community-oriented web application for sharing genomic data analysis methods. Pipeline execution will leverage the reproducibility framework developed by researchers at the Encyclopedia of DNA Elements (ENCODE) consortium Data Coordinating Center, which simplifies running pipelines in cloud environments. The project aims to make analysis methods accessible to and usable by the genomics community and to enable biomedical researchers with limited or no computational expertise to analyze genomic data easily. Briefly, Truwl will be extended with a new web interface containing dynamic forms for specifying pipeline inputs, a backend service will be created to execute and monitor compute jobs, and Google Cloud Platform compute instances will be created and configured with the ENCODE reproducibility framework to run those jobs. The ENCODE Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) pipeline will serve as the test case for executing pipelines from the new web interface, and a range of datasets and input parameters will be used to test the capabilities of the system. This will demonstrate running an ENCODE pipeline from a publicly available web interface for the first time and provide a pattern for making any well-described genomic data analysis pipeline widely available to and usable by the biomedical research community.
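As an illustration of the dynamic forms for specifying pipeline inputs, the following minimal Python sketch shows one way a form description could be derived from a workflow's declared inputs. It is not the Truwl implementation: the `FormField` structure, the `WIDGETS` table, the function names, and the example input names are all hypothetical, and the sketch assumes the inputs are already available as a mapping from input name to WDL type string.

```python
from dataclasses import dataclass

# Hypothetical mapping from WDL primitive types to form widget kinds.
WIDGETS = {
    "String": "text",
    "Int": "number",
    "Float": "number",
    "Boolean": "checkbox",
    "File": "file",
}


@dataclass
class FormField:
    name: str       # fully qualified input name, e.g. "workflow.input"
    widget: str     # widget kind to render
    required: bool  # False when the WDL type is optional
    multiple: bool  # True when the input accepts a list of values


def field_for(name: str, wdl_type: str) -> FormField:
    """Translate one WDL input declaration into a form field description."""
    t = wdl_type.strip()
    required = not t.endswith("?")    # a trailing '?' marks an optional input
    t = t.rstrip("?").rstrip("+")     # drop optional and non-empty markers
    multiple = t.startswith("Array[") and t.endswith("]")
    if multiple:
        t = t[len("Array["):-1].rstrip("?")  # unwrap one level of Array[...]
    # Types this sketch does not model (Map, Pair, structs, nested arrays)
    # fall back to a free-text box.
    return FormField(name, WIDGETS.get(t, "text"), required, multiple)


def build_form(inputs: dict[str, str]) -> list[FormField]:
    """Build one form field per declared workflow input, preserving order."""
    return [field_for(name, wdl_type) for name, wdl_type in inputs.items()]


# Illustrative input names, not taken from a real pipeline specification.
example_inputs = {
    "atac.title": "String?",
    "atac.paired_end": "Boolean",
    "atac.fastqs": "Array[File]",
}
form = build_form(example_inputs)
```

Deriving the form from the declared inputs rather than hand-writing it per pipeline is what would let the same interface serve any well-described pipeline, which is the pattern the project aims to establish.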
Public Health Relevance Statement (Narrative): Genomics is key to understanding biology and disease. This project will contribute to the understanding of how the genome works by expanding the capabilities of biomedical researchers to effectively analyze genomic data.