SBIR-STTR Award

Automated Data Transformations for Net-Centric Operations
Award last edited on: 4/7/2010

Sponsored Program
SBIR
Awarding Agency
DOD : AF
Total Award Amount
$849,981
Award Phase
2
Solicitation Topic Code
AF083-036
Principal Investigator
Steven N Minton

Company Information

Fetch Technologies (AKA: Connotate Solutions, Dynamic Domain)

841 Apollo Street Suite 400
El Segundo, CA 90245
   (310) 414-9849
   info@fetch.com
   www.fetch.com
Location: Single
Congr. District: 33
County: Los Angeles

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2009
Phase I Amount
$99,988
In this project, Fetch Technologies will design, prototype, and evaluate a new approach to transforming and normalizing data from multiple heterogeneous sources. In previous work, we developed and successfully commercialized a system for creating transformation pipelines. In a transformation pipeline, a new source (with its own unique schema) can be dropped into the pipeline, and as long as the source's data schema satisfies some very general constraints on the type of data present, the pipeline will successfully normalize data from that source. Our objective is to design the next generation of this system, which we call AutoTrans, to minimize the human effort necessary to build a robust transformation pipeline. In particular, through the use of machine learning techniques, the AutoTrans system will make it easier and more automatic to configure and modify a series of transformations. It will also provide actionable results even when the existing set of recognizers and mappings is incomplete. Finally, the system will be able to represent and reason about the correctness and fidelity of the transformed data.
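The abstract's central idea of a transformation pipeline can be sketched as follows. This is an illustrative toy, not Fetch's actual system: the recognizer, the target schema, and all function names here are hypothetical. It shows how content-based recognizers let records with differing field names from different sources be normalized to one schema without per-source programming.

```python
# Hypothetical sketch of a transformation pipeline: sources with different
# schemas are normalized to a common target schema ({"name", "date"}) by
# recognizing field values by their content rather than their field names.
from datetime import datetime


def recognize_date(value):
    """Try several common date formats; return ISO form if one matches."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d", "%d %b %Y"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def normalize_record(record):
    """Map a record with an unknown schema onto the target schema.
    Field names vary by source; a recognizer identifies dates by content,
    while a loose name-based mapping handles the title field."""
    out = {}
    for key, value in record.items():
        iso = recognize_date(str(value))
        if iso is not None:
            out["date"] = iso
        elif key.lower() in ("name", "title", "label"):
            out["name"] = value
    return out


# Two sources with different schemas, normalized by the same pipeline:
a = normalize_record({"title": "Report A", "posted": "03/15/2009"})
b = normalize_record({"label": "Report B", "created": "2009-03-15"})
```

A new source whose schema uses yet another field layout could be dropped in unchanged, so long as its values satisfy the general constraints the recognizers encode.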

Benefit:
The aim of this project is to create a transformation system that minimizes the human effort needed to aggregate data from multiple heterogeneous systems. Currently, integrating information from multiple domains and applications is technically challenging. Existing transformation design systems are difficult to use because the transformations generally must be designed by knowledgeable programmers, and they are often one-to-one mappings that must be modified or redesigned whenever a new data source needs to be integrated. Our approach represents an advance for data aggregation problems because it allows one to implement a data pipeline that can normalize data from a wide variety of sources without reprogramming. The new AutoTrans technology represents the next generation of this approach. It will markedly decrease the human time and skill level required to develop and maintain these pipelines, which in turn will produce a qualitative difference in how broadly this technology can be applied in commercial and military systems.

Keywords:
Information Integration, Machine Learning, Data Cleaning, Information Extraction

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
2010
Phase II Amount
$749,993
In this project, Fetch Technologies will implement and evaluate a new approach to transforming and normalizing data from multiple heterogeneous sources. In previous work, Fetch Technologies developed and successfully commercialized a system for creating "transformation pipelines". In a transformation pipeline, a new source (with its own unique schema) can be "dropped" into the pipeline, and as long as the source's data schema satisfies some very general constraints on the type of data present, the pipeline will successfully normalize data from that source. Our objective is to design the next generation of this system, called AutoTrans, to minimize the human effort necessary to build a robust transformation pipeline. In particular, through the use of machine learning techniques, the AutoTrans system will make it easier and more automatic to configure and modify a series of transformations. It will result in pipelines that are robust even when the sequence of transformations is potentially incomplete or there is uncertainty in the data.
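The Phase II abstract emphasizes robustness under uncertainty. One common way to realize that (a sketch under assumptions, not the AutoTrans design, whose internals are not described here) is to have each transformation step emit a confidence alongside its value and to propagate combined confidence through the pipeline, so downstream consumers can filter or flag low-fidelity output instead of failing:

```python
# Hypothetical confidence propagation: each step returns (value, confidence);
# the pipeline multiplies confidences so uncertain transformations degrade
# the output's fidelity score rather than breaking the pipeline.

def upper_step(value):
    return value.upper(), 1.0          # deterministic step: full confidence


def guess_country(value):
    table = {"USA": "United States", "US": "United States"}
    if value in table:
        return table[value], 0.95      # known mapping: high confidence
    return value, 0.3                  # unrecognized: pass through, flagged


def run_pipeline(value, steps):
    confidence = 1.0
    for step in steps:
        value, c = step(value)
        confidence *= c                # combined fidelity of the output
    return value, confidence


value, conf = run_pipeline("us", [upper_step, guess_country])
# "us" -> "US" -> "United States", with a combined confidence score
```

An unrecognized input such as "Canada" would still flow through, but with a low enough score that a consumer could route it for review, which is the practical payoff of a pipeline that tolerates incomplete transformation sequences.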

Keywords:
Machine Learning, Information Integration, Artificial Intelligence, Information Extraction, Normalization Of Data