Phase II Amount
$1,414,479
CCRi will enhance the utility of large-scale, geographically separated semantic datasets by developing distributed capabilities for Level 1 (entity resolution) and Level 2 (inference) data fusion. CCRi will extend the model training server developed in Phase I, together with the advanced techniques for concept extraction and visualization of large-scale semantic datasets developed during a parallel effort, to support streaming data, Map/Reduce model training, simplified model training, and integration with enriched data providers. During Phase II, CCRi will also investigate models that incorporate temporal information, which will enable advanced prediction of relationships and concepts over time. CCRi's primary focus in Phase II will be the fusion of multiple models trained independently on distinct data sources, which enables model sharing without full data sharing as well as fusion across clouds.
Benefit: The techniques investigated in Phase II, and the software that implements them, will allow insights obtained from automated analysis of isolated datasets to be combined across datasets. Level 1 (entity resolution) and Level 2 (inference) data fusion on large-scale semantic datasets, combined with advanced concept extraction and visualization, will enable operators to quickly obtain both a broad and a specific understanding of large relational datasets and to apply advanced predictive capabilities to this data.
Keywords: Distributed Architecture, Cloud Computing, Inference, Concept Extraction, Relational Learning, Entity Resolution, Data Fusion, Visualization