SBIR-STTR Award

SCOUT: Smart Communication of Unexpected Threats
Award last edited on: 11/13/2018

Sponsored Program
STTR
Awarding Agency
DOD : Navy
Total Award Amount
$79,993
Award Phase
1
Solicitation Topic Code
N16A-T020
Principal Investigator
Vivek Dhand

Company Information

Commonwealth Computer Research Inc (AKA: CCRi)

1422 Sachem Place Unit 1
Charlottesville, VA 22901
   (434) 977-0600
   info@ccri.com
   www.ccri.com

Research Institution

University of Virginia

Phase I

Contract Number: N00014-16-P-1029
Start Date: 7/11/2016    Completed: 5/10/2017
Phase I year
2016
Phase I Amount
$79,993
The Navy needs to fuse and distill time-stamped data sources as varied as overhead imagery and Twitter feeds into actionable intelligence such as alerts, on-demand reports about entities of interest, and search capabilities. To enable such analytics, it is effective to represent the entities present in these heterogeneous data sources as fixed-dimensional vectors (embeddings), the input format that many machine learning tools require. While the literature on embedding knowledge graphs abounds, little work has been done to learn time-dependent embeddings for entities present in heterogeneous timestamped knowledge graphs, whether to drive predictions of what an entity is likely to do next or to support anomaly detection and alerting. We propose to address this gap by using deep sequence embedding techniques borrowed from the Computer Vision and NLP communities. To make our fused embedding product intelligible and tangible, we propose a data-driven report generator capable of displaying relevant known and inferred attributes for all the types of entities present in the various data sources, with minimal setup costs or requirements. We also propose a method for identifying and ranking which raw data statements most contributed to a shift in an entity's embedding, making embedding changes concrete and understandable.
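
The idea of ranking raw data statements by their contribution to an embedding shift can be illustrated with a minimal sketch. The abstract does not describe the actual attribution method, so this example swaps in a simple leave-one-out comparison, and it assumes the entity embedding is just the mean of per-statement vectors, a stand-in for the deep sequence model. All function names and the data layout are hypothetical.

```python
import math

def mean_vector(vectors):
    """Toy entity embedding: the element-wise mean of statement vectors (an assumption)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_statements(old_statements, new_statements):
    """Rank new statements by how much each one contributes to the embedding shift.

    Both arguments are lists of (label, vector) pairs. A statement's score is the
    full shift minus the shift obtained when that statement is left out, so a
    larger score means the statement drove more of the change.
    """
    old_vecs = [v for _, v in old_statements]
    before = mean_vector(old_vecs)
    new_vecs = [v for _, v in new_statements]
    full_shift = distance(before, mean_vector(old_vecs + new_vecs))

    scores = []
    for i, (label, _) in enumerate(new_statements):
        kept = [v for j, v in enumerate(new_vecs) if j != i]
        shift_without = distance(before, mean_vector(old_vecs + kept))
        scores.append((label, full_shift - shift_without))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Hypothetical data: two earlier statements and two new ones for a single entity.
old = [("port call A", [0.0, 0.0]), ("port call B", [0.0, 0.0])]
new = [("routine position report", [0.0, 0.1]), ("transponder switched off", [5.0, 5.0])]
for label, score in rank_statements(old, new):
    print(f"{label}: {score:.3f}")
```

A statement with a negative score pulls the embedding back toward its previous position, so leaving it out would make the shift larger. With a learned sequence model in place of the mean, the same leave-one-out loop would require re-embedding the entity once per candidate statement.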

Benefit:
SCOUT goes beyond the capabilities of typical anomaly detection systems for two reasons: first, its robust temporal sequence representation captures temporal patterns over all timescales, any of which can trigger the detection of anomalous behavior; second, it can work with a wide range of data, from full motion video to the results of NLP analysis of Tweets. Combined, these two capabilities allow SCOUT to learn to expect a broader range of patterns jointly across a broader range of data sources. As temporal data becomes more pervasive with the commoditization of technology, the ability to extract more value from that data will have growing appeal. Military applications can include analysis of seagoing vessel behaviors to identify anomalies, as described in the technical volume of this proposal, as well as other scenarios in which the attributes and behavior of actors evolve over time in ways that gradually warrant more military attention. Commercial applications can include traditional anomaly detection use cases such as credit card fraud or suspicious attempts to access networked information resources, where this system could make use of all available data sources. This flexibility makes the system particularly well suited to domains where data sources are diverse, heterogeneous, and temporally grounded. For example, this system could track the behavior of different Internet of Things (IoT) devices from different vendors emitting different data, and detect suspicious patterns over time to help identify hacked systems.
Finally, the data-driven formatting of changes and entities of interest bridges the potential disparity of input data types to provide a consistent and comprehensive user interface. This interface can help analysts not only determine the presence of anomalous behavior but also understand the anomalies themselves, by characterizing where they deviate from the norm and identifying the raw data statements driving the perceived change in entity behavior.
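
As a rough illustration of detecting a departure from an entity's norm and characterizing where it deviates, the sketch below compares the latest movement of an embedding against the entity's typical step size and reports the dimensions that changed most. This is an assumed, simplified scoring rule (a z-score over step sizes), not the method used by SCOUT; the function names and sample values are hypothetical.

```python
import math
import statistics

def step_sizes(history):
    """Distances between consecutive embeddings in an entity's history."""
    return [math.dist(a, b) for a, b in zip(history, history[1:])]

def anomaly_score(history, latest):
    """Z-score of the latest step relative to the entity's past step sizes."""
    steps = step_sizes(history)
    mu = statistics.mean(steps)
    sigma = max(statistics.pstdev(steps), 1e-9)  # guard against zero spread
    return (math.dist(history[-1], latest) - mu) / sigma

def top_deviating_dimensions(previous, latest, k=1):
    """Indices of the k embedding dimensions that changed the most."""
    deltas = [abs(x - y) for x, y in zip(previous, latest)]
    return sorted(range(len(deltas)), key=lambda i: deltas[i], reverse=True)[:k]

# Hypothetical embedding history for one entity, drifting slowly over time.
history = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [0.3, 0.1]]
latest = [3.0, 4.0]
print("score:", round(anomaly_score(history, latest), 1))
print("most changed dimensions:", top_deviating_dimensions(history[-1], latest))
```

A threshold on the score (for example, flagging values above 3) would be a tuning choice that depends on the data, and raw embedding dimensions are generally not interpretable on their own, which is why the proposal pairs detection with a report that points back to raw data statements.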

Keywords:
embeddings, data fusion, Deep Learning, anomaly detection, representation learning, report generation, Temporal data, similarity search

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----