DARPA 2017 Methods for Signal Isolation Across Heterogeneous Document Sets

Detecting Systematic Intervention in Online Discourse in Order to Gain Insight into Government Priorities
Award last edited on: 5/16/2023

Awarding Agency

DOD : DARPA

Total Award Amount

$599,068

Award Phase

Solicitation Topic Code

SB162-003

Principal Investigator

Patrick Lam

Thresher Ventures LLC

841 Elm Street Suite 333
McLean, VA 22101

(703) 623-5590

info@thresher.io

www.thresher.io

Location: Single
Congr. District: 08
County: Fairfax

Phase I

Contract Number: W911NF-17-P-0003
Start Date: 12/23/2016 Completed: 6/30/2017

Phase I year

2017

Phase I Amount

$99,565

Text is a treasure trove of data for social, behavioral and economic researchers, for which the rapidly developing field of automated text analysis has produced numerous methods. However, the machine classifiers on which many of these methods depend degrade as data becomes noisier. Ad hoc approaches to reducing noise, such as using analyst-generated keywords to collect documents, can introduce bias. Drs. King, Lam and Roberts' (2016) work on computer-assisted keyword recommendations offered a new solution to this problem. Thresher used that work to build a multilingual keyword recommender (MKR) tool, which our government clients are using to reduce bias in data collection from relatively small sets of text. We propose to expand the applicability of the King et al. algorithm to larger datasets and longer documents by generating methods for isolating signal in text across heterogeneous document sets. If successful, we will reduce the need for arduous hand-coding of custom data-cleaning rules for document sets before machine classifiers are applied, and improve the precision and recall of the classifiers once applied. Results from this program will generate methods for Phase II that will serve as the basis for prototyping extensible software packages for deployment in MKR and other text analysis tools.

Phase II

Contract Number: W911NF-18-C-0010
Start Date: 1/30/2018 Completed: 1/31/2019

Phase II year

2018

Phase II Amount

$499,503

As authoritarian leaders take action to control online content, they leave behind an outline of their priorities and prove a willingness to act. When collected and analyzed at scale, online intervention can be used to assess the concerns of the intervening government across several policy categories, including military affairs and foreign policy. In our Phase I-equivalent research, we established a replicable methodology and designed a basic software architecture for identifying and collecting content removed by hand by foreign government censors. In Phase II we propose to develop a methodology for collecting official and unofficial fabricated online content in a foreign country that is correlated with real-world events. We will look to mine these content streams for policy-relevant information as well as regional patterns. The Phase II Option proposes extending at least one of the methodologies to collect manipulated content from a second country, one where our research indicates the government employs media manipulation methods similar to those of the first country studied.
