SBIR-STTR Award

Detecting Systematic Intervention in Online Discourse in Order to Gain Insight into Government Priorities
Award last edited on: 5/16/2023

Sponsored Program
SBIR
Awarding Agency
DOD : DARPA
Total Award Amount
$599,068
Award Phase
2
Solicitation Topic Code
SB162-003
Principal Investigator
Patrick Lam

Company Information

Thresher Ventures LLC

841 Elm Street Suite 333
McLean, VA 22101
   (703) 623-5590
   info@thresher.io
   www.thresher.io
Location: Single
Congr. District: 08
County: Fairfax

Phase I

Contract Number: W911NF-17-P-0003
Start Date: 12/23/2016    Completed: 6/30/2017
Phase I year
2017
Phase I Amount
$99,565
Text is a treasure trove of data for social, behavioral and economic researchers, for which the rapidly developing field of automated text analysis has produced numerous methods. However, the machine classifiers on which many of these methods depend degrade as data becomes noisier. Ad hoc approaches to reducing noise, such as using analyst-generated keywords to collect documents, can introduce bias. Drs. King, Lam and Roberts' (2016) work on computer-assisted keyword recommendations offered a new solution to this problem. Thresher used that work to build a multilingual keyword recommender (MKR) tool, which our government clients are using to reduce bias in data collection from relatively small sets of text. We propose to expand the applicability of the King et al. algorithm to larger datasets and longer documents by generating methods for isolating signal in text across heterogeneous document sets. If successful, we will reduce the need for arduous hand-coding of custom data-cleaning rules for document sets before machine classifiers are applied, and improve the precision and recall of the classifiers once applied. Results from this program will generate methods for Phase II that will serve as the basis for prototyping extensible software packages for deployment in MKR and other text analysis tools.

Phase II

Contract Number: W911NF-18-C-0010
Start Date: 1/30/2018    Completed: 1/31/2019
Phase II year
2018
Phase II Amount
$499,503
As authoritarian leaders take action to control online content, they leave behind an outline of their priorities and demonstrate a willingness to act. When collected and analyzed at scale, online intervention can be used to assess the concerns of the intervening government across several policy categories, including military affairs and foreign policy. In our Phase I-equivalent research, we established a replicable methodology and designed a basic software architecture for identifying and collecting content removed by hand by foreign government censors. In Phase II we propose to develop a methodology for collecting official and unofficial fabricated online content from a foreign country that is correlated with real-world events. We will look to mine these content streams for policy-relevant information as well as regional patterns. The Phase II Option proposes extending at least one of the methodologies for collecting manipulated content to a second country where our research indicates the government employs media manipulation methods similar to those of the first country studied.