DARPA 2023 Workable Hierarchical Impersonation using Reinforcement Learning (WHIRL)

Award last edited on: 4/28/2024

Awarding Agency

DOD : DARPA

Total Award Amount

$1,791,605

Award Phase

Solicitation Topic Code

HR0011SB20234-02

Principal Investigator

Brian Weigel

Cynnovative LLC

4075 Wilson Boulevard Suite 800
Arlington, VA 22203

(856) 630-8984

N/A

www.cynnovative.com

Location: Single
Congr. District: 08
County: Arlington

Phase I

Contract Number: 2023
Start Date: ---- Completed: 8/14/2023

Phase I year

2023

Phase I Amount

Direct to Phase II

Phase II

Contract Number: N/A
Start Date: 8/14/2026 Completed: 8/14/2023

Phase II year

2023

Phase II Amount

$1,791,604

Workable Hierarchical Impersonation using Reinforcement Learning (WHIRL) will generate realistic, artifact-free synthetic data at scale by using hierarchical reinforcement learning and a hypervisor to allow off-box execution of long-term goals and mid-term tasks, which are passed through a shim to the hypervisor that executes them on the intended host. Team Cynnovative will use hierarchical reinforcement learning to simulate user behavior at the level at which a real user operates: keyboard and mouse activity and observation of a monitor. By simulating on real hardware and executing off-box, WHIRL enables the collection of the generated synthetic data via any traditional means a WHIRL user desires without introducing any artifacts or biases. User persona research for SUP is a method of understanding the characteristics and behavior patterns of specific groups of users to gain insights into the motivations, goals, and needs of these user groups and inform the design and development of effective cybersecurity strategies. This research seeks to understand how users interact with network systems, applications, and data in order to design policies that enable a user to operate successfully while maintaining a robust security posture. The autonomous agent for WHIRL is rewarded for taking actions to achieve a goal, such as browsing the web, using an Excel sheet, or operating in a terminal. Feedback from the environment informs the agent how well it accomplishes the task. A fully trained agent can act as a defined synthetic user, enabling the generation and collection of robust datasets representative of realistic user behavior. Team Cynnovative's solution will operate on the raw pixels of a screen capture, which places reinforcement learning in a real-world domain with an observation space where it has succeeded in the past, effectively eliminating the simulation-to-real-world problem. Reinforcement learning can operate on pixel data and elicit realistic, emergent behaviors with ground truth.
WHIRL operates off-box on real hardware (via a hypervisor), enabling synthetic data to be collected the way data is normally collected. Users of the WHIRL system therefore do not need to learn how to collect data on another platform; they can leverage their existing knowledge and tools without having to filter for artifacts or biases.
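The layering described in the abstract (long-term goals, mid-term tasks, and low-level keyboard and mouse events passed through a shim) can be illustrated with a minimal sketch. This is not the WHIRL implementation: every name here (`InputEvent`, `RecordingShim`, `GOAL_TO_TASKS`, `TASK_TO_EVENTS`, `run_goal`) is hypothetical, the fixed lookup tables stand in for policies that a trained hierarchical reinforcement learning agent would learn, and the shim only records events rather than forwarding them to a hypervisor.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InputEvent:
    """A low-level keyboard or mouse event (hypothetical format)."""
    kind: str     # "key" or "mouse"
    payload: str  # e.g. "press:Enter" or "click:browser_icon"


class RecordingShim:
    """Stand-in for the shim: it records events instead of sending
    them to a hypervisor for execution on the intended host."""

    def __init__(self) -> None:
        self.sent: List[InputEvent] = []

    def send(self, event: InputEvent) -> None:
        self.sent.append(event)


# Assumed decompositions for illustration only; a trained agent would
# choose tasks and events from screen observations, not from tables.
GOAL_TO_TASKS: Dict[str, List[str]] = {
    "browse_web": ["open_browser", "visit_page"],
    "use_terminal": ["open_terminal", "run_command"],
}

TASK_TO_EVENTS: Dict[str, List[InputEvent]] = {
    "open_browser": [InputEvent("mouse", "click:browser_icon")],
    "visit_page": [
        InputEvent("key", "type:example.org"),
        InputEvent("key", "press:Enter"),
    ],
    "open_terminal": [InputEvent("key", "press:Ctrl+Alt+T")],
    "run_command": [
        InputEvent("key", "type:ls"),
        InputEvent("key", "press:Enter"),
    ],
}


def run_goal(goal: str, shim: RecordingShim) -> float:
    """Expand one long-term goal into tasks, send each task's events
    through the shim, and return a toy reward of 1.0 per task."""
    reward = 0.0
    for task in GOAL_TO_TASKS[goal]:
        for event in TASK_TO_EVENTS[task]:
            shim.send(event)
        reward += 1.0
    return reward
```

Calling `run_goal("browse_web", RecordingShim())` walks the goal down to individual input events. In a real system the reward would come from environment feedback rather than a per-task constant.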

SBIR-STTR Award
