Region Reduction
Award last edited on: 5/23/2023

Sponsored Program
Awarding Agency
Total Award Amount
Award Phase
Solicitation Topic Code
Principal Investigator
Michael Gormish

Company Information

Clarifai Inc

115 W 30th Street Room 601
New York, NY 10001
   (866) 464-7326
Location: Single
Congr. District: 12
County: New York

Phase I

Contract Number: HM047621C0036
Start Date: 4/22/2021    Completed: 2/2/2022
Phase I year
Phase I Amount
For this proposal, Clarifai would use its internal expertise in computer vision and deep learning to pursue a CNN-based approach to camera orientation estimation. Specifically, we would adapt existing Clarifai models to extract image features and use these features as input to a camera orientation regressor. The horizon filter information might be combined with information from the encoder to produce a joint embedding, which could inform the prediction of pitch, roll, and their associated uncertainties. Alternatively, image-based filtering could be used to generate camera pose, and this information could then be modulated by the horizon filter. Another item worth considering is the use of transformers in the network. Transformers are used in language processing to account for the fact that order matters (but without introducing the complexity of training networks with memory, e.g. LSTMs). Their use in image processing is growing and seems especially appropriate for the pose estimation problem. The Clarifai team has extensive experience implementing and training CNN-based computer vision models, understands the potential pitfalls that arise, and can diagnose issues when the networks are not converging properly.

Clarifai is the industry's leading independent artificial intelligence company headquartered in New York City with offices in San Francisco and Washington D.C. Founded in 2013 by Dr. Matthew Zeiler, a foremost expert in machine learning, Clarifai has been a market leader since winning the top five places in the image classification challenge at the ImageNet 2013 competition. Clarifai provides powerful AI and Machine Learning (ML) image and video recognition solutions, built on the most advanced machine learning platform, empowering businesses all over the world to build the next generation of intelligent applications.
Our team provides image and video recognition solutions to the Department of Defense, Intelligence Community, and commercial customers like West Elm, Trivago, and OpenTable. In November 2019, Clarifai was named a visionary leader in The Forrester New Wave: Computer Vision Platforms, Q4 2019 report. Also in November 2019, Clarifai was ranked the 71st Fastest Growing Company in North America on Deloitte’s 2019 Technology Fast 500.
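The Phase I abstract mentions predicting pitch and roll together with their associated uncertainties but does not say how the uncertainty would be learned. One common option, shown here purely as an assumption and not as Clarifai's method, is to have the regressor output a mean and a log-variance per angle and train it with a Gaussian negative log-likelihood. The function names and the dictionary layout below are hypothetical.

```python
import math


def gaussian_nll(pred_mean, pred_log_var, target):
    """Negative log-likelihood of `target` under a Gaussian with the
    predicted mean and log-variance (the constant term is dropped)."""
    squared_error = (target - pred_mean) ** 2
    return 0.5 * (pred_log_var + squared_error / math.exp(pred_log_var))


def orientation_loss(pred, target):
    """Sum the per-angle loss over pitch and roll.

    pred:   {"pitch": (mean_deg, log_var), "roll": (mean_deg, log_var)}
    target: {"pitch": true_deg, "roll": true_deg}
    Both layouts are illustrative assumptions.
    """
    total = 0.0
    for angle in ("pitch", "roll"):
        mean, log_var = pred[angle]
        total += gaussian_nll(mean, log_var, target[angle])
    return total
```

With this loss, a network that is wrong by a few degrees is penalized less if it also reports a larger variance, which is what lets the predicted variance act as an uncertainty estimate.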

Phase II

Contract Number: HM047622C0018
Start Date: 4/15/2022    Completed: 7/18/2023
Phase II year
Phase II Amount
The NGA program “GLIMPSE” leverages context and topography to geolocate imagery for further analysis. For this proposal, Clarifai intends to develop and deliver a deep-learning pipeline that reduces the geographical search space for an image, expediting analysis and reducing computational cost. The objective of this proposal is to provide a system to (a) efficiently identify relevant imagery and extract context and landmark features, and (b) support AI-assisted region reduction. Clarifai's commercial AI platform will provide the National Geospatial-Intelligence Agency with a computer vision capability to triage imagery, extract valuable contextual information, and support region reduction efforts. Clarifai will leverage foundational image classification models, combined with custom models trained on features identified by the MIT research project “Places365” and any newly identified relevant features, to extract scene information, context, and landmarks from an image. Clarifai’s workflow capability will enable the orchestration of these models along with custom logic operators to perform complex region reduction tasks based on results from multiple feature extraction models. A major challenge in developing an automated solution for region reduction is building the feature extraction capabilities and logic to mimic the complex human reasoning used for geolocating ground-level imagery. With current state-of-the-art deep learning capabilities, training a single model to directly classify the region would require extreme amounts of data and labelling for every region of interest. However, a human-AI combination approach could dramatically improve analyst efficiency and accuracy. Depending on the image and location, an analyst will use different combinations of contextual information to estimate the location of an image. For example, if a street sign or unique landmark is present in an urban environment, that may be the most important information.
In contrast, for imagery of a natural environment, scene and biome information may play a more important role. These two types of information are best determined in different ways. Scene and biome identification is generally a classification problem. Unique landmarks, on the other hand, are too numerous to train a classification model on; a similarity or clustering approach would be more effective for identifying them. Furthermore, fine-tuning additional classification models for specific tasks would allow the system to rapidly adapt to new mission requirements and features of interest (e.g. hurricane damage, weapons, etc.) to support region reduction and targeting of images for further analysis. By orchestrating models optimized for different tasks and developing logic-based analytic approaches based on reference data and subject matter expertise, an automated system could reduce the overall region search space and provide relevant information with associated confidence.
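The orchestration idea described above, combining a scene/biome classifier with a landmark similarity search under a logic rule, can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption made for illustration: the `Region` structure, the noisy-OR combination rule, the threshold, and the sample labels are not taken from the proposal or from Clarifai's platform.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Reference data for one candidate region (hypothetical layout)."""
    name: str
    biomes: set = field(default_factory=set)     # biome/scene labels seen there
    landmarks: set = field(default_factory=set)  # IDs of landmarks located there


def reduce_regions(regions, biome_probs, landmark_matches, keep_threshold=0.2):
    """Score each candidate region and keep those above the threshold.

    biome_probs:      {label: probability} from a scene/biome classifier
    landmark_matches: {landmark_id: similarity in [0, 1]} from a similarity search
    Returns [(region_name, confidence)] sorted by descending confidence.
    """
    kept = []
    for region in regions:
        # Classification evidence: probability mass on biomes the region contains.
        biome_score = min(1.0, sum(p for label, p in biome_probs.items()
                                   if label in region.biomes))
        # Similarity evidence: best match against landmarks the region contains.
        landmark_score = max((s for lm, s in landmark_matches.items()
                              if lm in region.landmarks), default=0.0)
        # Noisy-OR: either kind of evidence can raise the confidence.
        confidence = 1.0 - (1.0 - biome_score) * (1.0 - landmark_score)
        if confidence >= keep_threshold:
            kept.append((region.name, confidence))
    return sorted(kept, key=lambda item: item[1], reverse=True)


candidates = [
    Region("coastal_city", {"urban", "coast"}, {"lm_17"}),
    Region("inland_forest", {"forest"}),
    Region("desert_town", {"desert", "urban"}, {"lm_42"}),
]
ranked = reduce_regions(
    candidates,
    biome_probs={"urban": 0.7, "coast": 0.2, "forest": 0.05, "desert": 0.05},
    landmark_matches={"lm_17": 0.9},
)
```

In this toy run the forest region falls below the threshold and is dropped, while the region containing the matched landmark ranks first. A real system would replace the hand-written rule with logic informed by reference data and subject matter expertise, as the abstract describes.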