This SBIR Phase I project will study the feasibility of automated speaking assessment to help students improve their oral communication skills. According to a survey of human resource officials, only 25 percent of today's college graduates enter the workforce with well-developed speaking skills. As a result, many people are unable to effectively persuade an audience of their position, limiting their ability to sell new ideas and succeed in their jobs. The project will investigate novel research in speech technology that enables students to receive objective, personalized feedback at any time. By reinforcing communication skills through self-paced practice and feedback, users will be better prepared to perform job tasks and move into leadership roles. The completed system will initially be offered to the more than two million people in the United States who participate in public speaking training annually. Potential future applications for the technology include teacher assessments, call center monitoring, interview training, role playing, human resources assessments, patient care, services for the deaf, language learning, and student assessments.
This project will develop key concepts for automated public speaking assessment such that a student's vocal delivery can be objectively measured and presented in a manner that creates an independent, personalized learning experience. Linking listener perceptions to speech behaviors is a novel direction in automated speech assessment. Automated assessment has already been demonstrated for spoken language proficiency, leveraging automated speech recognition and semantic analysis. Automated voice assessment has also been utilized in lie detection and emotion detection, which focus on autonomic responses in the user's voice, such as when stress affects the vocal cords. The hypothesis behind this SBIR project is that software can help speakers consciously use and modify non-semantic speech behaviors to produce more desirable listener perceptions. The Phase I objectives are to identify key features of voice that can be used to predict audience perception and to develop initial software models that estimate aspects of audience perception. To achieve these objectives, a combination of expert feature enumeration, deep learning feature identification, and machine learning will be applied and iteratively tested against a large corpus of actor voices and human perception ratings.
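To illustrate the modeling step described above, the sketch below fits a simple linear model mapping non-semantic voice features to a listener perception rating. The feature names (pitch variability, speaking rate, pause ratio) and the synthetic data are illustrative assumptions, not the project's actual feature set or corpus; the real work would extract features from recorded speech and use human ratings.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200  # number of (synthetic) speech samples

# Hypothetical non-semantic delivery features, one row per sample.
pitch_variability = rng.uniform(0.0, 1.0, n)  # assumed: normalized pitch std
speaking_rate = rng.uniform(0.0, 1.0, n)      # assumed: normalized words/sec
pause_ratio = rng.uniform(0.0, 1.0, n)        # assumed: fraction of silence

# Design matrix with an intercept column.
X = np.column_stack([pitch_variability, speaking_rate, pause_ratio, np.ones(n)])

# Synthetic "ground truth" relationship to a perception rating (e.g. 1-5
# persuasiveness score), plus rater noise. Purely illustrative weights.
true_w = np.array([1.5, -0.8, -1.2, 3.0])
ratings = X @ true_w + rng.normal(0.0, 0.1, n)

# Fit the linear model by least squares.
w, *_ = np.linalg.lstsq(X, ratings, rcond=None)

# Predict the perceived rating for a new sample.
new_sample = np.array([0.6, 0.4, 0.2, 1.0])
predicted_rating = new_sample @ w
```

In practice the project proposes richer techniques (deep-learned features, iterative testing against actor voices), but the core pattern of mapping measurable vocal behaviors to predicted listener perceptions is the same.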