The University of Central Florida invention is a capsule network approach that enhances Visual Question Answering (VQA). The method reasons within a visual scene to recognize objects, actions, and relations more accurately. Most existing approaches rely on input feature maps from object detection models that are pretrained on the relevant object classes. This makes it necessary either to restrict the scope to known object classes or to annotate the regions of relevant objects. These approaches also require pretraining an object detector, which limits their extension to datasets that have object-level annotation. This work instead focuses on weakly-supervised visual grounding based on VQA supervision.

Stage of Development

Prototype available.

Benefit

Simplicity

Significantly better at answer localization

Increases a system’s explainability

Market Application

Explainable AI

Accessibility applications for visually impaired people

Evidence-based decision-making systems

Dialog-based systems

Publications

Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules. arXiv:2105.04836v1. IEEE Conference on Computer Vision and Pattern Recognition, 2021.

Self-Correcting and Self-Tracking Algorithm for Detection of Articulate Objects and Human Hand Gestures in Video Signals

Abstract

Researchers at the University of Central Florida have developed a novel hand tracking algorithm and system that eliminates the false positives of past video tracking systems. The ability to detect and track articulate objects such as human hands is important in human-computer interaction (HCI) and surveillance. Tracking these objects is therefore a well-covered area of computer vision research and has given rise to numerous tracking systems and algorithms. Existing algorithms share two common shortcomings: they return a high number of false positives, and they cannot recover if tracking fails in even one frame. In addition, current systems can track hands only in controlled environments and are limited in the number and type of gestures they can detect. Some algorithms also depend on predetermined information such as skin color. If the skin color in a scene is not in the device's repertoire, or if lighting conditions alter the color, the tracking system will not work.

Technical Details

The UCF invention eliminates the false positives of past video tracking systems and is also self-recovering, so it can recover from failed tracking. To recover or reconstruct lost frames, the algorithm can use information from frames in which the track was successful. Tracking is based on finger primitives rather than skin color, so success or failure does not depend on skin color. Such an algorithm could be incorporated into current computer vision systems to accurately track various objects and hand gestures without the pitfalls of currently used systems.
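The idea of reconstructing lost frames from successful ones can be illustrated with a minimal sketch. This is not the UCF algorithm, whose recovery procedure is not described here. It is a stand-in that assumes a track is a list of per-frame (x, y) positions with None marking a failed frame, and it fills each gap by linear interpolation between the nearest successful frames. The function name recover_track and the data layout are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def recover_track(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Fill failed frames (None) using the nearest successful frames.

    Assumption for illustration only: linear interpolation stands in for
    whatever reconstruction the actual invention performs.
    """
    good = [i for i, p in enumerate(track) if p is not None]
    filled = list(track)
    if not good:
        # No successful frame exists, so nothing can be reconstructed.
        return filled
    for i, p in enumerate(track):
        if p is not None:
            continue
        prev = max((g for g in good if g < i), default=None)
        nxt = min((g for g in good if g > i), default=None)
        if prev is not None and nxt is not None:
            # Gap between two successful frames: interpolate linearly.
            t = (i - prev) / (nxt - prev)
            (x0, y0), (x1, y1) = track[prev], track[nxt]
            filled[i] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        elif prev is not None:
            # Failure at the end of the sequence: hold the last good position.
            filled[i] = track[prev]
        else:
            # Failure at the start of the sequence: use the first good position.
            filled[i] = track[nxt]
    return filled


if __name__ == "__main__":
    print(recover_track([(0.0, 0.0), None, (4.0, 8.0)]))
    # prints [(0.0, 0.0), (2.0, 4.0), (4.0, 8.0)]
```

A real tracker would likely re-detect the hand from image evidence rather than interpolate positions. The sketch only shows how a single failed frame need not end the track when neighboring frames succeeded.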

Benefit

Can detect subtle hand gestures

System is self-initializing and self-correcting in order to prevent abrupt failure in diverse environments

Algorithm is not dependent on skin color and requires no predetermination

Well-suited for uncontrolled environments in which illumination and shadows can vary from one frame to the next

Algorithm works even while no movement is present

Market Application

Hand tracking and surveillance applications

User interface systems

Video gaming systems

Automated surveillance technologies

Niels Da Vitoria Lobo

COMPUTER SCIENCE - ACADEMIC INSTRUCTION | COLLEGE OF ENGINEERING & COMPUTER SCIENCE
