Research Terms
The University of Central Florida invention is a capsule network approach that enhances Visual Question Answering (VQA). The method applies reasoning within a visual scene to recognize objects, actions, and relations more accurately. Most existing approaches rely on input feature maps from object detection models that are pretrained on the relevant object classes. This makes it necessary either to restrict the scope to known object classes or to annotate the regions of relevant objects. These approaches also require pretraining an object detector, which limits them to datasets with object-level annotation. This work instead focuses on weakly-supervised visual grounding based on VQA supervision.
Stage of Development
Prototype available.
Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules. arXiv:2105.04836v1. IEEE Conference on Computer Vision and Pattern Recognition, 2021.
Researchers at the University of Central Florida have developed a novel hand tracking algorithm and system that eliminates the false positives of past video tracking systems. The ability to detect and track articulated objects such as human hands is important in human-computer interaction (HCI) and surveillance. Research on tracking these objects is therefore a well-covered area of computer vision and has given rise to numerous tracking systems and algorithms. Existing algorithms share two common shortcomings: they return a high number of false positives, and they cannot recover if tracking fails in even one frame. In addition, current systems can track hands only in controlled environments and are limited in the number and type of gestures they can detect. Another group of algorithms depends on predetermined information such as skin color. If the skin color in a scene is not in the device's repertoire, or if lighting conditions alter the color, the tracking system will not work.
Technical Details
The UCF invention eliminates the false positives of past video tracking systems and is also self-recovering, meaning it can recover from failed tracking. To recover or reconstruct lost frames, the algorithm can use information from frames in which the track was successful. Tracking in this algorithm is based on finger primitives rather than skin color, so success or failure does not depend on skin color. Such an algorithm could be incorporated into current computer vision systems to accurately track various objects and hand gestures without the pitfalls of currently used systems.
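The idea of recovering lost frames from frames in which the track succeeded can be illustrated with a minimal sketch. The listing does not say how the invention performs this recovery, so the sketch swaps in plain linear interpolation between the nearest successful frames as an assumption. The function name recover_track and the representation of a track as a list of (x, y) positions, with None marking a failed frame, are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def recover_track(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Fill failed frames (None) using the nearest successful frames.

    Assumption: linear interpolation stands in for the recovery step;
    the source does not specify the actual reconstruction method.
    """
    # Indices of frames in which tracking succeeded.
    good = [i for i, p in enumerate(track) if p is not None]
    if not good:
        return list(track)  # no successful frame to recover from

    out = list(track)
    for i, p in enumerate(track):
        if p is not None:
            continue
        prev = max((g for g in good if g < i), default=None)
        nxt = min((g for g in good if g > i), default=None)
        if prev is None:
            # Failure before the first success: copy the next good position.
            out[i] = track[nxt]
        elif nxt is None:
            # Failure after the last success: hold the last good position.
            out[i] = track[prev]
        else:
            # Failure between two successes: interpolate linearly.
            t = (i - prev) / (nxt - prev)
            (x0, y0), (x1, y1) = track[prev], track[nxt]
            out[i] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return out


if __name__ == "__main__":
    # The middle frame failed; its neighbors succeeded.
    print(recover_track([(0.0, 0.0), None, (2.0, 4.0)]))
```

A real tracker would likely use richer cues than position alone (for example, the finger primitives mentioned above), but the sketch shows why a single failed frame need not end the track as long as successful frames surround it.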