Abstract
The University of Central Florida invention is a low-cost system that significantly reduces the cost of annotating videos. Video activity detection normally requires annotations at every frame, which drastically increases labeling cost; the UCF invention instead learns from only a few annotations per video. An example application is preparing large-scale video datasets for analysis tasks such as video tracking, detection, and segmentation.
Technical Details: The invention’s Active Sparse Labeling (ASL) algorithm estimates the usefulness of each frame of a video, then suggests the frames and videos that will most improve dense video understanding tasks such as activity detection. Alongside the selection algorithm, the invention uses a unique Spatio-Temporal Weighted loss (STeW loss) to train video models on datasets with sparsely annotated frames. The invention works in two stages: first, it trains a deep-learning video model using very sparsely annotated frames; then it uses the trained model to select additional frames for annotation based on their utility value. When tested on the public benchmark datasets UCF-101 and J-HMDB, comprising more than 400,000 frames, the invention reduced annotation cost by 90 percent, learning action detection from only 10 percent of annotated video frames.
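To make the two-stage loop concrete, the sketch below is a minimal illustration, not the invention’s actual implementation. It assumes an entropy-based per-frame utility score and a simple spacing rule to reduce redundant selections; the names frame_utility and select_frames are hypothetical, and the actual ASL utility measure and STeW loss are defined in the NeurIPS 2022 publication listed below.

```python
import numpy as np

def frame_utility(pred_probs: np.ndarray) -> np.ndarray:
    """Hypothetical per-frame utility: entropy of the model's per-frame
    class probabilities (T, C) -> (T,), so uncertain frames score higher."""
    eps = 1e-12
    return -(pred_probs * np.log(pred_probs + eps)).sum(axis=1)

def select_frames(pred_probs, labeled, budget, min_gap=8):
    """Stage two of the loop: pick up to `budget` unlabeled frames with
    the highest utility, skipping frames within `min_gap` of an already
    chosen frame to avoid redundant, near-duplicate selections."""
    utility = frame_utility(pred_probs)
    chosen = []
    for t in np.argsort(-utility):          # frames in descending utility
        if labeled[t]:
            continue
        if any(abs(int(t) - c) < min_gap for c in chosen):
            continue
        chosen.append(int(t))
        if len(chosen) == budget:
            break
    return chosen

# Toy usage: a 100-frame video, 3 activity classes, 5 frames labeled so far.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=100)  # stand-in for model outputs
labeled = np.zeros(100, dtype=bool)
labeled[::20] = True                         # initial sparse annotations
print(select_frames(probs, labeled, budget=5))
```

In the full system, the newly selected frames would be annotated and the model retrained with the sparse-annotation loss, repeating until the labeling budget is spent.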
Partnering Opportunity: The research team is seeking partners for licensing, research collaboration, or both.
Stage of Development: Prototype available.
Benefit
- Can save annotation cost by 90 percent
- Trains deep-learning models using less than 10 percent of the annotated video dataset
- Reduces redundancy in the selected frames, thus reducing the overall annotation cost by selecting fewer high-utility frames
Market Application
Companies working in big data collection and data labeling of videos
Publications
Are All Frames Equal? Active Sparse Labeling for Video Action Detection, 36th Conference on Neural Information Processing Systems (NeurIPS 2022).