Research Terms
Researchers at the University of Central Florida and the University of South Carolina have developed a fully autonomous, drone-based inspection system for railroad tracks. Rather than relying on GPS, the system uses optical images captured by the onboard drone camera to identify the railroad track, navigate the drone along it, and perform inspection tasks. The captured track images are processed both to provide navigation information for autonomous flight control and to evaluate the health of track components.
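A minimal sketch of the vision-based guidance idea, under simplifying assumptions (this is illustrative, not the researchers' implementation): locate the two rails in a grayscale frame, treating them as bright vertical stripes, and derive a lateral offset the flight controller could use to stay centered over the track.

```python
import numpy as np

def rail_offset(frame: np.ndarray) -> float:
    """Estimate the drone's lateral offset from the track centerline.

    frame: 2D grayscale image in which the rails appear as roughly
    vertical bright stripes (a simplifying assumption for this sketch).
    Returns the offset of the track center from the image center,
    in pixels (positive = track lies to the right of image center).
    """
    # Sum brightness down each column; the rails show up as two peaks.
    column_profile = frame.sum(axis=0)
    # Take the two brightest columns as the left and right rails.
    rail_cols = np.sort(np.argsort(column_profile)[-2:])
    track_center = rail_cols.mean()
    image_center = (frame.shape[1] - 1) / 2.0
    return float(track_center - image_center)

# Synthetic 60x100 frame with rails at columns 30 and 70.
frame = np.zeros((60, 100))
frame[:, 30] = 255.0
frame[:, 70] = 255.0
offset = rail_offset(frame)  # track center at column 50
```

In practice the controller would run a loop of this form each frame, converting the pixel offset into a roll/yaw correction; a real system would use a more robust line detector than a column-brightness peak.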
The University of Central Florida invention is a privacy-preserving action recognition system. Its novel training framework removes private information from input video in a self-supervised manner, without requiring privacy labels. Leakage of visually private information is an emerging concern for fast-growing video-understanding applications such as activity recognition. Existing approaches to mitigating privacy leakage in action recognition require privacy labels in addition to the action labels for the video dataset; however, annotating video frames with privacy labels is not feasible at scale. Recent developments in self-supervised learning (SSL) have unlocked the untapped potential of unlabeled data.
Technical Details
The UCF training framework consists of three main components: an anonymization function, a self-supervised privacy removal branch, and an action recognition branch. Researchers trained the framework with a minimax optimization strategy that minimizes the action recognition cost function while maximizing the privacy cost function, the latter expressed through a contrastive self-supervised loss. Evaluated under existing protocols with known action and privacy attributes, the framework achieves an action-privacy trade-off competitive with current state-of-the-art supervised methods. In addition, the invention introduces a new protocol to evaluate how well the anonymization function generalizes to novel action and privacy attributes; under this protocol, the self-supervised framework outperforms existing supervised methods. Code is available at https://github.com/DAVEISHAN/SPAct.
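The minimax structure can be sketched as follows. This is a hedged illustration, not the SPAct code: it uses a SimCLR-style NT-Xent loss as the contrastive self-supervised privacy loss (the paper's exact loss and weighting may differ), and the anonymizer's objective subtracts the privacy loss so that minimizing it for the anonymizer maximizes the privacy branch's loss.

```python
import numpy as np

def nt_xent(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.5) -> float:
    """SimCLR-style NT-Xent contrastive loss between two batches of
    embeddings (two augmented views per sample). Illustrative stand-in
    for the self-supervised privacy loss."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                    # mask self-similarity
    # The positive for sample i is its other view at index i + n (mod 2n).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(2 * n), pos] - logsumexp)
    return float(loss.mean())

def anonymizer_objective(action_loss: float, privacy_ssl_loss: float,
                         weight: float = 1.0) -> float:
    """Minimax objective for the anonymization function: keep action
    recognition accurate (minimize action_loss) while *maximizing* the
    self-supervised privacy loss, i.e. making privacy features useless."""
    return action_loss - weight * privacy_ssl_loss

# Toy embeddings standing in for privacy-branch features of two views.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(4, 8))
z2 = z1 + 0.01 * rng.normal(size=(4, 8))
privacy_loss = nt_xent(z1, z2)
```

A training step would backpropagate `anonymizer_objective` into the anonymization function only, while the action and privacy branches are updated on their own losses in alternation.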
Partnering Opportunity
The research team is seeking partners for licensing, research collaboration, or both.
Stage of Development
Prototype available.
SPAct: Self-supervised Privacy Preservation for Action Recognition. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, June 18-24, 2022, pp. 20132-20141. DOI: 10.1109/CVPR52688.2022.01953.
The University of Central Florida TransGeo invention is a pure transformer-based approach to cross-view image geo-localization that takes full advantage of transformers' strengths in global information modeling and explicit position encoding. In contrast, the dominant convolutional neural network (CNN)-based methods for this task rely on a polar transform and fail to model global correlation.
The UCF TransGeo approach addresses these limitations from a different perspective. The invention leverages the flexibility of transformer input with an attention-guided, non-uniform cropping method that removes uninformative image patches, reducing computation cost with a negligible drop in performance. The saved computation can be reallocated to increase resolution only for informative patches, improving performance at no additional computation cost. This "attend and zoom-in" strategy closely resembles how humans observe images. TransGeo achieves state-of-the-art results on both urban and rural datasets with significantly less computation than CNN-based methods; it does not rely on the polar transform and runs inference faster than CNN-based methods. Code is available at https://github.com/Jeff-Zilence/TransGeo2022.
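The patch-pruning step of the "attend and zoom-in" strategy can be sketched as follows. This is a hedged illustration (the paper's exact selection rule and score source may differ): given one attention score per patch, e.g. the class token's attention averaged over heads, keep only the top-scoring fraction of patch tokens and discard the rest, so the saved compute can go toward higher resolution on the kept patches.

```python
import numpy as np

def prune_patches(patch_tokens: np.ndarray, attention: np.ndarray,
                  keep_ratio: float = 0.64):
    """Attention-guided non-uniform cropping, sketched.

    patch_tokens: (num_patches, dim) token embeddings.
    attention:    (num_patches,) importance score per patch.
    Returns the kept tokens and their original indices, preserving
    spatial order so position embeddings still line up.
    """
    k = max(1, int(round(keep_ratio * len(attention))))
    keep = np.sort(np.argsort(attention)[-k:])  # top-k, original order
    return patch_tokens[keep], keep

# 10 toy patches of dimension 1 with hand-picked attention scores.
tokens = np.arange(10, dtype=float)[:, None]
attn = np.array([.01, .2, .05, .3, .02, .1, .15, .03, .09, .05])
pruned, idx = prune_patches(tokens, attn, keep_ratio=0.5)
```

Because transformers accept variable-length token sequences, the pruned set can be fed back through the encoder directly, which is what makes this cropping "non-uniform" compared with a fixed rectangular crop.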
Partnering Opportunity
The research team is seeking partners for licensing, research collaboration, or both.
Stage of Development
Prototype available.
TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1152-1161. DOI: 10.1109/CVPR52688.2022.00123.
The University of Central Florida invention, Spectral Shift Aware Video Editing (SAVE), fine-tunes the spectral shift of the parameter space, significantly reducing the number of trainable parameters and improving computational efficiency. The UCF method includes a novel text-guided video editing framework and a spectral shift regularizer to capture motion information and preserve scene generation capability. It also incorporates frame attention for spatial and temporal consistency.
Text-to-Image (T2I) diffusion models have achieved remarkable success in synthesizing high-quality images conditioned on text prompts. Recent methods have tried to replicate this success by either training text-to-video (T2V) models on large collections of text-video pairs or adapting T2I models to individual text-video pairs. Although the latter is computationally less expensive, per-video adaptation still takes significant time. To address this issue, the UCF spectral-shift-aware adaptation framework fine-tunes the spectral shift of the parameter space instead of the parameters themselves.
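The core parameter-efficiency idea can be sketched as follows, under the assumption (illustrative, not taken from the SAVE implementation) that "spectral shift" means decomposing a pretrained weight matrix W = U diag(s) V^T and fine-tuning only an additive shift delta on the singular values s, leaving U and V frozen.

```python
import numpy as np

def apply_spectral_shift(W: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """Adapt a pretrained weight matrix by shifting only its singular
    values (sketch of the spectral-shift idea; the actual SAVE
    parameterization and regularizer may differ).

    W:     (m, n) frozen pretrained weights.
    delta: (min(m, n),) trainable shift on the singular values.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(s + delta) @ Vt

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))          # 12 frozen weights
delta = np.array([0.1, -0.05, 0.0])  # only 3 trainable numbers
W_adapted = apply_spectral_shift(W, delta)
```

This is why the trainable parameter count drops sharply: per matrix, only min(m, n) shift values are learned instead of m x n weights, and a regularizer on delta (as in the UCF method) keeps the adapted model close to the pretrained one so scene generation capability is preserved.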
Partnering Opportunity
The research team is seeking partners for licensing, research collaboration, or both.
Stage of Development
Prototype available.
SAVE: Spectral-Shift-Aware Adaptation of Image Diffusion Models for Text-driven Video Editing. arXiv:2305.18670 [cs.CV], v1 submitted May 30, 2023; v2 revised December 1, 2023. DOI: 10.48550/arXiv.2305.18670.