Abstract
The University of Central Florida TransGeo invention is a pure transformer-based approach to cross-view image geo-localization that takes full advantage of the transformer's strengths in global information modeling and explicit position encoding. The dominant Convolutional Neural Network (CNN)-based methods for this task rely on the polar transform and fail to model global correlation.
In contrast, the UCF TransGeo approach addresses these limitations from a different perspective. The invention leverages the flexibility of transformer input and offers an attention-guided, non-uniform cropping method that removes uninformative image patches, reducing computation cost with a negligible drop in performance. The saved computation can be reallocated to increase resolution only for informative patches, improving performance at no additional computation cost. This "attend and zoom-in" strategy closely resembles how humans observe images. TransGeo achieves state-of-the-art results on both urban and rural datasets at significantly lower computation cost than CNN-based methods. It does not rely on the polar transform and runs inference faster than CNN-based methods. Code is available at https://github.com/Jeff-Zilence/TransGeo2022.
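The attention-guided cropping idea described above can be illustrated with a minimal sketch. This is not the TransGeo implementation; the function name `attention_guided_crop`, the `keep_ratio` parameter, and its default value are hypothetical and chosen only for illustration. The sketch assumes each image patch already has an embedding and a scalar attention score, and it simply keeps the highest-scoring patches along with their original indices so that position information is not lost.

```python
import numpy as np


def attention_guided_crop(tokens, attention, keep_ratio=0.5):
    """Keep the patches with the highest attention scores.

    Illustrative sketch only (hypothetical helper, not the TransGeo code).

    tokens:     (N, D) array of patch embeddings.
    attention:  (N,) array with one attention score per patch.
    keep_ratio: fraction of patches to keep (assumed value, not from the source).

    Returns the kept tokens and their original patch indices in ascending
    order, so the matching position embeddings can still be looked up.
    """
    tokens = np.asarray(tokens)
    attention = np.asarray(attention)
    if tokens.shape[0] != attention.shape[0]:
        raise ValueError("tokens and attention must have the same length")

    # Number of patches to keep; always keep at least one.
    n_keep = max(1, int(round(keep_ratio * tokens.shape[0])))

    # Indices of the n_keep largest attention scores.
    top = np.argsort(attention)[-n_keep:]

    # Restore the original spatial ordering of the surviving patches.
    kept_idx = np.sort(top)
    return tokens[kept_idx], kept_idx


# Usage: six patches with 2-dimensional embeddings, keep the top half.
tokens = np.arange(12).reshape(6, 2)
attention = np.array([0.1, 0.9, 0.3, 0.8, 0.05, 0.7])
kept_tokens, kept_idx = attention_guided_crop(tokens, attention, keep_ratio=0.5)
```

Because a transformer accepts a variable-length sequence of patches, the surviving patches can be fed to the next stage directly, and the freed budget could in principle be spent on a higher-resolution version of the same regions.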
Partnering Opportunity
The research team is seeking partners for licensing, research collaboration, or both.
Stage of Development
Prototype available.
Benefit
- Achieves state-of-the-art results on both urban and rural datasets
- Offers better accuracy with less computational cost
- Does not rely on a predefined transformation (polar transform) and infers faster than CNN-based methods
Market Application
- Noisy-GPS refinement
- Navigation
- Augmented Reality (AR)
Publications
TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), proceedings pages 1152-1161. DOI: 10.1109/CVPR52688.2022.00123.