Abstract
This UCF invention introduces Kolmogorov-Arnold Attention
(KArAt), a groundbreaking alternative to the conventional softmax-based
attention in Transformer architectures. KArAt empowers models to learn
attention dynamically using user-defined basis functions, unlocking flexibility
and interpretability. In an era where AI systems demand adaptability for
complex tasks, this innovation addresses a critical limitation of fixed
attention mechanisms and sets the stage for next-generation Transformers in
vision and multimodal domains.
Detailed Description
KArAt replaces the fixed softmax activation in multi-head
self-attention with learnable functions derived from Kolmogorov-Arnold Networks
(KANs). It supports any basis (e.g., Fourier, wavelets) and uses low-rank
operator approximations to reduce computational overhead. Configurations
include blockwise mode (distinct operators per encoder block) and universal
mode (shared operators across blocks). Fourier-KArAt variants demonstrate
improved accuracy on CIFAR-10 and CIFAR-100 benchmarks, with competitive performance
on ImageNet-1K. Future work targets scaling for large multimodal models.
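The brochure describes the mechanism only in prose, so the following is a minimal PyTorch sketch of one possible reading: a single attention head whose softmax is replaced by a learnable Fourier-basis activation applied through a low-rank down/up projection. All names here (FourierLearnableActivation, KArAtHead), the initialization, and the row renormalization are illustrative assumptions, not the invention's actual implementation.

```python
import torch
import torch.nn as nn


class FourierLearnableActivation(nn.Module):
    """Learnable element-wise activation phi(x) = sum_k a_k*sin(k*x) + b_k*cos(k*x),
    a minimal stand-in for a KAN-style unit built on a Fourier basis."""

    def __init__(self, num_frequencies: int = 3):
        super().__init__()
        # Small random init so the operator is nontrivial from the start (assumption).
        self.a = nn.Parameter(0.1 * torch.randn(num_frequencies))
        self.b = nn.Parameter(0.1 * torch.randn(num_frequencies))
        self.register_buffer(
            "freqs", torch.arange(1, num_frequencies + 1, dtype=torch.float32)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        xk = x.unsqueeze(-1) * self.freqs  # (..., K) frequency-expanded inputs
        return (torch.sin(xk) * self.a + torch.cos(xk) * self.b).sum(dim=-1)


class KArAtHead(nn.Module):
    """One self-attention head in which the usual softmax over score rows is
    replaced by a learnable operator: down-project each N-dim row to rank r,
    apply the Fourier activation, project back up, then renormalize the row."""

    def __init__(self, dim: int, seq_len: int, rank: int = 16, num_frequencies: int = 3):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.scale = dim ** -0.5
        # Low-rank operator approximation: learn maps N -> r and r -> N
        # instead of a full N x N operator, cutting compute and parameters.
        self.down = nn.Linear(seq_len, rank, bias=False)
        self.up = nn.Linear(rank, seq_len, bias=False)
        self.phi = FourierLearnableActivation(num_frequencies)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, dim); scores: (batch, N, N)
        scores = (self.q(x) @ self.k(x).transpose(-2, -1)) * self.scale
        rows = self.up(self.phi(self.down(scores)))  # learnable row transform
        # Assumed normalization so each row behaves like attention weights;
        # the normalization actually used by KArAt may differ.
        weights = rows.abs()
        weights = weights / weights.sum(dim=-1, keepdim=True).clamp_min(1e-6)
        return weights @ self.v(x)


# Hypothetical usage: 197 tokens, as in a ViT-B/16 on 224x224 input (196 patches + [CLS]).
head = KArAtHead(dim=64, seq_len=197)
out = head(torch.randn(2, 197, 64))  # -> (2, 197, 64)
```

Under this reading, blockwise mode would instantiate a distinct KArAtHead (its own phi, down, and up) in every encoder block, while universal mode would share a single such operator across all blocks.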
Benefit
- Dynamic Adaptability: Learns attention functions tailored to data complexity.
- Enhanced Interpretability: Provides insights into learned activation patterns.
- Broad Applicability: Compatible with vision, language, and multimodal Transformers.
- Performance Gains: Outperforms vanilla ViTs on small-scale datasets.
Market Application
- Large Language Models: Improves adaptability in generative AI systems.
- Computer Vision: Enhances image classification, detection, and segmentation tasks.
- Autonomous Systems: Supports dynamic decision-making in robotics and self-driving cars.
- Healthcare AI: Enables flexible modeling for medical imaging and diagnostics.