
Starred repositories
🔥 🔥 🔥 [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies
Vision Transformer-based Dual-Stream Self-Supervised Pretraining Networks for Retinal OCT Classification
Code for our paper: TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection
MedViTV2: Medical Image Classification with KAN-Integrated Transformers and Dilated Neighborhood Attention
Implementation of the Spline-Based Transformer proposed by Disney Research
Transformers 3rd Edition
Development of Vision Transformer (ViT) networks for multi-class image classification.
A paper list of some recent Transformer-based CV works.
Recent Transformer-based CV and related works.
A comprehensive paper list of Transformer & Attention for Vision Recognition / Foundation Model, including papers, codes, and related websites.
Based on our paper "Implementing vision transformer for classifying 2D biomedical images" published in Scientific Reports (Nature)
Variants of Vision Transformer and its downstream tasks
Official implementation of Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting
[CVPR 2024] SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design
Official implementation of the paper FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization (ACM MM 2024).
A PyTorch implementation of the Transformer model in "Attention is All You Need".
A Spatio-temporal Transformer for 3D Human Motion Prediction
Graph-Aware Attention for Adaptive Dynamics in Transformers
[ECCVW 2022] The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"
Gabor and Laplacian of Gaussian Convolutional Swin Network (GLoG-CSUnet), a novel architecture enhancing Transformer-based models by incorporating learnable radiomics features
A Light-weight and Multi-scale Network for Medical Image Segmentation