-
LLaVolta Public
[NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression
-
3D-TransUNet Public
This is the official repository for the paper "3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers"
-
TransUNet Public
This repository includes the official project of TransUNet, presented in our paper: TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation.
-
ViTamin Public
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
-
pytorch-image-models Public
Forked from huggingface/pytorch-image-models. PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeX…
-
open_clip Public
Forked from mlfoundations/open_clip. An open source implementation of CLIP.
Jupyter Notebook, Other, Updated May 5, 2024
-
TransMix Public
[CVPR 2022] This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.