Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,856 692 Updated Mar 3, 2025

mct10 / RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 169 12 Updated Jul 12, 2024

ZhangXInFD / SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Python 537 49 Updated Jun 9, 2024

NVlabs / GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

Python 755 52 Updated May 10, 2022

rosinality / vq-vae-2-pytorch

Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch

Python 1,704 278 Updated Feb 15, 2023

OpenGVLab / LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,841 380 Updated Mar 14, 2024

lstrgar / self-supervised-phone-segmentation

Phoneme segmentation using pre-trained speech models

Python 55 10 Updated Nov 4, 2022

xinjli / alqalign

multilingual speech aligner

Python 72 5 Updated Nov 19, 2023

YuanGongND / uavm

Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".

Python 54 3 Updated Apr 20, 2023

kamperh / vqwordseg

Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.

Jupyter Notebook 36 8 Updated Mar 4, 2024

huckiyang / awesome-neural-reprogramming-prompting

A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022

Python 36 Updated Nov 30, 2023

jasonppy / word-discovery

Word Discovery in Visually Grounded, Self-Supervised Speech Models

Jupyter Notebook 26 7 Updated Dec 4, 2023

lucidrains / n-grammer-pytorch

Implementation of N-Grammer, augmenting Transformers with latent n-grams, in PyTorch

Python 73 1 Updated Dec 4, 2022

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 78,963 9,475 Updated Jan 4, 2025

KaosEngineer / structured-uncertainty

Python 10 1 Updated Sep 1, 2021

desh2608 / gmm-hmm-asr

Python implementation of simple GMM and HMM models for isolated digit recognition.

Python 63 22 Updated Feb 7, 2021

zhaoyanpeng / xcfg

X (weighted / probabilistic) Context-Free Grammars

Python 25 2 Updated Jan 30, 2024

my-yy / s2v_rc

Speech2Vec Reality Check

Python 82 4 Updated Feb 21, 2023

zhaoyanpeng / cpcfg

Fast and Modularized CFG-focused Models

Python 23 1 Updated Nov 8, 2023

rclone / rclone

"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files

Go 49,527 4,413 Updated Mar 26, 2025

prasmussen / gdrive

Google Drive CLI Client

Go 8,975 1,184 Updated Apr 19, 2023

Cheng-I Jeff Lai jefflai108

Stars

umbertocappellazzo / Llama-AVSR

Aria-K-Alethia / BigCodec

Stability-AI / stable-audio-tools

facebookresearch / MovieGenBench

xi-j / Mamba-ASR

LTH14 / mar

bytedance / SALMONN

mhamilton723 / DenseAV

kylebgorman / syllabify

open-mmlab / Amphion