- Korea University
- Seoul/Korea
- https://laetokang.tistory.com/
Stars
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
Daily backup of GitHub trending repositories.
Curated list of useful LLM / Analytics / Datascience resources
PyTorch implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Anthropic's educational courses
Open-Sora: Democratizing Efficient Video Production for All
Repository for the Paper "Multi-LoRA Composition for Image Generation"
PALLAIDIUM - a generative AI movie studio integrated in the Blender Video Editor.
[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Stable Video Diffusion Training Code and Extensions.
A repository created to help people who are new to machine learning or preparing a study group.
[CSUR] A Survey on Video Diffusion Models
Enjoy the magic of Diffusion models!
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
This repository contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
Making large AI models cheaper, faster and more accessible
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
A comprehensive list of awesome contrastive self-supervised learning papers.
Towards hot directions in industrial end-to-end speech recognition
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS…
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, L…
Speech-to-text server framework with next-gen Kaldi