Starred repositories
NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users
A Conversational Speech Generation Model
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
"Your Fully-Automated Personal AI Assistant, and Open-Source & Cost-Efficient Alternative to OpenAI's Deep Research"
Make websites accessible for AI agents
Automate browser-based workflows with LLMs and Computer Vision
A small robot designed for RL-based locomotion training
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Speech To Speech: an effort toward an open-source, modular GPT-4o
Solve Visual Understanding with Reinforced VLMs
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
Two conversational AI agents switching from English to a sound-level protocol after confirming they are both AI agents
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personali…
✨ An easy-to-use multi-platform LLM chatbot and development framework ✨ Supports QQ, QQ Channels, Telegram, WeChat, WeCom, and Feishu | MCP servers, OpenAI, DeepSeek, Gemini, SiliconFlow, Moonshot AI, Ollama, OneAPI, Dify, and more. Ships with a WebUI.
Open-sourced code for "HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit".
Wan: Open and Advanced Large-Scale Video Generative Models
A fully open-source, low-cost bipedal humanoid robot built for about ¥20,000 (≈$3,000)
A simple screen-parsing tool toward a purely vision-based GUI agent
Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Action Chunking Transformers with In-the-Wild Learning Framework