Stars
MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark
Machine Learning Engineering Open Book
verl: Volcano Engine Reinforcement Learning for LLMs
AIR: Complex Instruction Generation via Automatic Iterative Refinement
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
aider is AI pair programming in your terminal
This repository collects an extensive list of awesome papers about Story Generation / Storytelling, primarily focusing on the era of Large Language Models (LLMs).
[EMNLP2024] Aligning Large Language Models on Information Extraction
mathsyouth / awesome-text-summarization
Forked from lipiji/App-DL
A curated list of resources dedicated to text summarization
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
A curated list of Large Language Model (LLM) Interpretability resources.
A curated list of awesome responsible machine learning resources.
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
Reading list on hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
Must-read Papers on Knowledge Editing for Large Language Models.
A comprehensive, unified and modular event extraction toolkit.
This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.