This repository accompanies the YouTube tutorial demonstrating how to run a local large language model (LLM) using Hugging Face Transformers. The provided Python script implements a complete workflow for chat-style interactions with a locally stored model.
## Table of Contents

- Features
- Prerequisites
- Installation
- Usage
- Code Structure
- Configuration
- Contributing
- License
- Resources
## Features

- **Local Model Loading**: Load pretrained causal language models from local checkpoints
- **Chat Template Formatting**: Supports conversational message formatting
- **Smart Tokenization**: Includes attention masking and device allocation
- **Controlled Generation**: Configurable parameters:
  - Temperature
  - Top-p sampling
  - Max new tokens
- **Cross-Platform Support**: Works on CPU (CUDA supported for GPU acceleration)

See the sketch after this list for how these pieces fit together in code.
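The snippet below is a minimal, illustrative sketch of that workflow written against the Hugging Face Transformers API. The checkpoint path, example message, and sampling values are assumptions for illustration, not the repository's exact settings.

```python
# Minimal sketch of the chat workflow (illustrative; paths and parameter
# values are assumptions, not the repository's exact configuration).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./SmolLM2-1.7B-Instruct"  # hypothetical local checkpoint directory
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).to(device)

# Format a chat-style conversation with the model's chat template
messages = [{"role": "user", "content": "Explain what a tokenizer does."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Tokenize with an explicit attention mask and move tensors to the target device
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Controlled generation: temperature, top-p sampling, and a cap on new tokens
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens
new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

The same code runs unchanged on CPU; it simply selects a CUDA device automatically when one is available, which speeds up generation considerably.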
## Prerequisites

- Python 3.7+
- PyTorch (CPU/CUDA version)
- Hugging Face Transformers
- 8GB+ RAM (16GB recommended)
## Installation

```bash
# Clone repository
git clone https://github.com/portalbh/SmolLM2.git
cd SmolLM2

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows

# Install dependencies
pip install torch transformers
```
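The command above installs a working CPU build of PyTorch on most platforms; for a CUDA-enabled build, use the install selector on pytorch.org. The optional check below (not part of the repository's script) confirms the install and reports whether a GPU is visible.

```python
# Optional sanity check (illustrative): verify the install and report the device
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```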