A repository containing a beta implementation of SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining, accepted at NeurIPS 2024. The preprint is available at http://arxiv.org/abs/2406.02214.
The main idea is to re-parameterize each linear layer with low-rank and sparse factors for improved parameter and memory efficiency:
W = BA + S,
where B and A model the low-rank component and S models the sparse component. The sparsity pattern of S is chosen at random.
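For illustration, a minimal PyTorch sketch of such a re-parameterized layer might look as follows. This is a toy version that materializes S densely for readability; it is not the repository's implementation (see the extension under ./sparse-lora below), and the module and argument names here are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseLowRankLinear(nn.Module):
    """Toy layer computing W = B @ A + S with a fixed random support for S."""

    def __init__(self, in_features, out_features, rank=128, sp_ratio=0.03):
        super().__init__()
        # Low-rank factors: B is (out x rank), A is (rank x in), so B @ A is (out x in).
        # Random scaled initialization here is for illustration only.
        self.B = nn.Parameter(torch.randn(out_features, rank) / rank**0.5)
        self.A = nn.Parameter(torch.randn(rank, in_features) / in_features**0.5)
        # Random, fixed sparsity pattern: only a sp_ratio fraction of S is trainable.
        n_total = out_features * in_features
        n_nonzero = int(sp_ratio * n_total)
        self.register_buffer("support", torch.randperm(n_total)[:n_nonzero])
        self.sparse_values = nn.Parameter(torch.zeros(n_nonzero))
        self.out_features, self.in_features = out_features, in_features

    def forward(self, x):
        # Scatter the trainable non-zeros of S onto its fixed random support.
        S = torch.zeros(self.out_features * self.in_features,
                        device=x.device, dtype=x.dtype)
        S[self.support] = self.sparse_values.to(x.dtype)
        # W = B @ A + S, then a standard linear transform.
        W = self.B @ self.A + S.view(self.out_features, self.in_features)
        return F.linear(x, W)

The dense materialization of S above is only for clarity and does not reflect the memory savings of the actual implementation.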
Below, we show how the learned weights BA + S enlarge the singular value spectrum. In particular, the low-rank component BA primarily learns the head of the singular value spectrum, while the sparse component S primarily learns the tail.
Build the C++ extensions via:
cd ./sparse-lora
pip install .
Run the scripts provided in scripts/llm_pretrain/. Typical usage:
torchrun --standalone --nproc_per_node 1 torchrun_main.py \
--model_config configs/llama_60m.json \
--lr 0.003 \
--peft_model sltrain \
--optimizer adamw \
--rank 128 \
--sp_ratio 0.03 \
--batch_size 256 \
--total_batch_size 512 \
--num_training_steps 11000 \
--warmup_steps 1100 \
--weight_decay 0 \
--dtype bfloat16 \
--eval_every 1000 \
--lora_alpha 32
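In the command above, --rank sets the rank of the low-rank factors B and A, and --sp_ratio sets the sparsity ratio (the fraction of trainable entries in S, the "sparsity delta"). As a rough, hypothetical back-of-the-envelope check of the parameter savings for a single m x n layer (illustrative sizes, not figures from the paper):

# Rough parameter count for one m x n linear layer under W = BA + S.
def param_counts(m, n, rank, sp_ratio):
    dense = m * n                       # full weight matrix
    low_rank = (m + n) * rank           # B: m x rank, A: rank x n
    sparse = int(sp_ratio * m * n)      # trainable non-zeros of S
    return dense, low_rank + sparse

dense, sl = param_counts(m=2048, n=2048, rank=128, sp_ratio=0.03)
print(f"dense: {dense:,}  sparse + low-rank: {sl:,}  fraction: {sl / dense:.2f}")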
@inproceedings{han2024sltrain,
  title={{SLTrain}: a sparse plus low-rank approach for parameter and memory efficient pretraining},
  author={Han, Andi and Li, Jiaxiang and Huang, Wei and Hong, Mingyi and Takeda, Akiko and Jawanpuria, Pratik and Mishra, Bamdev},
  booktitle={Advances in Neural Information Processing Systems},
  volume={37},
  year={2024}
}