ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment

ReLaX-VQA

Official code for the following paper:

X. Wang, A. Katsenou, and D. Bull. ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment. arXiv:2407.11496, 2024.

☑️ TODO

  • Update reported_result.ipynb for performance comparisons
  • More updates coming soon... 😊

Performance

We evaluate the performance of ReLaX-VQA on four datasets. ReLaX-VQA has three different versions based on the training and testing strategies:

  • ReLaX-VQA: Trained and tested on each dataset with an 80%-20% random split.
  • ReLaX-VQA (w/o FT): Trained on LSVQ, and the frozen model was tested on other datasets.
  • ReLaX-VQA (w/ FT): Trained on LSVQ, and the frozen model was fine-tuned on other datasets.
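
As an illustration of the 80%-20% random split mentioned above, here is a minimal sketch. The helper name `random_split`, the seed, and the use of integer video IDs are assumptions made for this example; the repository's own split logic may differ.

```python
import random

def random_split(video_ids, train_fraction=0.8, seed=0):
    """Shuffle a copy of video_ids and cut it into train/test parts.

    Hypothetical helper for illustration; not taken from the repository.
    """
    ids = list(video_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    cut = int(round(train_fraction * len(ids)))
    return ids[:cut], ids[cut:]

train_ids, test_ids = random_split(range(100))
print(len(train_ids), len(test_ids))  # 80 20
```

Repeating the split with several seeds and averaging the metrics is a common way to reduce the variance of a single random split.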

Spearman’s Rank Correlation Coefficient (SRCC)

| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8643 | 0.8535 | 0.7655 | 0.8014 |
| ReLaX-VQA (w/o FT) | 0.7845 | 0.8312 | 0.7664 | 0.8104 |
| ReLaX-VQA (w/ FT) | 0.8974 | 0.8720 | 0.8468 | 0.8469 |

Pearson’s Linear Correlation Coefficient (PLCC)

| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8895 | 0.8473 | 0.8079 | 0.8204 |
| ReLaX-VQA (w/o FT) | 0.8336 | 0.8427 | 0.8242 | 0.8354 |
| ReLaX-VQA (w/ FT) | 0.9294 | 0.8668 | 0.8876 | 0.8652 |

More results can be found in reported_result.ipynb.

Proposed Model

The figure below gives an overview of the proposed ReLaX-VQA framework. The architectures of ResNet-50 Stack (I) and ResNet-50 Pool (II) are given in Fig. 2 of the paper.

[Figure: overview of the proposed ReLaX-VQA framework]

Usage

📌 Install Requirement

The repository is built with Python 3.10.14 and can be installed via the following commands:

git clone https://github.com/xinyiW915/ReLaX-VQA.git
cd ReLaX-VQA
conda create -n relaxvqa python=3.10.14 -y
conda activate relaxvqa
pip install -r requirements.txt  

📥 Download UGC Datasets

The corresponding raw video datasets can be downloaded from the following sources:
LSVQ, KoNViD-1k, LIVE-VQC, YouTube-UGC, CVD2014.

Metadata for the UGC datasets used in our experiments is available under ./metadata.

Once downloaded, place the datasets in ./ugc_original_videos or any other storage location of your choice.
Ensure that the video_path in the get_video_paths function inside main_relaxvqa_feats.py is updated accordingly.
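
Before extracting features, it can help to confirm that the storage location actually contains video files. The sketch below is a hypothetical helper, not part of the repository: `list_videos` and the set of file extensions are assumptions, and it does not reproduce what get_video_paths does.

```python
from pathlib import Path

# Assumed set of container extensions; adjust to match your datasets.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi", ".mov", ".webm"}

def list_videos(root):
    """Return sorted paths of video files found anywhere under root.

    Hypothetical helper for illustration; returns [] if root is missing.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )

videos = list_videos("./ugc_original_videos")
print(f"found {len(videos)} video file(s)")
```

An empty result usually means the path is wrong or the dataset was extracted into a different folder.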

🎬 Test Demo

Run the pre-trained models to evaluate the quality of a single video.

The weights provided in ./model are the best-performing checkpoints saved during training.

To evaluate the quality of a specific video, run the following command:

python demo_test_gpu.py \
    -device <DEVICE> \
    -train_data_name <TRAIN_DATA_NAME> \
    -is_finetune <True/False> \
    -save_path <MODEL_PATH> \
    -video_type <DATASET_NAME> \
    -video_name <VIDEO_NAME> \
    -framerate <FRAMERATE>

Or simply try our demo video by running:

python demo_test_gpu.py

Training

Steps to train ReLaX-VQA from scratch on different datasets.

Extract Features

Run the following command to extract features from videos:

python main_relaxvqa_feats.py -device gpu -video_type youtube_ugc

Train Model

Train our model using extracted features:

python model_regression_simple.py -data_name youtube_ugc -feature_path ../features/ -save_path ../model/

For LSVQ, train the model using:

python model_regression.py -data_name lsvq_train -feature_path ../features/ -save_path ../model/
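
The two steps above can be chained from a small driver script. The sketch below only builds and prints the command lines shown in this README; the helper names are hypothetical, and the rule that any dataset name starting with "lsvq" uses model_regression.py is an assumption generalised from the single lsvq_train example.

```python
import shlex

def feature_command(video_type, device="gpu"):
    """Command line for feature extraction, mirroring the README invocation."""
    return ["python", "main_relaxvqa_feats.py",
            "-device", device, "-video_type", video_type]

def train_command(data_name, feature_path="../features/", save_path="../model/"):
    """Command line for training on previously extracted features."""
    # Assumption: names starting with "lsvq" use model_regression.py,
    # every other dataset uses model_regression_simple.py.
    if data_name.startswith("lsvq"):
        script = "model_regression.py"
    else:
        script = "model_regression_simple.py"
    return ["python", script, "-data_name", data_name,
            "-feature_path", feature_path, "-save_path", save_path]

for cmd in (feature_command("youtube_ugc"),
            train_command("youtube_ugc"),
            train_command("lsvq_train")):
    # Dry run: print the command instead of executing it.
    print(shlex.join(cmd))
```

To execute rather than print, each list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains the scripts.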

Fine-Tuning

To fine-tune the pre-trained model on a new dataset, set train_data_name to the dataset the model was trained on and test_data_name to the dataset to fine-tune on, then run:

python model_finetune.py

Ablation Study

A detailed analysis of different components in ReLaX-VQA.

Spatio-Temporal Fragmentation & DNN Layer Stacking

Key techniques used in ReLaX-VQA:

  • Fragmentation with DNN layer stacking:

    python feature_fragment_layerstack.py
  • Fragmentation with DNN layer pooling:

    python feature_fragment_pool.py
  • Frame with DNN layer stacking:

    python feature_layerstack.py
  • Frame with DNN layer pooling:

    python feature_pool.py
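
To make the stacking versus pooling distinction concrete, here is a toy sketch in plain Python. It reflects one plausible reading of the two variants (stacking concatenates globally averaged features from every layer, pooling keeps only the last layer's pooled output) and is an assumption for illustration; the exact architectures are described in the paper, and the function names and toy activations below are not from the repository.

```python
def global_average_pool(feature_map):
    """Collapse a [channels][height][width] map to one mean per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def layer_stack(layer_outputs):
    """Concatenate the pooled features of every layer (assumed 'stacking')."""
    stacked = []
    for feature_map in layer_outputs:
        stacked.extend(global_average_pool(feature_map))
    return stacked

def layer_pool(layer_outputs):
    """Keep only the pooled features of the last layer (assumed 'pooling')."""
    return global_average_pool(layer_outputs[-1])

# Toy activations: layer 1 has two 2x2 channels, layer 2 has one 1x2 channel.
layers = [
    [[[1, 2], [3, 4]], [[0, 0], [0, 4]]],
    [[[6, 8]]],
]
print(layer_stack(layers))  # [2.5, 1.0, 7.0]
print(layer_pool(layers))   # [7.0]
```

Under this reading, the stacked vector grows with the number of layers and retains shallow-layer statistics that the pooled variant discards.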

Other Utilities

Excluding Greyscale Videos

We exclude greyscale videos in our experiments. You can use check_greyscale.py to filter out greyscale videos from the VQA dataset you want to use.

python check_greyscale.py
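
One simple way to detect a greyscale video is to check whether the red, green, and blue values of every pixel agree. The sketch below illustrates that idea on frames given as lists of (r, g, b) tuples; it is an assumption about how such a check could work, not the logic of check_greyscale.py, and the function names and tolerance parameter are hypothetical.

```python
def is_greyscale_frame(pixels, tolerance=0):
    """True if every (r, g, b) pixel has channels within `tolerance` of each other."""
    return all(max(p) - min(p) <= tolerance for p in pixels)

def is_greyscale_video(frames, tolerance=0):
    """True if all frames are greyscale; a single colour frame makes it False."""
    return all(is_greyscale_frame(frame, tolerance) for frame in frames)

grey_frame = [(10, 10, 10), (200, 200, 200)]
colour_frame = [(10, 10, 10), (255, 0, 0)]
print(is_greyscale_video([grey_frame, grey_frame]))    # True
print(is_greyscale_video([grey_frame, colour_frame]))  # False
```

A small non-zero tolerance can absorb compression noise in videos that are visually greyscale but stored in a colour format.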

Metadata Extraction

For easy extraction of metadata from your VQA dataset, use:

python extract_metadata_NR.py

Acknowledgment

This work was funded by the UKRI MyWorld Strength in Places Programme (SIPF00006/1) as part of my PhD study.

Citation

If you find this paper and the repo useful, please cite our paper 😊:

@article{wang2024relax,
      title={ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment},
      author={Wang, Xinyi and Katsenou, Angeliki and Bull, David},
      year={2024},
      eprint={2407.11496},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2407.11496}, 
}

Contact:

Xinyi WANG, [email protected]