Official Code for the following paper:
X. Wang, A. Katsenou, and D. Bull. ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment
- Update: `reported_result.ipynb` added for performance comparisons.
- More updates coming soon... 😊
We evaluate the performance of ReLaX-VQA on four datasets. ReLaX-VQA has three variants, depending on the training and testing strategy:
- ReLaX-VQA: Trained and tested on each dataset with an 80%-20% random split.
- ReLaX-VQA (w/o FT): Trained on LSVQ; the frozen model was tested on the other datasets without fine-tuning.
- ReLaX-VQA (w/ FT): Trained on LSVQ, then fine-tuned on each of the other datasets.
**SRCC:**

| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8643 | 0.8535 | 0.7655 | 0.8014 |
| ReLaX-VQA (w/o FT) | 0.7845 | 0.8312 | 0.7664 | 0.8104 |
| ReLaX-VQA (w/ FT) | 0.8974 | 0.8720 | 0.8468 | 0.8469 |

**PLCC:**

| Model | CVD2014 | KoNViD-1k | LIVE-VQC | YouTube-UGC |
|---|---|---|---|---|
| ReLaX-VQA | 0.8895 | 0.8473 | 0.8079 | 0.8204 |
| ReLaX-VQA (w/o FT) | 0.8336 | 0.8427 | 0.8242 | 0.8354 |
| ReLaX-VQA (w/ FT) | 0.9294 | 0.8668 | 0.8876 | 0.8652 |
More results can be found in `reported_result.ipynb`.
The figure shows an overview of the proposed ReLaX-VQA framework. The architectures of ResNet-50 Stack (I) and ResNet-50 Pool (II) are provided in Fig. 2 of the paper.
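As a rough illustration of the Pool branch, the sketch below global-average-pools intermediate ResNet-50 feature maps and concatenates them. The tapped layers and input size are assumptions made for illustration, not the exact architecture from Fig. 2:

```python
# Minimal sketch (assumptions): pooling intermediate ResNet-50 feature maps
# and concatenating them, in the spirit of the "ResNet-50 Pool" branch.
# The tapped stages and input size are illustrative, not the paper's settings.
import torch
import torchvision.models as models

resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

features = {}
def hook(name):
    def fn(module, inputs, output):
        features[name] = output
    return fn

# Tap the four residual stages (an assumption; see Fig. 2 of the paper).
for name in ["layer1", "layer2", "layer3", "layer4"]:
    getattr(resnet, name).register_forward_hook(hook(name))

with torch.no_grad():
    _ = resnet(torch.randn(1, 3, 224, 224))  # dummy frame/fragment batch

# "Pool" variant: global-average-pool each stage, then concatenate channel-wise.
pooled = torch.cat(
    [features[n].mean(dim=(2, 3)) for n in ["layer1", "layer2", "layer3", "layer4"]],
    dim=1,
)  # shape: (1, 256 + 512 + 1024 + 2048) = (1, 3840)
print(pooled.shape)
```

The Stack variant keeps the spatial feature maps and stacks them rather than pooling; see Fig. 2 of the paper for the actual design.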
The repository is built with Python 3.10.14 and can be set up with the following commands:

```bash
git clone https://github.com/xinyiW915/ReLaX-VQA.git
cd ReLaX-VQA
conda create -n relaxvqa python=3.10.14 -y
conda activate relaxvqa
pip install -r requirements.txt
```
The corresponding raw video datasets can be downloaded from the following sources:
LSVQ, KoNViD-1k, LIVE-VQC, YouTube-UGC, CVD2014.
The metadata for the experimented UGC datasets is available under `./metadata`.

Once downloaded, place the datasets in `./ugc_original_videos` or any other storage location of your choice. Ensure that the `video_path` in the `get_video_paths` function inside `main_relaxvqa_feats.py` is updated accordingly.
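For reference, a hypothetical shape for such a mapping is sketched below; the function body, dataset keys, and file extension are assumptions, so mirror whatever `get_video_paths` actually does in `main_relaxvqa_feats.py`:

```python
# Illustrative only: a hypothetical shape for the dataset-to-path mapping.
# The real get_video_paths in main_relaxvqa_feats.py may differ; adjust the
# directories below to wherever you stored the raw videos.
import os

def get_video_paths(video_type, video_name):
    base_dirs = {
        "youtube_ugc": "./ugc_original_videos/youtube_ugc",
        "konvid_1k": "./ugc_original_videos/konvid_1k",   # key names assumed
        "live_vqc": "./ugc_original_videos/live_vqc",
        "cvd_2014": "./ugc_original_videos/cvd_2014",
    }
    return os.path.join(base_dirs[video_type], f"{video_name}.mp4")
```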
Run the pre-trained models to evaluate the quality of a single video. The weights provided in `./model` are the best-performing weights saved from training. To evaluate the quality of a specific video, run:
```bash
python demo_test_gpu.py \
  -device <DEVICE> \
  -train_data_name <TRAIN_DATA_NAME> \
  -is_finetune <True/False> \
  -save_path <MODEL_PATH> \
  -video_type <DATASET_NAME> \
  -video_name <VIDEO_NAME> \
  -framerate <FRAMERATE>
```
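For example, a filled-in invocation might look like the following; the concrete values are assumptions for illustration, and `<VIDEO_NAME>` should be replaced with a real video from your copy of the dataset:

```bash
# Illustrative values (assumptions): adjust paths and names to your setup.
python demo_test_gpu.py \
  -device gpu \
  -train_data_name lsvq_train \
  -is_finetune False \
  -save_path ../model/ \
  -video_type youtube_ugc \
  -video_name <VIDEO_NAME> \
  -framerate 30
```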
Or simply try our demo video by running:

```bash
python demo_test_gpu.py
```
Steps to train ReLaX-VQA from scratch on different datasets.
Run the following command to extract features from videos:
```bash
python main_relaxvqa_feats.py -device gpu -video_type youtube_ugc
```
Train the model on the extracted features:

```bash
python model_regression_simple.py -data_name youtube_ugc -feature_path ../features/ -save_path ../model/
```
For LSVQ, train the model using:
```bash
python model_regression.py -data_name lsvq_train -feature_path ../features/ -save_path ../model/
```
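At a high level, the regression step fits a model that maps the extracted features to MOS. The sketch below is a minimal stand-in using scikit-learn with an 80%-20% random split (matching the intra-dataset setting); the feature/label file names and the SVR regressor are assumptions, not the repo's actual implementation:

```python
# Minimal sketch (assumptions): regress MOS from pre-extracted features.
# File names and the SVR regressor are hypothetical stand-ins for what
# model_regression_simple.py actually does.
import numpy as np
from scipy.stats import spearmanr, pearsonr
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

X = np.load("../features/youtube_ugc_features.npy")  # hypothetical file name
y = np.load("../features/youtube_ugc_mos.npy")       # hypothetical file name

# 80%-20% random split, as in the intra-dataset ReLaX-VQA setting.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

reg = SVR(kernel="rbf").fit(X_tr, y_tr)
pred = reg.predict(X_te)
print("SRCC:", spearmanr(pred, y_te)[0])
print("PLCC:", pearsonr(pred, y_te)[0])
```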
To fine-tune the pre-trained model on a new dataset, set `train_data_name` to the dataset used for pre-training and `test_data_name` to the dataset to fine-tune on, then run:

```bash
python model_finetune.py
```
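Conceptually, fine-tuning continues training from the LSVQ-trained weights with a small learning rate on the new dataset. The sketch below illustrates this with a hypothetical regression head and dummy data; `model_finetune.py` defines the real model and data handling:

```python
# Minimal sketch (assumptions): fine-tune a pretrained regression head on a
# new dataset. The head, dimensions, and data are hypothetical stand-ins.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical regression head; dimensions chosen arbitrarily for illustration.
model = nn.Sequential(nn.Linear(3840, 128), nn.ReLU(), nn.Linear(128, 1))
# In practice, load the LSVQ-trained weights first, e.g. (hypothetical file):
# model.load_state_dict(torch.load("../model/<TRAIN_DATA_NAME>_best.pth"))

# Dummy stand-in for the fine-tuning dataset's (features, MOS) pairs.
loader = DataLoader(TensorDataset(torch.randn(64, 3840), torch.rand(64)), batch_size=16)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # small LR for fine-tuning
loss_fn = nn.MSELoss()

model.train()
for feats, mos in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(feats).squeeze(-1), mos)
    loss.backward()
    optimizer.step()
```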
A detailed analysis of the different components in ReLaX-VQA. Key techniques used in ReLaX-VQA (a minimal fragment-sampling sketch follows this list):

- Fragmentation with DNN layer stacking: `python feature_fragment_layerstack.py`
- Fragmentation with DNN layer pooling: `python feature_fragment_pool.py`
- Frame with DNN layer stacking: `python feature_layerstack.py`
- Frame with DNN layer pooling: `python feature_pool.py`
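For intuition, residual fragments are patches sampled where consecutive frames differ most. The sketch below ranks patches by residual energy; the patch size, fragment count, and selection rule are illustrative assumptions rather than the paper's exact settings:

```python
# Minimal sketch (assumptions): select "residual fragments", i.e. patches where
# consecutive frames differ most. Patch size and top-k rule are illustrative.
import numpy as np

def residual_fragments(prev_frame, next_frame, patch=32, k=16):
    """Return the top-k (row, col) patch corners ranked by residual energy."""
    residual = np.abs(next_frame.astype(np.float32) - prev_frame.astype(np.float32))
    h, w = residual.shape[:2]
    scores = []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            scores.append((residual[r:r + patch, c:c + patch].sum(), r, c))
    scores.sort(reverse=True)  # highest residual energy first
    return [(r, c) for _, r, c in scores[:k]]

# Example with random "frames":
f0 = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
f1 = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
print(residual_fragments(f0, f1)[:4])
```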
We exclude greyscale videos in our experiments. You can use `check_greyscale.py` to filter out greyscale videos from the VQA dataset you want to use:

```bash
python check_greyscale.py
```
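The core test is that a greyscale frame has three (nearly) identical colour channels. A minimal sketch, assuming a simple per-frame check with OpenCV (the actual thresholds and frame sampling in `check_greyscale.py` may differ):

```python
# Minimal sketch (assumptions): flag a frame as greyscale when its three
# channels are (nearly) identical. Tolerance and video path are illustrative.
import cv2
import numpy as np

def is_greyscale_frame(frame_bgr, tol=2):
    f = frame_bgr.astype(np.int16)
    b, g, r = f[..., 0], f[..., 1], f[..., 2]
    return int(np.abs(b - g).max()) <= tol and int(np.abs(b - r).max()) <= tol

cap = cv2.VideoCapture("example.mp4")  # hypothetical path
ok, frame = cap.read()
if ok:
    print("greyscale:", is_greyscale_frame(frame))
cap.release()
```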
For easy extraction of metadata from your VQA dataset, use:

```bash
python extract_metadata_NR.py
```
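As an illustration, basic per-video metadata (resolution, frame rate, frame count) can be gathered with OpenCV; the column names, videos directory, and output path below are assumptions, and `extract_metadata_NR.py` defines the actual schema:

```python
# Minimal sketch (assumptions): collect basic per-video metadata into a CSV.
# Column names, input directory, and output path are illustrative.
import glob
import cv2
import pandas as pd

rows = []
for path in glob.glob("./ugc_original_videos/*.mp4"):
    cap = cv2.VideoCapture(path)
    rows.append({
        "vid": path,
        "width": int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        "height": int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
        "framerate": cap.get(cv2.CAP_PROP_FPS),
        "frames": int(cap.get(cv2.CAP_PROP_FRAME_COUNT)),
    })
    cap.release()

pd.DataFrame(rows).to_csv("./metadata/my_dataset_metadata.csv", index=False)
```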
This work was funded by the UKRI MyWorld Strength in Places Programme (SIPF00006/1) as part of my PhD study.
If you find this paper and the repo useful, please cite our paper 😊:
```bibtex
@article{wang2024relax,
  title={ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment},
  author={Wang, Xinyi and Katsenou, Angeliki and Bull, David},
  year={2024},
  eprint={2407.11496},
  archivePrefix={arXiv},
  primaryClass={eess.IV},
  url={https://arxiv.org/abs/2407.11496},
}
```
Xinyi WANG, [email protected]