pmultiqc is a MultiQC plugin for comprehensive quality control reporting of proteomics data. It generates interactive HTML reports with visualizations and metrics to help you assess the quality of your mass spectrometry-based proteomics experiments.
- Works with multiple proteomics data formats and analysis pipelines
- Generates interactive HTML reports with visualizations
- Provides comprehensive QC metrics for MS data
- Supports different quantification methods (LFQ, TMT, DIA)
- Integrates with the MultiQC framework
pmultiqc supports the following data sources:
- quantms pipeline output files:
  - experimental_design.tsv: Experimental design file
  - *.mzTab: Results of the identification
  - *msstats*.csv: MSstats/MSstatsTMT input files
  - *.mzML: Spectra files
  - *ms_info.tsv: MS quality control information
  - *.idXML: Identification results
  - *.yml: Pipeline parameters (optional)
  - diann_report.tsv: DIA-NN main report (DIA analysis only)
- MaxQuant result files:
  - parameters.txt: Analysis parameters
  - proteinGroups.txt: Protein identification results
  - summary.txt: Summary statistics
  - evidence.txt: Peptide evidence
  - msms.txt: MS/MS scan information
  - msmsScans.txt: MS/MS scan details
- mzIdentML files:
  - *.mzid: Identification results
  - *.mzML or *.mgf: Corresponding spectra files
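As a rough illustration of how the file patterns above separate the three data sources, the sketch below guesses which kind of input a list of filenames represents. The function name, the dictionary, and the grouping keys are hypothetical and are not part of pmultiqc; the plugin's own file discovery may work differently. Patterns shared between sources (*.mzML, *.mgf) and the generic *.yml are deliberately left out because they do not discriminate.

```python
from fnmatch import fnmatchcase

# Patterns copied from the data source list; the three keys are illustrative
# labels chosen for this sketch, not names used by pmultiqc.
SOURCE_PATTERNS = {
    "quantms": [
        "experimental_design.tsv", "*.mzTab", "*msstats*.csv",
        "*ms_info.tsv", "*.idXML", "diann_report.tsv",
    ],
    "maxquant": [
        "parameters.txt", "proteinGroups.txt", "summary.txt",
        "evidence.txt", "msms.txt", "msmsScans.txt",
    ],
    "mzid": ["*.mzid"],
}


def guess_data_sources(filenames):
    """Return the set of source kinds with at least one matching filename."""
    found = set()
    for kind, patterns in SOURCE_PATTERNS.items():
        # Case-sensitive matching, so behaviour is the same on every platform.
        if any(fnmatchcase(name, pat) for name in filenames for pat in patterns):
            found.add(kind)
    return found
```

A directory listing obtained with `os.listdir` could be passed straight to `guess_data_sources` to decide whether `--parse_maxquant` or `--mzid_plugin` is likely needed.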
# Install from PyPI
pip install pmultiqc
# Or install from source
git clone https://github.com/bigbio/pmultiqc
cd pmultiqc
pip install -e .
pmultiqc is used as a plugin for MultiQC. After installation, you can run it using the MultiQC command-line interface.
multiqc {analysis_dir} -o {output_dir}
Where:
- {analysis_dir} is the directory containing your proteomics data files
- {output_dir} is the directory where you want to save the report
# Basic usage
multiqc /path/to/quantms/results -o ./report
# With specific options
multiqc /path/to/quantms/results -o ./report --remove_decoy --condition factor
multiqc --parse_maxquant /path/to/maxquant/results -o ./report
multiqc --mzid_plugin /path/to/mzid/files -o ./report
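When reports are generated from a script rather than typed by hand, the commands above can be assembled programmatically. The following is a minimal sketch that only builds the argument list; `build_multiqc_command` and `MODE_FLAGS` are hypothetical names introduced here, and the only flags used are the ones shown in this README.

```python
import shlex

# Maps an illustrative mode name to the flag shown in the examples above.
# quantms results need no extra flag.
MODE_FLAGS = {
    "quantms": None,
    "maxquant": "--parse_maxquant",
    "mzid": "--mzid_plugin",
}


def build_multiqc_command(analysis_dir, output_dir, mode="quantms", extra_args=()):
    """Return the multiqc invocation as a list suitable for subprocess.run."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["multiqc"]
    flag = MODE_FLAGS[mode]
    if flag:
        cmd.append(flag)
    cmd += [str(analysis_dir), "-o", str(output_dir)]
    cmd += list(extra_args)
    return cmd


cmd = build_multiqc_command("/path/to/maxquant/results", "./report", mode="maxquant")
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once MultiQC and pmultiqc are installed; building a list instead of a shell string avoids quoting problems with paths that contain spaces.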
Option | Description | Default |
---|---|---|
--raw | Keep filenames in experimental design output as raw | False |
--condition | Create conditions from provided columns | - |
--remove_decoy | Remove decoy peptides when counting | True |
--decoy_affix | Pre- or suffix of decoy proteins in their accession | DECOY_ |
--contaminant_affix | The contaminant prefix or suffix | CONT |
--affix_type | Location of the decoy marker (prefix or suffix) | prefix |
--disable_plugin | Disable pmultiqc plugin | False |
--quantification_method | Quantification method for LFQ experiment | feature_intensity |
--disable_table | Disable protein/peptide table plots for large datasets | False |
--ignored_idxml | Ignore idXML files for faster processing | False |
--parse_maxquant | Generate reports based on MaxQuant results | False |
--mzid_plugin | Generate reports based on mzIdentML files | False |
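To make the interplay of `--decoy_affix` and `--affix_type` concrete, here is a small sketch of how a prefix or suffix marker could be tested against a protein accession. It is an illustration under the assumption of a plain string prefix/suffix match; `has_affix` is a hypothetical helper, and pmultiqc's actual matching logic may differ.

```python
def has_affix(accession, affix="DECOY_", affix_type="prefix"):
    """Return True if the accession carries the marker at the given position."""
    if affix_type == "prefix":
        return accession.startswith(affix)
    if affix_type == "suffix":
        return accession.endswith(affix)
    raise ValueError(f"affix_type must be 'prefix' or 'suffix', got {affix_type!r}")


# Example accessions are made up for illustration.
accessions = ["DECOY_sp|P12345|EXAMPLE", "sp|P12345|EXAMPLE", "sp|Q99999|OTHER_REV"]

# With the defaults from the table, only the first accession counts as a decoy.
targets = [acc for acc in accessions if not has_affix(acc)]
```

With a suffix-style database, the same check would be called as `has_affix(acc, affix="_REV", affix_type="suffix")`, which corresponds to passing `--decoy_affix _REV --affix_type suffix` on the command line.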
pmultiqc generates a comprehensive report with multiple sections:
- Experimental Design: Overview of the dataset structure
- Pipeline Performance Overview: Key metrics including:
  - Contaminants Score
  - Peptide Intensity
  - Charge Score
  - Missed Cleavages
  - ID rate over RT
  - MS2 OverSampling
  - Peptide Missing Value
- Summary Table: Spectra counts, identification rates, peptide and protein counts
- MS1 Information: Quality metrics at MS1 level
- Pipeline Results Statistics: Overall identification results
- Number of Peptides per Protein: Distribution of peptide counts per protein
- Peptide Table: First 500 peptides in the dataset
- PSM Table: First 500 PSMs (Peptide-Spectrum Matches)
- Spectra Tracking: Summary of identification results by file
- Search Engine Scores: Distribution of search engine scores
- Precursor Charges Distribution: Distribution of precursor ion charges
- Number of Peaks per MS/MS Spectrum: Peak count distribution
- Peak Intensity Distribution: MS2 peak intensity distribution
- Oversampling Distribution: Analysis of MS2 oversampling
- Delta Mass: Mass accuracy distribution
- Peptide/Protein Quantification Tables: Quantitative levels across conditions
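For readers unfamiliar with what a section such as "Number of Peptides per Protein" summarizes, the sketch below computes that kind of distribution from (protein, peptide) pairs. It is an independent illustration with hypothetical function names and made-up data; this README does not document how pmultiqc performs the calculation, so details such as deduplication may differ in the real report.

```python
from collections import Counter


def peptides_per_protein(pairs):
    """Count distinct peptide sequences per protein accession.

    pairs: iterable of (protein_accession, peptide_sequence) tuples.
    """
    unique_pairs = set(pairs)  # repeated identifications of a peptide count once
    return Counter(protein for protein, _ in unique_pairs)


def count_distribution(counts):
    """Map 'number of peptides' to 'number of proteins with that many'."""
    return Counter(counts.values())


# Made-up identifications: P1 has two distinct peptides, P2 has one.
pairs = [("P1", "AAK"), ("P1", "AAK"), ("P1", "CCR"), ("P2", "DDK")]
per_protein = peptides_per_protein(pairs)
histogram = count_distribution(per_protein)
```

Here `per_protein` holds the per-protein counts and `histogram` holds the values that a bar chart of the distribution would display.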
You can find example reports in the docs directory:
- LFQ Example
- TMT Example
- DIA Example
- MaxQuant Example
- mzIdentML with mzML Example
- mzIdentML with MGF Example
To contribute to pmultiqc:
- Fork the repository
- Clone your fork: git clone https://github.com/YOUR-USERNAME/pmultiqc
- Create a feature branch: git checkout -b new-feature
- Make your changes
- Install in development mode: pip install -e .
- Test your changes: cd tests && multiqc resources/LFQ -o ./
- Commit your changes: git commit -am 'Add new feature'
- Push to the branch: git push origin new-feature
- Submit a pull request
This project is licensed under the terms of the LICENSE file included in the repository.
If you use pmultiqc in your research, please cite:
pmultiqc: A MultiQC plugin for proteomics quality control
https://github.com/bigbio/pmultiqc