MuPeXI - 1.2
Prediction of neo-epitopes from tumor sequencing data
Mutant peptide extractor and informer (MuPeXI)
Extracts peptides of user-defined lengths around missense variants, indels, and frameshifts from a VCF file. The annotation of each mutation is reported together with the mutant and normal peptides in the output file.
Submission
CITATIONS
For publication of results, please cite:
MuPeXI: Prediction of neo-epitopes from tumor sequencing data
Anne-Mette Bjerregaard, Morten Nielsen, Sine R. Hadrup, Zoltan Szallasi, and Aron C. Eklund
Cancer Immunol Immunother. 2017 Apr 20. doi: 10.1007/s00262-017-2001-3.
PubMed ID: 28429069
Full text
MuPeXI relies on binding predictions from NetMHCpan; for publication of results please cite NetMHCpan in addition to MuPeXI.
NetMHCpan - MHC class I binding prediction beyond humans
Ilka Hoof, Bjoern Peters, John Sidney, Lasse Eggers Pedersen, Ole Lund, Soren Buus, and Morten Nielsen
PMID: 19002680
NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence.
Nielsen M, Lundegaard C, Blicher T, Lamberth K, Harndahl M, Justesen S, Roeder G, Peters B, Sette A, Lund O, Buus S.
PMID: 17726526
Full text
PORTABLE VERSION
MuPeXI is available on GitHub.
Instructions
1. Upload files
VCF file:
Upload a VCF file containing tumor specific somatic variant calls, preferably detected by the variant caller MuTect2. MuTect2 calls both point mutations and indels.
If MuTect or MuTect2 has not been used for variant calling, genomic mutant allele frequencies will not be taken into account.
The sequencing reads should preferably have been aligned to the GRCh38 reference.
If not, you can upload a VCF file from an HG19 alignment and check the HG19 box; MuPeXI will then run a liftover of the VCF file to GRCh38.
Expression file:
Upload a tab-delimited file with expression values for each Ensembl transcript ID.
Format: Ensembl Transcript ID <tab> Expression value (TPM) <tab> variance on call.
This file can be obtained through RNAseq analysis with, for example, Kallisto.
Transcript IDs must correspond to the GRCh38 v78 coordinates.
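The expression file layout described above can be checked before upload with a short script. The following is a minimal sketch, not part of MuPeXI: the function name `read_expression` is hypothetical, it assumes that a single non-numeric header line may be present, and the variance values in the sample are invented for illustration.

```python
import csv
import io


def read_expression(handle):
    """Return {transcript_id: tpm} from a tab-delimited expression file.

    Expected columns: Ensembl transcript ID, TPM, variance (the third
    column is ignored here). Hypothetical helper, not MuPeXI code.
    """
    expression = {}
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) < 2:
            raise ValueError(f"line {line_no}: expected at least 2 tab-separated fields")
        transcript_id, tpm_text = row[0].strip(), row[1].strip()
        try:
            tpm = float(tpm_text)
        except ValueError:
            if line_no == 1:
                continue  # assumption: tolerate one header line
            raise ValueError(f"line {line_no}: TPM is not a number: {tpm_text!r}") from None
        if not transcript_id.startswith("ENST"):
            raise ValueError(f"line {line_no}: not an Ensembl transcript ID: {transcript_id!r}")
        expression[transcript_id] = tpm
    return expression


# Transcript IDs and TPM values are taken from the example output on this
# page; the variance column (0.0) is made up.
example = io.StringIO(
    "ENST00000354624\t61.0165798\t0.0\n"
    "ENST00000003302\t2.31471308\t0.0\n"
)
levels = read_expression(example)
```

A file that passes this check still has to use transcript IDs from the annotation version required above; the script cannot verify that.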
2. Select or enter the HLA alleles
All HLA alleles present in the tumor should be provided.
Example: HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03
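A comma-separated allele list in the style of the example can be validated with a few lines of Python. This is only a sketch: the pattern below is inferred from the example string and covers HLA-A, -B and -C names with two-digit fields; other allele names accepted by the server may not match it, and `parse_hla_alleles` is a hypothetical helper name.

```python
import re

# Assumption: allele names look like the example above, e.g. "HLA-A02:01".
HLA_PATTERN = re.compile(r"^HLA-[ABC]\d{2}:\d{2}$")


def parse_hla_alleles(text):
    """Split a comma-separated allele string, validate it, drop duplicates."""
    alleles = [item.strip() for item in text.split(",") if item.strip()]
    invalid = [allele for allele in alleles if not HLA_PATTERN.match(allele)]
    if invalid:
        raise ValueError(f"unrecognised allele name(s): {', '.join(invalid)}")
    # dict.fromkeys keeps the first occurrence of each allele, in order
    return list(dict.fromkeys(alleles))


alleles = parse_hla_alleles(
    "HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03"
)
```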
3. Select peptide lengths
Specify peptide length. Multiple lengths can be chosen.
4. Submit the job
Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and continuously updated until it terminates and the server output appears in the browser window. At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.
Format of MuPeXI-1.1 output
DESCRIPTION
The prediction output for each peptide pair consists of the following columns:
- HLA allele: Allele name.
- Normal peptide: Peptide from the reference corresponding to the mutant peptide.
- Normal MHC affinity: Predicted binding affinity of the normal peptide in nanomolar units.
- Normal MHC % rank: %Rank of the prediction score for the normal peptide.
- Mutant peptide: The extracted mutant peptide.
- Mutant MHC affinity: Predicted binding affinity of the mutant peptide in nanomolar units.
- Mutant MHC % rank: %Rank of the prediction score for the mutant peptide.
- Gene ID: Ensembl gene ID.
- Transcript ID: Ensembl transcript ID.
- Amino acid change: Amino acid change annotated in the VEP file.
- Allele Frequency: Genomic mutant allele frequency detected by MuTect2.
- Mismatches: Number of mismatches between the normal and mutant peptide.
- Peptide position: Position of the amino acid change in the peptide. Can be a range in the case of insertions and frameshifts.
- Chr: Chromosome annotated in the VEP file.
- Genomic position: Genome nucleotide position annotated in the VEP file.
- Protein position: Amino acid position annotated in the VEP file.
- Mutation consequence: The consequence annotated in the VEP file, translated into single-letter abbreviations:
M: Missense variant
I: Inframe insertion
D: Inframe deletion
F: Frameshift variant
- Gene symbol: HUGO symbol corresponding to the Ensembl gene ID.
- Cancer driver gene: Yes if the HUGO symbol is in the COSMIC reference list, No if it is not.
- Expression Level: Expression of the transcript from which the mutant peptide was extracted.
- Mutant affinity score: Calculated binding affinity score of the mutant peptide, based on a negative logistic function of the mutant MHC %Rank score. This is used to calculate the final prioritization score.
- Normal affinity score: Calculated binding affinity score of the normal peptide, based on a positive logistic function of the normal MHC %Rank score. This is used to calculate the final prioritization score.
- Expression score: Calculated expression score of the transcript expression level. This is used to calculate the final prioritization score.
- Priority score: Calculated prioritization score, dependent on the HLA binding affinity of the mutant and normal peptides, gene expression, and allele frequency.
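The logistic mapping from %Rank to an affinity score can be illustrated as follows. This is a sketch under stated assumptions: this page does not give the constants, so the slope of 5 and midpoint of 2 used below are inferred from the example output rows (for instance, a mutant %Rank of 1.4 is listed with a score of 0.9525741268224334). The function name `affinity_score` is hypothetical. In the example rows the normal affinity score values also appear to follow the same curve.

```python
import math


def affinity_score(percent_rank, slope=5.0, midpoint=2.0):
    """Decreasing logistic function of %Rank.

    slope and midpoint are assumptions inferred from the example output
    on this page, not documented parameters.
    """
    return 1.0 / (1.0 + math.exp(slope * (percent_rank - midpoint)))


# Compare against (%Rank, score) pairs copied from the example output.
examples = [
    (1.4, 0.9525741268224334),
    (1.9, 0.6224593312018547),
    (6.0, 2.0611536181902037e-09),
]
matches = [
    math.isclose(affinity_score(rank), listed, rel_tol=1e-9)
    for rank, listed in examples
]
```

With these constants, peptides with a %Rank well below 2 score close to 1 and peptides with a %Rank well above 2 score close to 0.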
NetMHCpan output
%Rank is the rank of the prediction score relative to a set of 200,000 random natural 9-mer peptides. For more information see the NetMHCpan4.0 output format.
EXAMPLE OUTPUT
.mupexi file
# VERSION: MuPeXI 1.1
# CALL: /usr/cbs/bio/src/MuPeXI-1.1/mupexi/MuPeXI.py -c /usr/cbs/bio/src/MuPeXI-1.1/config.ini -w -m -v /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.0 -e /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.1 -a HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03 -l 9
# DATE: Monday 15 of August 2016
# TIME: 15:34:35
# PWD: /net/athena/usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7
HLA_allele Norm_peptide Norm_MHCAffinity Norm_MHCrank Mut_peptide Mut_MHCAffinity Mut_MHCrank Gene_ID Transcript_ID Amino_Acid_Change Allele_Frequency Mismatches peptide_position Chr Genomic_Position Protein_position Mutation_Consequence Gene_Symbol Cancer_Driver_Gene Expression_Level Mutant_affinity_score Normal_affinity_score Expression_score priority_Score
HLA-B07:02 .L....... 13351.1 9.5 SPDLGGSKF 267.2 0.7 ENSG00000156510 ENST00000354624 L/P 0.784 1 2 10 69232785 83 M HKDC1 No 61.0165798 0.998498817743263 5.175555005801869e-17 1.0 78
HLA-A02:01 .R.R...D. 26906.1 32.0 KMYLKIVEV 8.7 0.1 ENSG00000048028 ENST00000003302 FQ/X 0.650 3 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.9999251537724895 7.175095973164411e-66 0.9841444919059494 64
HLA-C03:03 G......G. 5847.9 6.0 FSVIVCHKM 272.0 0.8 ENSG00000048028 ENST00000003302 FQ/X 0.650 2 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.9975273768433653 2.0611536181902037e-09 0.9841444919059494 64
HLA-C07:02 A.I..E... 5100.7 4.0 MYLKIVEVI 1136.8 0.8 ENSG00000048028 ENST00000003302 FQ/X 0.650 3 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.9975273768433653 4.5397868702434395e-05 0.9841444919059494 64
HLA-A31:01 T...N.... 3096.9 6.0 KIVEVIQKR 99.7 0.9 ENSG00000048028 ENST00000003302 FQ/X 0.650 2 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.995929862284104 2.0611536181902037e-09 0.9841444919059494 64
HLA-C03:03 .....G.VY 4781.8 5.0 VIVCHKMYL 643.9 1.4 ENSG00000048028 ENST00000003302 FQ/X 0.650 3 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.9525741268224334 3.059022269256247e-07 0.9841444919059494 61
HLA-A31:01 ......DP. 1025.1 3.5 LFSVIVCHK 214.0 1.4 ENSG00000048028 ENST00000003302 FQ/X 0.650 2 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.9525741268224334 0.0005527786369235996 0.9841444919059494 61
HLA-C07:02 QN...M... 9837.7 7.5 CKSFSICLL 2315.8 1.6 ENSG00000048028 ENST00000003302 FQ/X 0.650 3 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.8807970779778823 1.1399918530430558e-12 0.9841444919059494 56
HLA-A31:01 T.F..R... 14.2 0.09 IVCHKMYLK 131.1 1.0 ENSG00000048028 ENST00000003302 FQ/X 0.650 3 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.9933071490757153 0.9999288038061921 0.9841444919059494 56
HLA-A31:01 ....E.... 506.9 2.5 TILDKLVQR 68.3 0.6 ENSG00000156096 ENST00000305107 E/K 0.563 1 5 4 69495729 45 M UGT2B4 No 2.44878178 0.9990889488055994 0.07585818002124355 0.9878510119201874 53
HLA-A02:01 E........ 9137.4 13.0 KLVQRGHEV 162.5 1.4 ENSG00000156096 ENST00000305107 E/K 0.563 1 1 4 69495729 45 M UGT2B4 No 2.44878178 0.9525741268224334 1.299581425007503e-24 0.9878510119201874 53
HLA-C07:02 .....G.VY 8919.0 6.5 VIVCHKMYL 2701.8 1.9 ENSG00000048028 ENST00000003302 FQ/X 0.650 3 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.6224593312018547 1.6918979223288784e-10 0.9841444919059494 40
HLA-C07:02 S..Y...L. 3438.9 2.5 NFEDLFSVI 2684.2 1.9 ENSG00000048028 ENST00000003302 FQ/X 0.650 3 1:9 11 113834326-113834329 181 F USP28 No 2.31471308 0.6224593312018547 0.07585818002124355 0.9841444919059494 39
HLA-B07:02 ...R..... 31.9 0.12 KPGCKTNQL 43.2 0.17 ENSG00000169925 ENST00000371834 R/C 0.333 1 4 9 134053378 34 M BRD3 Yes 1.86032922 0.9998937914787018 0.9999172827771484 0.9611149444887752 32
HLA-C07:02 R........ 1301.2 0.9 CKTNQLQYM 2021.9 1.4 ENSG00000169925 ENST00000371834 R/C 0.333 1 1 9
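A .mupexi file can be loaded for downstream filtering with the Python standard library. The sketch below assumes the file is tab-delimited and that lines starting with `#` form the header comment block, as in the example; the helper name `read_mupexi` and the %Rank cut-off of 1.0 are hypothetical choices for illustration. The sample uses only four of the columns to keep it short.

```python
import csv
import io


def read_mupexi(handle):
    """Return the data rows of a .mupexi file as a list of dicts.

    Assumptions: tab-delimited columns, '#' comment lines before the
    column header row.
    """
    data_lines = (
        line for line in handle
        if line.strip() and not line.startswith("#")
    )
    return list(csv.DictReader(data_lines, delimiter="\t"))


# Shortened sample: values copied from the example output, most columns omitted.
sample = "\n".join([
    "# VERSION: MuPeXI 1.1",
    "\t".join(["HLA_allele", "Mut_peptide", "Mut_MHCrank", "priority_Score"]),
    "\t".join(["HLA-B07:02", "SPDLGGSKF", "0.7", "78"]),
    "\t".join(["HLA-C07:02", "NFEDLFSVI", "1.9", "39"]),
])
rows = read_mupexi(io.StringIO(sample))

# Keep peptides below a hypothetical %Rank cut-off, highest priority first.
strong = sorted(
    (row for row in rows if float(row["Mut_MHCrank"]) <= 1.0),
    key=lambda row: float(row["priority_Score"]),
    reverse=True,
)
```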
.log file
# VERSION: MuPeXI 1.1
# CALL: /usr/cbs/bio/src/MuPeXI-1.1/mupexi/MuPeXI.py -c /usr/cbs/bio/src/MuPeXI-1.1/config.ini -w -m -v /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.0 -e /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.1 -a HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03 -l 9
# DATE: Monday 15 of August 2016
# TIME: 15:34:35
# PWD: /net/athena/usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7
----------------------------------------------------------------------------------------------------------
MuPeX
----------------------------------------------------------------------------------------------------------
Reading protein reference file:
Found 99436 sequences, with 0 [9]mers of which 0 were unique peptides
Reading VEP file:
Found 22 irrelevant mutation consequences which were discarded
Found 29 missense variant mutation(s)
0 insertion(s)
0 deletion(s)
3 frameshift variant mutation(s)
In 7 genes and 32 transcripts
Checking peptides:
0 peptides matched a normal peptide and were discarded
0 peptides included unsupported symbols (e.g. *, X, U) and were discarded
Final Result:
390 potential mutant peptides
MuPeX Runtime: 0:00:15.476422
----------------------------------------------------------------------------------------------------------
MuPeI
----------------------------------------------------------------------------------------------------------
Reading through MuPex file:
Found 390 peptides of which 105 were unique
Detecting HLA alleles:
Detected the following 6 HLA alleles: HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03 of which 6 were unique
Running NetMHCpan 2.8:
Analyzed 6 HLA allele(s)
NetMHCpan runtime: 0:00:05.053376
MuPeI Runtime: 0:00:09.374437
TOTAL Runtime: 0:00:31.901699
.fasta file
>DTU|ENST00000436792_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000422105_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000452666_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000380779_10567210|M_S9F
AFDANTMTFAEKVLCQF
>DTU|ENST00000424463_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000510114_69495729|M_E9K
MNIKTILDKLVQRGHEV
>DTU|ENST00000610939_10567210|M_S9F
AFDANTMTFAEKVLCQF
>DTU|ENST00000446204_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000512583_69495729|M_E9K
MNIKTILDKLVQRGHEV
>DTU|ENST00000305107_69495729|M_E9K
MNIKTILDKLVQRGHEV
>DTU|ENST00000317552_10567210|M_S9F
AFDANTMTFAEKVLCQF
>DTU|ENST00000545540_113834326-113834329|F_FQ9:51X
FSAVIQSLNNCLNFEDLFSVIVCHKMYLKIVEVIQKREISCLCKSFSICLL
>DTU|ENST00000426319_184838714|M_D9Y
TFSEFNPYYKVDAGLPT
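The headers in the .fasta example carry three `|`-separated fields that can be split apart programmatically. The sketch below is an interpretation of the example only: it assumes the second field is a transcript ID and a genomic position joined by `_`, that the third field is a consequence letter and an amino acid change joined by `_`, and that each sequence occupies a single line. The function and key names are hypothetical.

```python
def parse_mupexi_fasta(text):
    """Parse FASTA records whose headers look like
    '>DTU|ENST00000436792_184838714|M_D9Y' (layout inferred from the example).
    """
    records = []
    header = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            source, transcript_pos, change = line[1:].split("|")
            transcript_id, genomic_position = transcript_pos.split("_", 1)
            consequence, amino_acid_change = change.split("_", 1)
            header = {
                "source": source,
                "transcript_id": transcript_id,
                "genomic_position": genomic_position,
                "consequence": consequence,
                "amino_acid_change": amino_acid_change,
            }
        elif header is not None:
            # assumption: one sequence line per record
            records.append({**header, "sequence": line})
            header = None
    return records


example = """>DTU|ENST00000436792_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000545540_113834326-113834329|F_FQ9:51X
FSAVIQSLNNCLNFEDLFSVIVCHKMYLKIVEVIQKREISCLCKSFSICLL
"""
records = parse_mupexi_fasta(example)
```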
References
Article abstract
MuPeXI: Prediction of neo-epitopes from tumor sequencing data
Anne-Mette Bjerregaard, Morten Nielsen, Sine R. Hadrup, Zoltan Szallasi, Aron Charles Eklund
Personalization of immunotherapies such as cancer vaccines and adoptive T cell therapy depends on identification of patient-specific neo-epitopes that can be specifically targeted. MuPeXI, the Mutant Peptide Extractor and Informer, is a program to identify tumor-specific peptides and assess their potential to be neo-epitopes. The program input is a file with somatic mutation calls, a list of HLA types, and optionally a gene expression profile. The output is a table with all tumor-specific peptides derived from nucleotide substitutions, insertions, and deletions, along with comprehensive annotation, including HLA binding and similarity to normal peptides. The peptides are sorted according to a priority score which is intended to roughly predict immunogenicity. We applied MuPeXI to three tumors for which predicted MHC-binding peptides had been screened for T cell reactivity, and found that MuPeXI was able to prioritize immunogenic peptides with an area under the curve of 0.63. Compared to other available tools, MuPeXI provides more information and is easier to use. MuPeXI is available as stand-alone software and as a web server at http://www.cbs.dtu.dk/services/MuPeXI.
Supplementary Tables