DTU Health Tech

Department of Health Technology


MuPeXI - 1.2

Prediction of neo-epitopes from tumor sequencing data

Mutant peptide extractor and informer (MuPeXI)

Extracts peptides of user-defined lengths around missense variants, indels and frameshifts from a VCF file. The annotation of each mutation is reported together with the mutant and normal peptides in the output file.
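As a rough illustration of what extracting peptides around a single missense variant involves, the sketch below lists every peptide of a given length that overlaps the substituted residue. It is not MuPeXI's own code: the function name `mutant_windows` and its arguments are hypothetical, and the sequence fragment is reconstructed from the UGT2B4 E/K entry in the example output on this page.

```python
def mutant_windows(protein, index, alt, length):
    """Return (peptide, position) pairs for every window covering the mutation.

    protein: normal amino-acid sequence
    index:   0-based index of the substituted residue
    alt:     the substituted amino acid
    length:  peptide length
    position is the 1-based location of the change inside the peptide.
    """
    mutated = protein[:index] + alt + protein[index + 1:]
    first = max(0, index - length + 1)          # leftmost window start
    last = min(index, len(mutated) - length)    # rightmost window start
    return [(mutated[s:s + length], index - s + 1) for s in range(first, last + 1)]


# Normal fragment reconstructed from the example output (E at index 8 becomes K)
normal_fragment = "MNIKTILDELVQRGHEV"
peptides = mutant_windows(normal_fragment, index=8, alt="K", length=9)
for peptide, position in peptides:
    print(peptide, position)
```

With a 17-residue fragment and the mutation in the middle, this yields nine 9-mers, including TILDKLVQR (change at position 5) and KLVQRGHEV (change at position 1), which both appear in the example output.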

Submission


Submit a VCF file (example of a VCF):
Please ensure your VCF file has been filtered to remove non-somatic variants!

Submit an expression file - optional (example of an expression file):
Select whether expression data is quantified at the level of genes or transcripts:



Select HLA locus

Select alleles, or type allele names (e.g. HLA-A01:01) separated by commas with no spaces. Max 6 alleles per submission.

For a list of allowed allele names, see the List of MHC allele names.

Select peptide length (multiple lengths are possible):


Output FASTA file with long peptides - mutation in the middle 

Submit HG19-aligned VCF file (liftover to GRCh38 will be performed) 


Restrictions:
Max 6 MHC alleles per submission.

Confidentiality:
The files are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:

  • MuPeXI: Prediction of neo-epitopes from tumor sequencing data
    Anne-Mette Bjerregaard, Morten Nielsen, Sine R. Hadrup, Zoltan Szallasi, and Aron C. Eklund
    Cancer Immunol Immunother. 2017 Apr 20. doi: 10.1007/s00262-017-2001-3.
    PubMed ID: 28429069
    Full text

MuPeXI relies on binding predictions from NetMHCpan; for publication of results please cite NetMHCpan in addition to MuPeXI.

  • NetMHCpan - MHC class I binding prediction beyond humans
    Ilka Hoof, Bjoern Peters, John Sidney, Lasse Eggers Pedersen, Ole Lund, Soren Buus, and Morten Nielsen
    PMID: 19002680
    Full text
  • NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence.
    Nielsen M, Lundegaard C, Blicher T, Lamberth K, Harndahl M, Justesen S, Roeder G, Peters B, Sette A, Lund O, Buus S.
    PMID: 17726526
    Full text

PORTABLE VERSION

MuPeXI is available on GitHub.

Instructions


1. Upload files

    VCF file:
    Upload a VCF file containing tumor-specific somatic variant calls, preferably detected by the variant caller MuTect2. MuTect2 calls both point mutations and indels.
    If MuTect or MuTect2 has not been used for variant calling, genomic mutant allele frequencies will not be taken into account.
    The NGS analysis should preferably have been run against the GRCh38 reference.
    If not, you can upload a VCF from an HG19 alignment and check the HG19 box; MuPeXI will then lift the VCF file over to GRCh38.

    Expression file:
    Upload a tab-delimited file with expression values for each Ensembl transcript ID.
    Format: Ensembl Transcript ID <tab> Expression value (TPM) <tab> variance on call.
    This file can be obtained through RNAseq analysis with, for example, Kallisto.
    Transcript IDs must correspond to the GRCh38 v78 coordinates.
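A minimal sketch of reading a file in the three-column format described above, so it can be checked before upload. The function name `read_expression` is hypothetical, the two IDs and expression values are taken from the example output on this page, and the third-column values are made up for illustration.

```python
import csv
import io


def read_expression(handle):
    """Map each ID in column 1 to the float expression value in column 2."""
    expression = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        try:
            expression[row[0]] = float(row[1])
        except ValueError:
            continue  # skip a header line or a non-numeric value
    return expression


# Third column (variance) holds placeholder values, not real data
example = io.StringIO(
    "ENST00000354624\t61.0165798\t0.0\n"
    "ENST00000003302\t2.31471308\t0.0\n"
)
levels = read_expression(example)
print(levels["ENST00000003302"])
```

For a real file, pass an open file handle instead of the `io.StringIO` object.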

2. Select or enter the HLA alleles

    All HLA alleles present in the tumor should be provided.
    Example: HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03
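The allele string can be checked before submission along these lines. This is only a sketch: `parse_alleles` is a hypothetical helper, and the regular expression is an assumption inferred from the allele names shown on this page, not the server's actual validation (the list of allowed allele names remains the authority).

```python
import re

MAX_ALLELES = 6
# Assumed name shape, inferred from examples such as HLA-A02:01
ALLELE_PATTERN = re.compile(r"HLA-[A-Z]\d{2,3}:\d{2,3}")


def parse_alleles(text):
    """Split a comma-separated allele string and check the stated restrictions."""
    if " " in text:
        raise ValueError("allele list must not contain spaces")
    alleles = text.split(",")
    if len(alleles) > MAX_ALLELES:
        raise ValueError(f"at most {MAX_ALLELES} alleles per submission")
    for allele in alleles:
        if not ALLELE_PATTERN.fullmatch(allele):
            raise ValueError(f"unexpected allele name: {allele!r}")
    return alleles


alleles = parse_alleles(
    "HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03"
)
print(len(alleles))
```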

3. Select peptide lengths

    Specify peptide length. Multiple lengths can be chosen.

4. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.



Format of MuPeXI-1.1 output



DESCRIPTION


The prediction output for each peptide pair consists of the following columns:

  • HLA allele Allele name

  • Normal peptide Peptide from reference corresponding to the mutant peptide.

  • Normal MHC affinity Predicted binding affinity of normal peptide in nanoMolar units.

  • Normal MHC % rank %Rank of prediction score for normal peptides.

  • Mutant peptide The extracted mutant peptide.

  • Mutant MHC affinity Predicted binding affinity of mutant peptide in nanoMolar units.

  • Mutant MHC % rank %Rank of prediction score for mutant peptides.

  • Gene ID Ensembl gene ID

  • Transcript ID Ensembl transcript ID

  • Amino acid change Amino acid change annotated in VEP file.

  • Allele Frequency Genomic mutant allele frequency detected by MuTect2.

  • Mismatches Number of mismatches between normal and mutant peptide.

  • Peptide position Position of amino acid change in the peptide. Can be a range in the case of insertions and frameshifts.

  • Chr Chromosome annotated in the VEP file.

  • Genomic position Genome nucleotide position annotated in the VEP file.

  • Protein position Amino acid position annotated in the VEP file.

  • Mutation consequence The consequence annotated in the VEP file, translated into single-letter abbreviations:

      M: Missense variant

      I: Inframe insertion

      D: Inframe deletion

      F: Frameshift variant

  • Gene symbol HUGO symbol corresponding to the Ensembl gene ID.

  • Cancer driver gene Yes if the HUGO symbol is in the COSMIC reference list, No if it is not.

  • Expression Level Expression of the transcript which the mutant peptide was extracted from.

  • Mutant affinity score Calculated binding affinity score of the mutant peptide, based on a negative logistic function of the mutant MHC %Rank score. This is used to calculate the final prioritization score.

  • Normal affinity score Calculated binding affinity score of the normal peptide, based on a positive logistic function of the normal MHC %Rank score. This is used to calculate the final prioritization score.

  • Expression score Calculated expression score of the transcript expression level. This is used to calculate the final prioritization score.

  • Priority score Calculated prioritization score, dependent on the HLA binding affinity of the mutant and normal peptides, gene expression, and allele frequency.
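To make the "negative logistic function of the mutant MHC %Rank score" concrete, here is a small sketch of a decreasing logistic curve. The steepness and midpoint are not stated on this page; the defaults used here (5 and 2) are assumptions, chosen only because they reproduce the Mut_MHCrank and Mutant_affinity_score pairs in the example output. The function name is hypothetical.

```python
import math


def mutant_affinity_score(rank, steepness=5.0, midpoint=2.0):
    """Decreasing logistic: close to 1 for low %Rank, close to 0 for high %Rank.

    steepness and midpoint are assumed values, not documented parameters.
    """
    return 1.0 / (1.0 + math.exp(steepness * (rank - midpoint)))


for rank in (0.7, 1.4, 1.9):
    print(rank, mutant_affinity_score(rank))
```

Under these assumed parameters the score is 0.5 at a %Rank of 2 and drops quickly above it, so peptides with a low (strong) %Rank dominate the prioritization.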


NetMHCpan output

%Rank of the prediction score relative to a set of 200,000 random natural 9mer peptides. For more information see the NetMHCpan4.0 output format.
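The idea of a percentile rank against a background set can be sketched as follows. This is an illustration of the concept only, not NetMHCpan's procedure: it assumes that a higher prediction score means stronger predicted binding, and the background scores are synthetic.

```python
import bisect


def percent_rank(score, background_scores):
    """Percentage of background scores that are strictly higher than score."""
    ordered = sorted(background_scores)
    higher = len(ordered) - bisect.bisect_right(ordered, score)
    return 100.0 * higher / len(ordered)


# Synthetic background: 1000 evenly spaced scores from 0.000 to 0.999
background = [i / 1000 for i in range(1000)]
print(percent_rank(0.9895, background))
```

A low %Rank therefore means that few random peptides score better than the query peptide for that allele.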




EXAMPLE OUTPUT

.mupexi file

# VERSION:	MuPeXI 1.1
# CALL:		/usr/cbs/bio/src/MuPeXI-1.1/mupexi/MuPeXI.py -c /usr/cbs/bio/src/MuPeXI-1.1/config.ini -w -m -v /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.0 -e /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.1 -a HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03 -l 9
# DATE:		Monday 15 of August 2016
# TIME:		15:34:35
# PWD:		/net/athena/usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7
HLA_allele	Norm_peptide	Norm_MHCAffinity	Norm_MHCrank	Mut_peptide	Mut_MHCAffinity	Mut_MHCrank	Gene_ID	Transcript_ID	Amino_Acid_Change	Allele_Frequency	Mismatches	peptide_position	Chr	Genomic_Position	Protein_position	Mutation_Consequence	Gene_Symbol	Cancer_Driver_Gene	Expression_Level	Mutant_affinity_score	Normal_affinity_score	Expression_score	priority_Score
HLA-B07:02	.L.......	13351.1	9.5	SPDLGGSKF	267.2	0.7	ENSG00000156510	ENST00000354624	L/P	0.784	1	2	10	69232785	83	M	HKDC1	No	61.0165798	0.998498817743263	5.175555005801869e-17	1.0	78
HLA-A02:01	.R.R...D.	26906.1	32.0	KMYLKIVEV	8.7	0.1	ENSG00000048028	ENST00000003302	FQ/X	0.650	3	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.9999251537724895	7.175095973164411e-66	0.9841444919059494	64
HLA-C03:03	G......G.	5847.9	6.0	FSVIVCHKM	272.0	0.8	ENSG00000048028	ENST00000003302	FQ/X	0.650	2	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.9975273768433653	2.0611536181902037e-09	0.9841444919059494	64
HLA-C07:02	A.I..E...	5100.7	4.0	MYLKIVEVI	1136.8	0.8	ENSG00000048028	ENST00000003302	FQ/X	0.650	3	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.9975273768433653	4.5397868702434395e-05	0.9841444919059494	64
HLA-A31:01	T...N....	3096.9	6.0	KIVEVIQKR	99.7	0.9	ENSG00000048028	ENST00000003302	FQ/X	0.650	2	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.995929862284104	2.0611536181902037e-09	0.9841444919059494	64
HLA-C03:03	.....G.VY	4781.8	5.0	VIVCHKMYL	643.9	1.4	ENSG00000048028	ENST00000003302	FQ/X	0.650	3	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.9525741268224334	3.059022269256247e-07	0.9841444919059494	61
HLA-A31:01	......DP.	1025.1	3.5	LFSVIVCHK	214.0	1.4	ENSG00000048028	ENST00000003302	FQ/X	0.650	2	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.9525741268224334	0.0005527786369235996	0.9841444919059494	61
HLA-C07:02	QN...M...	9837.7	7.5	CKSFSICLL	2315.8	1.6	ENSG00000048028	ENST00000003302	FQ/X	0.650	3	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.8807970779778823	1.1399918530430558e-12	0.9841444919059494	56
HLA-A31:01	T.F..R...	14.2	0.09	IVCHKMYLK	131.1	1.0	ENSG00000048028	ENST00000003302	FQ/X	0.650	3	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.9933071490757153	0.9999288038061921	0.9841444919059494	56
HLA-A31:01	....E....	506.9	2.5	TILDKLVQR	68.3	0.6	ENSG00000156096	ENST00000305107	E/K	0.563	1	5	4	69495729	45	M	UGT2B4	No	2.44878178	0.9990889488055994	0.07585818002124355	0.9878510119201874	53
HLA-A02:01	E........	9137.4	13.0	KLVQRGHEV	162.5	1.4	ENSG00000156096	ENST00000305107	E/K	0.563	1	1	4	69495729	45	M	UGT2B4	No	2.44878178	0.9525741268224334	1.299581425007503e-24	0.9878510119201874	53
HLA-C07:02	.....G.VY	8919.0	6.5	VIVCHKMYL	2701.8	1.9	ENSG00000048028	ENST00000003302	FQ/X	0.650	3	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.6224593312018547	1.6918979223288784e-10	0.9841444919059494	40
HLA-C07:02	S..Y...L.	3438.9	2.5	NFEDLFSVI	2684.2	1.9	ENSG00000048028	ENST00000003302	FQ/X	0.650	3	1:9	11	113834326-113834329	181	F	USP28	No	2.31471308	0.6224593312018547	0.07585818002124355	0.9841444919059494	39
HLA-B07:02	...R.....	31.9	0.12	KPGCKTNQL	43.2	0.17	ENSG00000169925	ENST00000371834	R/C	0.333	1	4	9	134053378	34	M	BRD3	Yes	1.86032922	0.9998937914787018	0.9999172827771484	0.9611149444887752	32
HLA-C07:02	R........	1301.2	0.9	CKTNQLQYM	2021.9	1.4	ENSG00000169925	ENST00000371834	R/C	0.333	1	1	9	
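A .mupexi file like the one above can be loaded with a few lines of Python. The sketch below assumes only what the example shows: header lines start with `#` and the table is tab-separated with a single header row. The function name `read_mupexi`, the shortened column set in the demo input, and the %Rank cutoff of 0.5 are illustrative choices.

```python
import csv
import io


def read_mupexi(handle):
    """Return (metadata, rows) from a .mupexi file-like object."""
    metadata, data_lines = {}, []
    for line in handle:
        if line.startswith("#"):
            # e.g. "# VERSION:<tab>MuPeXI 1.1" becomes {"VERSION": "MuPeXI 1.1"}
            key, _, value = line[1:].partition(":")
            metadata[key.strip()] = value.strip()
        elif line.strip():
            data_lines.append(line)
    rows = list(csv.DictReader(data_lines, delimiter="\t"))
    return metadata, rows


# Shortened demo input using a subset of the columns shown above
example = io.StringIO(
    "# VERSION:\tMuPeXI 1.1\n"
    "HLA_allele\tMut_peptide\tMut_MHCrank\tpriority_Score\n"
    "HLA-B07:02\tSPDLGGSKF\t0.7\t78\n"
    "HLA-A02:01\tKMYLKIVEV\t0.1\t64\n"
)
metadata, rows = read_mupexi(example)
strong = [r["Mut_peptide"] for r in rows if float(r["Mut_MHCrank"]) <= 0.5]
print(metadata["VERSION"], strong)
```

All values are read as strings, so numeric columns need an explicit `float()` or `int()` conversion before filtering or sorting.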

.log file

        # VERSION:  MuPeXI 1.1
        # CALL:     /usr/cbs/bio/src/MuPeXI-1.1/mupexi/MuPeXI.py -c /usr/cbs/bio/src/MuPeXI-1.1/config.ini -w -m -v /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.0 -e /usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7/file.1 -a HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03 -l 9
        # DATE:     Monday 15 of August 2016
        # TIME:     15:34:35
        # PWD:      /net/athena/usr/opt/www/webface/tmp/server/mupexi/57B1C4C90000785934C25FC7

        ----------------------------------------------------------------------------------------------------------
                                                        MuPeX
        ----------------------------------------------------------------------------------------------------------

          Reading protein reference file:            Found 99436 sequences, with 0 [9]mers
                                                           of which 0 were unique peptides
          Reading VEP file:                          Found 22 irrelevant mutation consequences which were discarded
                                                     Found 29 missense variant mutation(s) 
                                                           0 insertion(s)
                                                           0 deletion(s)
                                                           3 frameshift variant mutation(s)
                                                     In 7 genes and 32 transcripts
          Checking peptides:                         0 peptides matched a normal peptide and were discarded
                                                     0 peptides included unsupported symbols (e.g. *, X, U) and were discarded 

          Final Result:                              390 potential mutant peptides
          MuPeX Runtime:                             0:00:15.476422

        ----------------------------------------------------------------------------------------------------------
                                                        MuPeI
        ----------------------------------------------------------------------------------------------------------

          Reading through MuPex file:                Found 390 peptides of which 105 were unique
          Detecting HLA alleles:                     Detected the following 6 HLA alleles:
                                                        HLA-A31:01,HLA-A02:01,HLA-B07:02,HLA-B55:01,HLA-C07:02,HLA-C03:03
                                                        of which 6 were unique
          Running NetMHCpan 2.8:                     Analyzed 6 HLA allele(s)
                                                     NetMHCpan runtime: 0:00:05.053376

          MuPeI Runtime:                             0:00:09.374437

          TOTAL Runtime:                             0:00:31.901699

.fasta file

>DTU|ENST00000436792_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000422105_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000452666_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000380779_10567210|M_S9F
AFDANTMTFAEKVLCQF
>DTU|ENST00000424463_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000510114_69495729|M_E9K
MNIKTILDKLVQRGHEV
>DTU|ENST00000610939_10567210|M_S9F
AFDANTMTFAEKVLCQF
>DTU|ENST00000446204_184838714|M_D9Y
SAQIVSAAYKVDAGLPT
>DTU|ENST00000512583_69495729|M_E9K
MNIKTILDKLVQRGHEV
>DTU|ENST00000305107_69495729|M_E9K
MNIKTILDKLVQRGHEV
>DTU|ENST00000317552_10567210|M_S9F
AFDANTMTFAEKVLCQF
>DTU|ENST00000545540_113834326-113834329|F_FQ9:51X
FSAVIQSLNNCLNFEDLFSVIVCHKMYLKIVEVIQKREISCLCKSFSICLL
>DTU|ENST00000426319_184838714|M_D9Y
TFSEFNPYYKVDAGLPT
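The FASTA headers above follow a regular pattern that can be split into fields. The sketch below infers the layout from the example records only (`source|transcript_position|consequence_change`); the field names and the function `parse_header` are hypothetical, and the interpretation of each field is an assumption rather than a documented format.

```python
def parse_header(header):
    """Split a header such as '>DTU|ENST00000305107_69495729|M_E9K' into fields."""
    source, transcript_and_position, annotation = header.lstrip(">").split("|")
    transcript, position = transcript_and_position.split("_", 1)
    consequence, change = annotation.split("_", 1)
    return {
        "source": source,
        "transcript": transcript,
        "genomic_position": position,   # kept as text: may be a range
        "consequence": consequence,     # M, I, D or F as listed above
        "change": change,
    }


missense = parse_header(">DTU|ENST00000305107_69495729|M_E9K")
frameshift = parse_header(">DTU|ENST00000545540_113834326-113834329|F_FQ9:51X")
print(missense["transcript"], missense["change"])
print(frameshift["genomic_position"], frameshift["change"])
```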


References


Article abstract

MuPeXI: Prediction of neo-epitopes from tumor sequencing data

Anne-Mette Bjerregaard, Morten Nielsen, Sine R. Hadrup, Zoltan Szallasi, Aron Charles Eklund

Personalization of immunotherapies such as cancer vaccines and adoptive T cell therapy depends on identification of patient-specific neo-epitopes that can be specifically targeted. MuPeXI, the Mutant Peptide Extractor and Informer, is a program to identify tumor-specific peptides and assess their potential to be neo-epitopes. The program input is a file with somatic mutation calls, a list of HLA types, and optionally a gene expression profile. The output is a table with all tumor-specific peptides derived from nucleotide substitutions, insertions, and deletions, along with comprehensive annotation, including HLA binding and similarity to normal peptides. The peptides are sorted according to a priority score which is intended to roughly predict immunogenicity. We applied MuPeXI to three tumors for which predicted MHC-binding peptides had been screened for T cell reactivity, and found that MuPeXI was able to prioritize immunogenic peptides with an area under the curve of 0.63. Compared to other available tools, MuPeXI provides more information and is easier to use. MuPeXI is available as stand-alone software and as a web server at http://www.cbs.dtu.dk/services/MuPeXI.

Supplementary Tables

Supplementary Table 1

Supplementary Table 2

Supplementary Table 3



GETTING HELP

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0). If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.

Correspondence: Technical Support: