DTU Health Tech

Department of Health Technology

HLAAssoc - 1.0

Predicting allele association of HLA-DRB1 and HLA-DRB3/4/5 in different populations

The program is based on a large dataset of haplotype frequencies obtained by DNA-typing 2.9 million individuals divided into 21 detailed ethnicity categories and 6.59 million individuals divided into 5 broad ethnicity categories. The data come from the Be The Match Registry, the voluntary bone marrow donor program run by the National Marrow Donor Program (NMDP) [23806270].

Setting a threshold or top restricts the output either to associations above a given probability or to the associations with the highest probabilities. When top is used, only associated alleles with a probability of 5% or more are included in the output. The server takes either donor or population data as input. Donor data is a file containing donor sample IDs together with the DRB1 alleles found in each donor, while population data is a file of DRB1 alleles together with the frequency of each allele in the population.
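The threshold/top rule can be illustrated with a minimal Python sketch. This is not the server's code: the function name `select_associations`, the inclusive comparison, the truncation of non-integer top values, and the example probabilities are all assumptions made for illustration.

```python
def select_associations(probs, cutoff, min_top_prob=0.05):
    """Return (allele, probability) pairs, highest probability first.

    probs  : dict mapping a DRB3/4/5 allele name to its probability
    cutoff : the "threshold or top" value entered by the user
    """
    if cutoff < 0:
        raise ValueError("threshold or top must not be negative")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if cutoff < 1:
        # Threshold mode; a cutoff of 0 keeps every association.
        # Assumption: the comparison is inclusive (>=).
        return [(a, p) for a, p in ranked if p >= cutoff]
    # Top mode: keep the N most probable alleles, dropping any below 5%.
    # Assumption: a non-integer value such as 2.5 is truncated to 2.
    return [(a, p) for a, p in ranked[:int(cutoff)] if p >= min_top_prob]


# Hypothetical probabilities for the alleles associated with one DRB1 allele.
example = {"DRB3_0101": 0.62, "DRB3_0202": 0.30, "DRBX_NNNN": 0.05, "DRB3_0301": 0.03}
print(select_associations(example, 0.1))  # threshold mode
print(select_associations(example, 4))    # top mode
```

With these made-up numbers, a top of 4 still returns only three alleles, because the fourth falls below the 5% floor.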




Population type:

Threshold or top:

Exclude frequencies:

Include null alleles:


The webserver HLAAssoc-1.0 takes the following inputs:
  • 1) Load an input file of HLA-DRB1 alleles from either individual donors or a population.
    The file must be plain text (txt) with columns separated by spaces or tabs.
    Alleles can be written in two formats: DRB1_0101 or DRB1*01:01. In either case, the output uses the first format.
    For donor data, the first column should be the donor ID and the following columns the DRB1 alleles. If the file has more than three columns, only the first three are used, since each individual can carry at most two HLA-DRB1 alleles.
    For population data, the first column should be the DRB1 alleles and the second the frequency of each allele in the population. The population file should have no more than two columns.

    Example of donor input
    Download example file
    Example of population input
    Download example file

  • 2) Select population type
    Choose the population that matches the input data. The predicted allele associations are based on the chosen population.

  • 3) Input threshold or top
    Must be zero or a positive number.
    Numbers below 1 are interpreted as a threshold on the probability of seeing a specific DRB3/4/5 allele for that donor or population, depending on the type of input file.
    If 0, the server shows all associations for the input alleles.
    Numbers of 1 and above define the number of associated DRB3/4/5 alleles to display, prioritizing the alleles with the highest probability. This is also known as the top. Associated alleles with a probability below 5% are not shown.
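The input rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the server's parser: the helper names (`normalize_allele`, `parse_donor_line`, `parse_population_line`) are hypothetical, and the regular expression covers only the two allele formats shown above (two 2-digit fields).

```python
import re

# Accepts DRB1_0101 or DRB1*01:01 (assumption: two 2-digit fields only).
_ALLELE_RE = re.compile(r"^(DRB\d)[_*](\d{2}):?(\d{2})$")


def normalize_allele(name):
    """Convert either accepted spelling to the DRB1_0101 style."""
    match = _ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized allele name: {name!r}")
    gene, field1, field2 = match.groups()
    return f"{gene}_{field1}{field2}"


def parse_donor_line(line):
    """Donor ID followed by one or two DRB1 alleles; extra columns are ignored."""
    fields = line.split()  # splits on spaces or tabs
    if len(fields) < 2:
        raise ValueError("expected a donor ID and at least one allele")
    return fields[0], [normalize_allele(a) for a in fields[1:3]]


def parse_population_line(line):
    """DRB1 allele followed by its frequency in the population."""
    fields = line.split()
    if len(fields) != 2:
        raise ValueError("expected exactly two columns: allele and frequency")
    return normalize_allele(fields[0]), float(fields[1])


# Hypothetical input lines.
print(parse_donor_line("donor1\tDRB1*15:01\tDRB1_0301"))
print(parse_population_line("DRB1*07:01 0.12"))
```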

The output format depends on the input data.

For input data of donor type, see figure:

  1) Information about the input type, the top/threshold, and the population type used for the run.
  2) Link to download the output table as a txt file.
  3) First column of the table, containing the sample names of the donors as specified in the input file.
  4) Second and third columns of the table, containing HLA-DRB1 allele 1 and HLA-DRB1 allele 2, respectively, as specified in the input file.
  5 & 7) These columns contain a semicolon-separated list of the HLA-DRB3/4/5 alleles inferred for allele 1 (5) and allele 2 (7). NA is printed when no alleles can be inferred for the allele with the specified options. The list is ordered by descending frequency.
  6 & 8) These columns contain a semicolon-separated list of the frequencies of the inferred HLA-DRB3/4/5 alleles: (6) lists the frequencies matching the alleles in (5), and (8) lists the frequencies matching the alleles in (7). Frequencies are rounded to 3 decimal places. NA is printed when there are no alleles to infer.
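A row of the donor output table could be read back with a short Python sketch like the one below. It assumes that the downloaded txt file is tab-separated and that its columns appear in the order of items 3 to 8; neither is confirmed here. The function names and the example row are hypothetical.

```python
def parse_list(cell, convert=str):
    """Split a semicolon-separated cell; "NA" becomes an empty list."""
    cell = cell.strip()
    if cell == "NA":
        return []
    return [convert(item.strip()) for item in cell.split(";") if item.strip()]


def parse_donor_row(row):
    """Pair each DRB1 allele with its inferred DRB3/4/5 alleles and frequencies."""
    # Assumed column order: sample, DRB1 allele 1, DRB1 allele 2,
    # alleles for 1, frequencies for 1, alleles for 2, frequencies for 2.
    sample, drb1_a, drb1_b, alleles_a, freqs_a, alleles_b, freqs_b = (
        row.rstrip("\n").split("\t")
    )
    return {
        "sample": sample,
        # A list of pairs rather than a dict, so homozygous donors keep both entries.
        "associations": [
            (drb1_a, list(zip(parse_list(alleles_a), parse_list(freqs_a, float)))),
            (drb1_b, list(zip(parse_list(alleles_b), parse_list(freqs_b, float)))),
        ],
    }


# Hypothetical output row; the frequencies are made up.
row = "donor1\tDRB1_1501\tDRB1_0101\tDRB5_0101;DRBX_NNNN\t0.950;0.050\tNA\tNA"
print(parse_donor_row(row))
```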
For input data of population type, see figure 2:
  1) Information about the input type, the top/threshold, and the population type used for the run.
  2) Series of warnings listing any input HLA-DRB1 alleles that cannot be found in the database used by the server. These alleles are not considered.
  3) Link to download the output table as a txt file.
  4) Column containing the HLA-DRB3/4/5 alleles inferred for the input HLA-DRB1 alleles, ranked by descending frequency.
  5) Column containing the frequencies of the HLA-DRB3/4/5 alleles inferred for the input HLA-DRB1 alleles. Frequencies are rounded to 3 decimal places.

Alleles written as "DRBX_NNNN" are empty alleles and can be interpreted as the donor or population having HLA-DRB1 and no HLA-DRB3/4/5.


Integration of HLA-DR linkage disequilibrium to MHC class II predictions

Pedersen MB, Asmussen SR, Sarfelt FM, Saksager AB, Sackett PW, Nielsen M, Barra C 

Insight into peptide binding to HLA class II molecules is essential when studying the biological mechanisms behind cellular immunity, autoimmune diseases, and the development of immunotherapies and peptide vaccines. Currently, most of the publicly available data used to train state-of-the-art binding prediction methods for HLA-DR only includes DRB1 information. The role of the paralogue alleles, HLA-DRB3/4/5, and their strong linkage disequilibrium to DRB1 is often omitted when typing HLA-II alleles. This leads to ambiguities when making disease associations and interpreting HLA-restricted immune data. To resolve this issue, we present HLAAssoc-1.0, a method to infer HLA-DRB3/4/5 alleles by linkage disequilibrium to HLA-DRB1. We illustrate the usage of the tool and the importance of integrating HLA-DRB3/4/5 alleles in the data analysis in different case studies, including the interpretation of immunopeptidomics data. Additionally, we infer allele information for the data used for training NetMHCIIpan that lack HLA-DRB3/4/5 allele information and demonstrate that the retrained method achieved improved performance. In all cases, inferring HLA-DRB3/4/5 allele presence in non-fully typed HLA-II assays resulted in improved allele and motif deconvolutions.
HLAAssoc-1.0 is available at https://services.healthtech.dtu.dk/services/HLAAssoc-1.0/


Code available for download at: https://github.com/BarraLab/HLA-Assoc


If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.
