Peptides and MHC

This page lists a few useful tools and websites for handling peptides, MHC, and related topics.

Epitope Frequency Assessment


Link: http://epi.dagitty.net/ An online tool written by our own Johannes Textor. Find predicted epitopes on multiple proteins and compare frequencies with expected frequencies. Visualize epitope-rich regions on each protein.

netMHCpan


The local version of NetMHCpan can be found at
$ cd /tbb/local/ii/bin/
The most recently installed version at the moment is
$ /tbb/local/ii/bin/netMHCpan-3.0
Typing this command will give you a list of command line options.

Calling netMHCpan from python


The function funWithNetMHCpan runs netMHCpan on a single FASTA file and can be used in combination with e.g. multiprocessing.Pool:
import subprocess ## for subprocess.call

def funWithNetMHCpan(infilename, hlaString, attr=""):
    ## infilename must have the extension ".fasta"
    ## hlaString is a comma-separated list of MHC alleles. Remove the *s from the allele names.
    netMHCpanCmd = "/tbb/local/ii/bin/netMHCpan-2.8" ## depends on the version that you want to use
    basename = infilename[:-6] ## strip the ".fasta" extension
    outfilename = basename + attr + ".xls"
    errfilename = basename + attr + ".err"
    logfilename = basename + attr + ".log"
    cmdlist = [netMHCpanCmd, ## netMHCpan command
            "-f", infilename,
            "-xls", "-xlsfile", outfilename,
            "-a", hlaString,
            "-l", "9"] ## restrict to 9-mers (add "-p" for lists of peptides)
    print("calling " + cmdlist[0] + " with input " + cmdlist[2])
    ## write stdout to the log file and stderr to the error file;
    ## both files are closed automatically when the with-block ends
    with open(logfilename, "w") as logfile, open(errfilename, "w") as errfile:
        retcode = subprocess.call(cmdlist, stdout=logfile, stderr=errfile)
    print("complete!")
    return retcode
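
A minimal sketch of how such a function could be run on several FASTA files in parallel. It uses ThreadPool from multiprocessing.pool rather than multiprocessing.Pool itself: each call only waits for an external netMHCpan process, so threads should suffice, and multiprocessing.Pool offers the same map interface if processes are preferred. The helper names runJob and runAll, the default of 4 workers, and the file names in the usage note are hypothetical.

```python
from multiprocessing.pool import ThreadPool

def runJob(job):
    ## unpack one (worker, infilename, hlaString) job and run it
    worker, infilename, hlaString = job
    return worker(infilename, hlaString)

def runAll(worker, infilenames, hlaString, processes=4):
    ## one job per input file, all with the same allele list
    jobs = [(worker, infilename, hlaString) for infilename in infilenames]
    pool = ThreadPool(processes)
    try:
        ## map keeps the results in the same order as infilenames
        return pool.map(runJob, jobs)
    finally:
        pool.close()
        pool.join()
```

For example, runAll(funWithNetMHCpan, ["protA.fasta", "protB.fasta"], "HLA-A02:01,HLA-B07:02") would return one return code per input file.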



Thresholds for pMHC binding for and from netMHCpan


netMHCpan returns a list of predicted affinities. To distinguish peptides that bind from those that don't, a fixed threshold of 500 nM (weak binders) or 50 nM (strong binders) is often used. However, a fixed threshold results in some MHC molecules having (much) more binders than others. To avoid this, Hanneke van D. created a list of MHC-specific thresholds under the assumption that all MHC molecules should on average present the same number of peptides. This list was produced by predicting the affinity for a large number (10^5) of naturally occurring peptides, and then choosing, for each MHC molecule, the threshold affinity such that 1% of these peptides are binders.

This list can be found here:


Note that the thresholds are not given in nM, but as transformed scores: an affinity $x$ (in nM) is mapped to $x \mapsto 1 - \log(x)/\log(5\cdot 10^4)$.
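
The transformation above and its inverse can be sketched as follows, assuming x is the affinity in nM. The function names are hypothetical. The base of the logarithm does not matter, because only a ratio of logarithms is used.

```python
import math

MAX_AFFINITY_NM = 5e4 ## the constant 5*10^4 from the formula above

def affinityToScore(affinity_nm):
    ## transformed score: 1 - log(x) / log(5*10^4)
    return 1.0 - math.log(affinity_nm) / math.log(MAX_AFFINITY_NM)

def scoreToAffinity(score):
    ## inverse of the transformation: x = (5*10^4)^(1 - score)
    return MAX_AFFINITY_NM ** (1.0 - score)
```

With this transformation, 500 nM corresponds to a score of about 0.426 and 50 nM to about 0.638. Higher scores mean stronger binding, so a peptide counts as a binder when its score is at least the threshold.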

The choice of 1% is reasonable (with a stricter threshold, some HLA molecules are at risk of not binding any peptides from small viruses), but it may also lead to a large number of predicted epitopes for larger viruses. Therefore Chris van D. (me) computed affinity thresholds for several additional percentages.

That list can be found here:


Note that the header of this list gives fractions rather than percentages (e.g. 0.01 for 1%).
As a sanity check, the following figure compares the 1% thresholds from the two lists:
check-1p-thresholds.png

The relation between percentage and threshold is not trivial. You can't simply rescale all thresholds by a common factor when you want to go from, say, 1% to 0.1%. This is shown in the following figure:
percentage-threshold-dependence-HLA-ABC.png

Every line in this figure corresponds to the percentage-threshold relation of one particular HLA molecule. Many lines intersect, so the ordering of HLA molecules by threshold depends on the chosen percentage.
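
The idea behind the MHC-specific thresholds described above can be sketched as follows. This is only an illustration under stated assumptions, not necessarily the procedure used to produce the two lists: it works on affinities in nM (lower means stronger binding), the function names are hypothetical, and the fractions 0.001, 0.01 and 0.05 are example values.

```python
def thresholdForFraction(affinities_nm, fraction=0.01):
    ## sort from strongest (lowest nM) to weakest predicted binding
    ranked = sorted(affinities_nm)
    ## number of peptides that should count as binders (at least one)
    numBinders = max(1, int(round(fraction * len(ranked))))
    ## the weakest affinity that still counts as a binder
    return ranked[numBinders - 1]

def thresholdsPerAllele(affinitiesByAllele, fractions=(0.001, 0.01, 0.05)):
    ## map each allele to a dict {fraction: threshold in nM}
    return {allele: {f: thresholdForFraction(affs, f) for f in fractions}
            for allele, affs in affinitiesByAllele.items()}
```

A peptide then counts as a binder for an allele when its predicted affinity is at most that allele's threshold. If several peptides have exactly the threshold affinity, slightly more than the requested fraction will be classified as binders.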

netMHCstabpan

A local version of netMHCstabpan can be found at
$ /tbb/local/ii/bin/netMHCstabpan-1.0