mhcLaiaMotifs

Search for MHC-like motifs in peptide collections

Description

This utility helps in the search for MHC-like motifs in collections of peptide sequences obtained from proteomic analyses. The LaiaMotifs program searches for sequence motifs characterized by an anchor position, defined by the presence of one or several specific amino acids, and determines the frequency of each amino acid at each position to the right (C-terminal side) of the anchor.

Amino acids defining the anchor and the motif size are selectable. An additional position in the peptide (relative to the anchor) can be defined as a second anchor in order to search for more restricted sequence patterns.
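
The search described above can be sketched in Python as follows. This is a minimal illustration, not the program's actual code: the function names find_motifs and position_counts are hypothetical, and the sketch assumes that the anchor is position 1 of the motif, that a motif must fit entirely within the peptide, and that every anchor occurrence in a peptide is counted.

```python
from collections import Counter


def find_motifs(peptides, anchor_aas, size, pattern_aas=None, pos=None):
    """Collect every window of `size` residues that starts at an anchor residue.

    Assumption: when `pattern_aas` and `pos` are given, the residue at
    1-based position `pos` (the anchor being position 1) must also be
    one of `pattern_aas`.
    """
    motifs = []
    for peptide in peptides:
        # Only windows that fit completely inside the peptide are considered.
        for start in range(len(peptide) - size + 1):
            window = peptide[start:start + size]
            if window[0] not in anchor_aas:
                continue
            if pattern_aas is not None and window[pos - 1] not in pattern_aas:
                continue
            motifs.append(window)
    return motifs


def position_counts(motifs, size):
    """Count how often each amino acid occurs at each motif position."""
    counts = [Counter() for _ in range(size)]
    for motif in motifs:
        for i, aa in enumerate(motif):
            counts[i][aa] += 1
    return counts
```

For example, find_motifs(["AFKLMNPQRST"], "FYW", 9) returns the single motif FKLMNPQRS, because F is the only anchor residue followed by at least eight more residues.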

Results are displayed graphically and stored in an MS Excel file in the form of frequency tables (absolute and normalized values).

Graphics are also embedded in the Excel document for convenience.

The program was coded in Python (Life is short, thanks Guido) and uses wxPython for the GUI. The GUI was built with wxGlade, and the installer was prepared with py2exe and Inno Setup.

This program was written by Joaquin Abian (LP CSIC/UAB) in collaboration with Laia Muixí (IBB) (Autonomous University of Barcelona, Catalonia, Spain).

Installation

From Installer

  1. Download the Windows installer provided on the LaiaMotifs project download page.
  2. Double-click it and follow the Setup Wizard.
  3. That's it!

From Source

  1. Install Python and the third-party software indicated in Dependencies.
  2. Download mhcLAIAmotifs source from its Mercurial Source Repository.
  3. Download commons source code from its Mercurial Source Repository.
  4. Copy the folder anywhere in your path.

Source Dependencies:

Third-party program versions correspond to those used for the installer available here. Earlier versions have not been tested, although they may also work.

Usage

  1. Use the Browse button to find and load a Sequences Input File (a text file containing peptide sequences, one per line, in single-letter code).
  2. Set Anchor AAs, the amino acids allowed in the anchor position (e.g. FYW for MHC class-I).
  3. Set Size, the length of the peptide motif (e.g. 9 for MHC class-I).
  4. Check Non Nested if you want to select only non-nested sequences.
  5. Check Two Anchors if you want to restrict the search to sequences with specific amino acids at a second position, then set the allowed amino acids (Pattern AAs) and the position (pos).
  6. Run Search. Frequency data are shown in the Text window, and two image windows open with graphical representations of the amino acid frequencies. The full data set is also stored automatically in an MS Excel file (.xls).
  7. Press the Save button located in each image window if you want to keep the images produced.
  8. Press Open XlS to open the MS Excel file and look at the full data set produced, including images.

Three frequency tables can be found in the MS Excel document:

  • Absolute counts for each amino acid in each position.
  • Normalized (percentage) frequencies.
  • Corrected frequencies that take into account the expected frequency for each amino acid in a given proteome.

To calculate the corrected frequencies, frequencies for vertebrates are used by default. If you want to change them, see How To Set Specific AA Frequencies below.
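
The relation between the three tables can be illustrated with a short Python sketch. The exact correction formula is not stated here, so the sketch assumes one common choice: dividing the observed fraction by the expected fraction, so that values above 1 suggest enrichment. The function names normalize and correct are hypothetical, and the expected values in the usage note are illustrative only.

```python
def normalize(counts):
    """Convert absolute counts {amino acid: n} for one position into percentages."""
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in counts}
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def correct(percentages, expected):
    """Divide each observed fraction by its expected fraction.

    Assumption: this observed/expected ratio stands in for the program's
    correction, which is not specified in this document.
    """
    return {aa: (pct / 100.0) / expected[aa] for aa, pct in percentages.items()}
```

With counts {"A": 3, "C": 1} at one position, normalize gives 75% for A and 25% for C. With illustrative expected frequencies of 0.074 for A and 0.110 for C, correct gives ratios of roughly 10.1 and 2.3.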

The Sequences Input File

LaiaMotifs analyses peptide sequence sets provided in an input text file, one sequence per line, in single-letter code. Comments can be inserted by starting a line with the tag #.

A test sequence file mhc.txt is provided in the test folder.
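
A file in this format could be read with a few lines of Python. This is a sketch rather than the program's own reader: the function name read_sequences is hypothetical, and skipping blank lines and trimming surrounding whitespace are assumptions.

```python
def read_sequences(lines):
    """Return the peptide sequences from an iterable of text lines.

    Lines starting with '#' are treated as comments. Blank lines are
    skipped and surrounding whitespace is removed (both are assumptions).
    """
    sequences = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sequences.append(line)
    return sequences
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly, for example inside a with open(...) block pointing at the test file.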

How To Set Specific AA Frequencies

LaiaMotifs corrects the observed amino acid frequencies using the frequencies naturally observed in vertebrates. This default frequency data is taken from: http://www.tiem.utk.edu/bioed/webmodules/aminoacid.htm

These frequencies can be modified by editing the frecuencias.txt file:

  • To use other frequencies, create or edit the frecuencias.txt file. Each line of this text file must have the form aminoacid:frequency:
 C:0.110
 A:0.074
  • Use the tag name on the first line of the file to indicate the data source:
 name:frog
  • Start a line with the tag # to add a comment or to hide the amino acid data on that line:
 #This is a comment...
  • If an amino acid is not listed in the file or is hidden (commented out), its default frequency in vertebrates will be used.
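
The rules above can be summarized in a small Python parser. It is only a sketch of the file format as described here, not the program's own code: the function name parse_frequencies is hypothetical, the default values passed in are the caller's responsibility, and, unlike the rule above, the sketch accepts the name tag on any line.

```python
def parse_frequencies(lines, defaults):
    """Parse frecuencias.txt-style lines into (source name, frequency table).

    `defaults` maps each amino acid to its default frequency; entries that
    are missing from the file or commented out keep their default value.
    """
    name = None
    freqs = dict(defaults)  # copy, so the caller's defaults stay untouched
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # comment, hidden entry, or blank line
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "name":
            name = value
        else:
            freqs[key] = float(value)
    return name, freqs
```

Given the lines name:frog, C:0.110, A:0.074 and #D:0.5, the parser returns the source name frog, the new values for C and A, and whatever default was supplied for D.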

Download

You can download the latest version of the LaiaMotifs program, developed at LP CSIC/UAB, by clicking the Download link.

After downloading, e-mail us at logo3 to get your free password and unlock the installation program.