DisoRDPbind - predictor of disorder-mediated RNA, DNA and protein binding regions

DisoRDPbind webserver

DisoRDPbind predicts the RNA-, DNA-, and protein-binding residues located in intrinsically disordered regions. DisoRDPbind is implemented using a runtime-efficient multi-layered design that utilizes information extracted from physicochemical properties of amino acids, sequence complexity, putative secondary structure and disorder, and sequence alignment.

Please follow the three steps below to make predictions:

1. Upload a file with protein sequences, or paste them into the text area

The server accepts up to 5000 FASTA-formatted protein sequences. Either upload a file or enter each protein on a new line in the following text field (see Help for details):

2. Provide your e-mail address (required)

Please provide your e-mail address to be notified when results are ready.

3. Predict:

Click the button to launch the prediction.

Materials

    Datasets used to design and evaluate DisoRDPbind:
  • TRAINING dataset - Dataset used to design DisoRDPbind.
  • TEST36 dataset - Dataset with new depositions from the DisProt database.
  • TEST115 dataset - Dataset with sequences that share low similarity to sequences in the TRAINING dataset.
    These datasets are in the following format:
  • Line 1: >protein ID: the protein identifier used in DisProt
  • Line 2: protein sequence (1-letter amino acid encoding)
  • Line 3: per residue annotation of disordered RNA-binding
  • Line 4: per residue annotation of disordered DNA-binding
  • Line 5: per residue annotation of disordered protein-binding

The annotations are encoded as follows: "1" denotes residues annotated with the particular type of binding, "0" denotes residues with annotations that include other types of disordered or ordered residues, and "x" denotes the residues that lack annotations.
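
The five-line record layout and the "1"/"0"/"x" encoding described above can be read with a short script. The following is a minimal sketch, not the authors' code: it assumes that each record occupies exactly five non-empty lines with no line wrapping and that every annotation line has the same length as the sequence. The names AnnotatedProtein, parse_records and binding_positions are hypothetical and were chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AnnotatedProtein:
    """One five-line record (hypothetical container, not part of DisoRDPbind)."""
    protein_id: str
    sequence: str
    rna: str      # per-residue disordered RNA-binding annotation
    dna: str      # per-residue disordered DNA-binding annotation
    protein: str  # per-residue disordered protein-binding annotation


def parse_records(lines: Iterable[str]) -> List[AnnotatedProtein]:
    """Parse records laid out as: >ID, sequence, RNA, DNA, protein annotation.

    Assumption: five non-empty, unwrapped lines per record.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]
    if len(rows) % 5 != 0:
        raise ValueError("expected a multiple of 5 non-empty lines")

    records = []
    for i in range(0, len(rows), 5):
        header, seq, rna, dna, prot = rows[i:i + 5]
        if not header.startswith(">"):
            raise ValueError(f"record {i // 5 + 1}: header must start with '>'")
        for name, ann in (("RNA", rna), ("DNA", dna), ("protein", prot)):
            # Assumption: one annotation character per residue.
            if len(ann) != len(seq):
                raise ValueError(f"{header}: {name} annotation length mismatch")
            if set(ann) - set("01x"):
                raise ValueError(f"{header}: {name} annotation has invalid symbols")
        records.append(AnnotatedProtein(header[1:].strip(), seq, rna, dna, prot))
    return records


def binding_positions(annotation: str) -> List[int]:
    """Return 1-based positions of residues annotated as binding ("1")."""
    return [i + 1 for i, symbol in enumerate(annotation) if symbol == "1"]
```

Calling binding_positions on the rna, dna or protein field of a parsed record returns the residues annotated with that type of binding; residues marked "0" or "x" are skipped.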

    Supplement:
  • The supplementary information can be found in Supplement.pdf

Help

DisoRDPbind accepts either a single protein sequence or multiple protein sequences; the input is limited to 5000 protein sequences at a time. The user should submit the protein sequence(s) in FASTA format.

The format of the input file is as follows (an example input file is available; DisoRDPbind takes approximately 5 minutes to predict 500 proteins with an average length of 300 amino acids):

  • Line 1: >protein ID
  • Line 2: protein sequence (1-letter amino acid encoding)
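
Before submitting, it can help to check an input file locally against the format and the 5000-sequence limit stated above. The sketch below is an illustration under assumptions, not the server's own validation: it accepts sequences wrapped over several lines, and it only checks that sequences consist of letters, because the exact set of residue symbols the server accepts is not stated here. The names read_fasta and check_submission are hypothetical.

```python
from typing import List, Tuple

MAX_SEQUENCES = 5000  # limit stated in the Help text


def read_fasta(text: str) -> List[Tuple[str, str]]:
    """Split FASTA text into (protein ID, sequence) pairs.

    Assumption: a sequence may be wrapped over several lines.
    """
    records = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text: str, max_sequences: int = MAX_SEQUENCES) -> List[Tuple[str, str]]:
    """Return the parsed records, or raise ValueError describing the problem."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no sequences found")
    if len(records) > max_sequences:
        raise ValueError(f"{len(records)} sequences exceed the limit of {max_sequences}")
    for protein_id, sequence in records:
        if not sequence:
            raise ValueError(f"{protein_id}: empty sequence")
        # Assumption: 1-letter encoding means letters only; the server's
        # exact accepted alphabet is not specified in this document.
        if not sequence.isalpha():
            raise ValueError(f"{protein_id}: sequence contains non-letter characters")
    return records
```

A file that passes check_submission has at least one record, no more than 5000 records, and a non-empty, letters-only sequence for each protein ID.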

Acknowledgments

We acknowledge with thanks the following software used as a part of this server:

  • BLAST - Alignment to proteins with annotated functions
  • IUPred - Prediction of Intrinsically Unstructured Proteins
  • PSIPRED - Prediction of secondary structure
  • SEG - Application for the prediction of low complexity regions (LCRs)