
Seeing the Trees through the Forest:
Sequence-based Homo- and Heteromeric
Protein-protein Interaction sites prediction
using Random Forest

This is the download page for our work described in the following paper. Please read it before using our tool, and please cite it afterwards :-)

Qingzhen Hou, Paul De Geest, Wim F. Vranken¹, Jaap Heringa and K. Anton Feenstra (2017). Seeing the Trees through the Forest: Sequence-based Homo- and Heteromeric Protein-protein Interaction sites prediction using Random Forest. Bioinformatics, doi: 10.1093/bioinformatics/btx005.
¹ Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB & Structural Biology Brussels, VUB & Structural Biology Research Centre, VIB; Brussels 1050, Belgium.

We have created a web server, "SeRenDIP" (SEquence-based Random forest predictor with lENgth and Dynamics for Interacting Proteins), where you can try out your queries of interest. However, because of the two-tiered BLAST searches (we need profiles for each hit in the first BLAST search), runtimes on our server can be substantial. If you need more performance, you may prefer the standalone version available from the downloads below.

As a note of caution, we recently discovered that some of the R scripts included in the main archive are quite sensitive to the R version used. We know they work with R 2.15.0 (which is what the web server runs), and that they do not work with at least one older version of R. We do not know whether newer R versions work correctly. The critical point is that with the wrong R version, results are still produced, but they are incorrect. We therefore recommend checking the predictions from your 'home installation' for one or a few proteins against our web server.
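
The cross-check against the web server can be automated once both sets of predictions are available as numbers. The following is a minimal sketch in Python and is not part of the distributed scripts. It assumes you have already extracted one score per residue from each source into plain lists; the function name `compare_predictions`, the tolerance and the example values are hypothetical, and the actual output format of the predictor is not described on this page.

```python
import math


def compare_predictions(local_scores, server_scores, tol=1e-6):
    """Return the 0-based residue positions where the two score lists disagree.

    Raises ValueError if the lists differ in length, since that already
    indicates the two runs did not process the same sequence.
    """
    if len(local_scores) != len(server_scores):
        raise ValueError(
            f"length mismatch: {len(local_scores)} local vs "
            f"{len(server_scores)} server scores"
        )
    return [
        i
        for i, (a, b) in enumerate(zip(local_scores, server_scores))
        if not math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
    ]


if __name__ == "__main__":
    # Made-up illustrative values, not real predictor output.
    local = [0.12, 0.80, 0.45, 0.03]
    server = [0.12, 0.80, 0.54, 0.03]
    mismatches = compare_predictions(local, server)
    if mismatches:
        print("Predictions differ at positions:", mismatches)
    else:
        print("Local and server predictions agree.")
```

Any reported mismatch would suggest that the local R version is producing incorrect results and should not be trusted for further predictions.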

Download | Description | MD5
Hou_etal_RF-PPI_predictors_datasets.tar.gz | Archive containing README, test R script, predictors and test datasets. (836 MB) | 694ea6804aebc2d5690b23ffb8c81098
Hou_etal_RF-PPI_hm_476_test_train_5fold.tar.gz | Archive containing training datasets for the homodimer proteins, based on Hou, et al., BMC Bioinformatics 16:325, 2015. (434 MB) | 783d2b4ca7f305f3603beb90bfe24a33
Hou_etal_RF-PPI_dset_119_train.tar.gz | Archive containing training datasets for the heteromeric proteins, based on Murakami and Mizuguchi, Bioinformatics, 26:1841–1848, 2010. (14.2 MB) | 7b708dcf7767492730e5ff9f17957aeb

Available separately:
README file
GNU General Public License v3.0 - GNU Project - Free Software Foundation (FSF).txt (GNU GPL v3.0, Free Software Foundation)

    These scripts and datafiles are provided as-is, and come without 
    any warranty whatsoever. If it works for you, great, let me know!
    If it doesn't work for you, we'd be happy to try and help you fix it. 
    If it destroys your universe, too bad (you may still file a bug report).
    Copyright (c) 2016 Q. Hou, 
                       P. De Geest, 
                       W.F. Vranken, 
                       J. Heringa, 
                       K. Anton Feenstra

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see

(c) IBIVU 2019. If you are experiencing problems with the site, please contact the webmaster.