As many structures of protein-RNA complexes have been determined in recent years, several computational methods have been developed to predict RNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in RNA) has received much less attention. Furthermore, most methods that predict RNA-binding sites in a protein do not consider the protein's interaction partners (i.e., RNAs), so they always predict the same binding sites for a given protein sequence even if the protein binds to different RNAs. Here we describe a web server called PRIdictor (Protein-RNA Interaction predictor), which predicts protein-binding sites in RNA as well as RNA-binding sites in protein while taking the interaction partners of the protein and RNA into account.

From the Protein Data Bank (PDB), we obtained recent structures of protein-RNA complexes solved by X-ray crystallography at a resolution of 3.0 Å or better. Using three types of interactions between protein and RNA (hydrogen bonds, water bridges and hydrophobic interactions), we extracted binding sites and computed the interaction propensity (IP) between nucleotides and amino acids.
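The exact definition of IP is not given in this summary. As a minimal sketch, assuming a commonly used observed-to-expected ratio (the frequency of an amino acid-nucleotide pair among the observed contacts, divided by the product of the background frequencies of that amino acid and that nucleotide), a propensity could be computed as follows. The function name, the input layout and the toy data are hypothetical and are not taken from PRIdictor.

```python
from collections import Counter

def interaction_propensity(contacts, protein_seqs, rna_seqs):
    """Observed/expected ratio for each (amino acid, nucleotide) pair.

    ASSUMPTION: this ratio is an illustrative definition, not necessarily
    the one used by PRIdictor.
    contacts     -- list of (amino_acid, nucleotide) tuples, one per interaction
    protein_seqs -- protein sequences used for amino acid background counts
    rna_seqs     -- RNA sequences used for nucleotide background counts
    """
    pair_counts = Counter(contacts)
    total_pairs = sum(pair_counts.values())

    aa_counts = Counter("".join(protein_seqs))
    nt_counts = Counter("".join(rna_seqs))
    total_aa = sum(aa_counts.values())
    total_nt = sum(nt_counts.values())

    propensity = {}
    for (aa, nt), n in pair_counts.items():
        observed = n / total_pairs
        expected = (aa_counts[aa] / total_aa) * (nt_counts[nt] / total_nt)
        if expected > 0:  # skip pairs whose residue or base is absent from the background
            propensity[(aa, nt)] = observed / expected
    return propensity

# Toy data (hypothetical): four contacts, one protein and one RNA sequence.
toy_contacts = [("R", "G"), ("R", "G"), ("K", "A"), ("R", "U")]
toy_ip = interaction_propensity(toy_contacts, ["RKAA"], ["GAUC"])
print(toy_ip)
```

Under this definition, a value above 1 means the pair occurs among the contacts more often than expected from sequence composition alone.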

When both protein and RNA sequences are entered, PRIdictor predicts binding sites in both sequences under the assumption that they interact with each other. When no information is available on the interaction partner of a protein or RNA, PRIdictor can still predict binding sites in the given protein or RNA sequence. Providing both protein and RNA sequences usually results in better prediction performance than providing either sequence alone.

On an independent dataset of 155 RNA sequences (1,848 protein-binding nucleotides and 4,631 non-binding nucleotides) and 130 protein sequences that were not used in cross-validation, PRIdictor predicted protein-binding nucleotides with a sensitivity of 72.8%, a specificity of 71.7%, an accuracy of 72.0%, a positive predictive value (PPV) of 50.7%, a negative predictive value (NPV) of 86.9% and a Matthews correlation coefficient (MCC) of 0.41. On another independent dataset of 46 protein sequences (923 RNA-binding residues and 7,578 non-binding residues) and 44 RNA sequences that were not used in cross-validation, PRIdictor predicted RNA-binding residues with a sensitivity of 68.1%, a specificity of 69.2%, an accuracy of 69.1%, a PPV of 21.2%, an NPV of 94.7% and an MCC of 0.24.
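The performance measures above follow the standard confusion-matrix definitions. The sketch below is illustrative and is not PRIdictor's evaluation code; the function names are hypothetical. It computes the six measures from counts and shows how PPV follows from sensitivity, specificity and the class sizes, which explains why PPV is much lower for RNA-binding residues: they make up only 923 of 8,501 residues.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard measures from confusion-matrix counts.

    Assumes every row and column total of the confusion matrix is nonzero.
    """
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }

def ppv_from_rates(sensitivity, specificity, n_positive, n_negative):
    """PPV implied by sensitivity, specificity and the class sizes."""
    prevalence = n_positive / (n_positive + n_negative)
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1 - specificity) * (1 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)

# Hypothetical counts, only to exercise binary_metrics.
example = binary_metrics(tp=70, fp=20, tn=80, fn=30)
print(example)

# Class sizes and rates reported above for RNA-binding residues.
protein_ppv = ppv_from_rates(0.681, 0.692, 923, 7578)
print(round(protein_ppv, 3))  # about 0.212, in line with the reported PPV of 21.2%
```

Applying the same calculation to the RNA dataset (sensitivity 72.8%, specificity 71.7%, 1,848 binding and 4,631 non-binding nucleotides) gives a PPV of about 0.507, which also agrees with the reported value.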

PRIdictor is available as a web service as well as a web application. To the best of our knowledge, PRIdictor is the first web server that predicts both protein-binding nucleotides and RNA-binding residues from sequence data alone while taking interaction partners into account.