================================================================
 RSSVM (RNA Sampler + SVM): A Support Vector Machine based RNA Motif Identifier
   
(version 1.0, Aug 2008)
 

   Copyright (C) 2005-2007

    Xing Xu, Yongmei Ji, Gary D. Stormo
  Department of Genetics, Washington University in Saint Louis
================================================================

 

COPYRIGHT:
----------


The RSSVM software was written by Xing Xu in the Gary Stormo lab, Department of Genetics, Washington University School of Medicine.

RSSVM is distributed freely to the scientific community "as is" in the hope that it will be useful, but without any warranty.  If you are a commercial user or want to commercialize it, please contact the authors.

 

INTRODUCTION:
-------------


RSSVM (RNA Sampler + Support Vector Machine) is a computational program developed by Xing Xu, Yongmei Ji, and Gary Stormo at Washington University in Saint Louis.

RSSVM employs Support Vector Machines (SVMs) to efficiently distinguish functional RNA motifs from random RNA structures. It uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate RNA common structure prediction, and is trained on functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities.

The details of the algorithm are described in the paper "Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines" (PLoS Computational Biology).

The program is written in Perl and C. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast and efficient pipeline for large-scale discovery of regulatory RNA motifs.

To better understand how RSSVM works, please read the paper
"Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines".

 

INSTALLATION:
-------------

 

Programs required for running RSSVM

1. Install RNAz (a modified version that calculates the z-score of a set of sequences).

   Download RNAz_modified.

2. Install LIBSVM.

   Download the latest LIBSVM from the authors' website.

3. Install RNA Sampler.

   Download RNA Sampler from its web page.

4. Install Perl 5.8.8 or above.

Thanks to the authors of RNAz, LIBSVM and RNA Sampler for making their code available.


Install RSSVM:

To compile this package you need a C compiler.

1.  Download the RSSVM-v1.0.tar package and unpack it in a local directory $DIR

    >tar -xf RSSVM-v1.0.tar

2.  Edit the driver script "RSSVM_driver.pl" to set the paths for
    RNAz, LIBSVM and RSSVM by updating the following lines:

    my $RNAZ_PTH   = "your path for RNAz modified version";
    my $LIBSVM_PTH = "your path for LibSVM";
    my $RSSVM_PTH  = "your path for RSSVM";

3.  Test if RSSVM works

    >perl RSSVM_driver.pl

    If the usage message is printed, the installation is correct;
    otherwise, go back and check the installation steps.
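
As a rough pre-flight check before step 3, the sketch below (Python, not part
of RSSVM) reads the three path settings out of the driver script text and
lists any that do not exist on disk. The function names are hypothetical, and
the sketch assumes the settings keep the exact 'my $NAME_PTH = "...";' form
shown in step 2.

```python
import os
import re

# Matches lines such as:  my $RNAZ_PTH   = "/some/path";
PATH_LINE = re.compile(r'^\s*my\s+\$(\w+_PTH)\s*=\s*"([^"]*)"\s*;')


def read_driver_paths(driver_text):
    """Return {variable name: configured path} for every *_PTH setting."""
    paths = {}
    for line in driver_text.splitlines():
        match = PATH_LINE.match(line)
        if match:
            paths[match.group(1)] = match.group(2)
    return paths


def missing_paths(paths):
    """Return the sorted names of settings whose path does not exist."""
    return sorted(name for name, path in paths.items()
                  if not os.path.exists(path))
```

For example, missing_paths(read_driver_paths(open("RSSVM_driver.pl").read()))
returns an empty list when every configured path exists.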

 

USAGE:
------

Usage: RSSVM_driver.pl [-p path] [-q fasta file]
 

Required Parameters:
    [-p <Absolute path of the directory containing the input sequence file
         and the corresponding base-pairing probability matrix files>]
    [-q <name of the input fasta file>]
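
Because -p expects an absolute path, a small wrapper can resolve the path
before calling the driver. The following is a minimal Python sketch, not part
of RSSVM; the function name and the default driver location are assumptions
made for illustration.

```python
import os


def build_rssvm_command(input_dir, input_name, driver="RSSVM_driver.pl"):
    """Build the argument list for one RSSVM run.

    input_dir may be relative; it is converted to an absolute path
    because the -p option requires one.
    """
    abs_dir = os.path.abspath(input_dir)
    return ["perl", driver, "-p", abs_dir, "-q", input_name]
```

The returned list can be passed to subprocess.run(cmd, check=True) from the
directory that contains the driver script.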
 

EXAMPLES:
---------

In the /example directory, there is an example output file of RNA Sampler:
tRNA.RSout.

Go to the directory RSSVM_V1.0/example and see the "*.cmdline" files for
examples of running RSSVM from the command line.

   >../RSSVM_driver.pl -p /home2/xingxu/tools/RSSVM-V1.0/example \
                       -q tRNA.RSout

   Tips:
   The name of each sequence in the FASTA file must not contain spaces.
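
One way to satisfy this tip is to rewrite the header lines before running
RSSVM. The Python sketch below is not part of RSSVM; it assumes that the whole
header line after ">" should become the sequence name, and it replaces each run
of whitespace in that name with a single underscore. The function name is
hypothetical.

```python
import re


def sanitize_fasta_headers(lines):
    """Return FASTA lines with whitespace removed from sequence names.

    Header lines (starting with '>') have each whitespace run replaced by
    one underscore; sequence lines are passed through unchanged.
    """
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            name = line[1:].strip()
            line = ">" + re.sub(r"\s+", "_", name)
        cleaned.append(line)
    return cleaned
```

Applied to the lines of a file, a header such as ">seq 1 tRNA" becomes
">seq_1_tRNA". If the text after the first space is only a description, it may
be preferable to drop it instead of joining it to the name.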
 

 

RESULTS Example:
----------------

 
# RSSVM 1.0 #

Input file: /home2/xingxu/public_html/RSSVM/data/RSSVM_V1.0/example/tRNA.RSout
N   =      6 # Number of Sequences
ID  = 40.830 # Mean pairwise identity
Z   = -2.027 # Mean z score
SCI =  0.864 # Structure conservation index based on common structures predicted by RNA Sampler
I   =  0.770 # Information content of alignments of stem regions
MI  =  0.655 # Mutual information of alignments of stem regions
SVM RNA-class probability: 1.000   # SVM classification probability (P), which can be used as a cutoff
Prediction: RNA                    # P > 0.5 (more stringent cutoffs can be used to reduce
                                   # false positives)
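
When many motifs are screened, it can help to read this report
programmatically and apply a stricter cutoff than 0.5. The Python sketch below
is not part of RSSVM; it assumes the report keeps the layout shown above, and
the function names and the 0.9 default cutoff are arbitrary choices for
illustration.

```python
import re

FIELD = re.compile(r"^(\w+)\s*=\s*(-?\d+(?:\.\d+)?)$")
PROBABILITY = re.compile(r"^SVM RNA-class probability:\s*(\d+(?:\.\d+)?)$")
PREDICTION = re.compile(r"^Prediction:\s*(\S+)$")


def parse_rssvm_report(text):
    """Extract the numeric fields, probability and prediction from a report."""
    report = {}
    for raw in text.splitlines():
        # Drop the trailing '# ...' annotation and surrounding whitespace.
        body = raw.split("#", 1)[0].strip()
        if not body:
            continue
        field = FIELD.match(body)
        if field:
            report[field.group(1)] = float(field.group(2))
            continue
        prob = PROBABILITY.match(body)
        if prob:
            report["probability"] = float(prob.group(1))
            continue
        pred = PREDICTION.match(body)
        if pred:
            report["prediction"] = pred.group(1)
    return report


def passes_cutoff(report, cutoff=0.9):
    """True if the SVM RNA-class probability reaches the chosen cutoff."""
    return report.get("probability", 0.0) >= cutoff
```

Raising the cutoff trades sensitivity for fewer false positives, so the value
should be chosen with the screening goal in mind.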

 

 

 

 

 

 

 

BUGS, QUESTIONS, COMMENTS:
--------------------------


Please email questions, comments and suggestions to:

Xing Xu, xingxu AT genetics.wustl.edu,
Yongmei Ji, yji AT genetics.wustl.edu,
Gary D. Stormo, stormo AT genetics.wustl.edu.

RSSVM is still under active development.  Please check its latest version
at http://ural.wustl.edu/~xingxu/RSSVM/

 

A stand-alone version of RSSVM and its web server are coming soon.