CyloFold Help Page

Overview

CyloFold is a program for predicting the secondary structure of an RNA sequence, including pseudoknots. Usage is straightforward: paste a sequence in FASTA format or as plain text into the text field named "Sequence Data", then click "submit" at the bottom of the page. For sequences longer than 100 nucleotides, it is recommended to save the job ID provided, so that the status of the job can be checked later. This can be done by bookmarking the web page that appears after job submission. Once the computation is finished, the predicted secondary structure is displayed in two formats (bracket notation and CT format).

Examples

A list of available examples of RNA sequences can be found here.

Input format: FASTA

The FASTA format alternates sequence names (each preceded by a > character) with the corresponding sequence data (the server accepts both the "ACGU" and the "ACGT" alphabet).
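
As an illustration of the layout described above, the following minimal Python sketch splits FASTA text into names and sequences. It is not part of CyloFold; the function name parse_fasta is hypothetical, and joining multi-line sequence data and upper-casing it are assumptions about how such input is usually handled.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name = None      # name of the record currently being read
    chunks = []      # sequence lines collected for that record
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            # Assumption: sequence data may span several lines.
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

Lines that appear before the first > header are ignored by this sketch.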

Input format: raw sequence data

Raw sequence data consist only of the characters representing the bases of the RNA strand (the server accepts both the "ACGU" and the "ACGT" alphabet).
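
Since both alphabets are accepted, it can be convenient to bring a sequence into a single form before comparing it with the output. The sketch below is one way to do this in Python; the function name normalize_sequence is hypothetical, and the handling of whitespace, lower-case letters and unexpected characters is an assumption, not documented server behaviour.

```python
def normalize_sequence(raw):
    """Remove whitespace, upper-case, convert T to U, and check the alphabet."""
    seq = "".join(raw.split()).upper().replace("T", "U")
    invalid = sorted(set(seq) - set("ACGU"))
    if invalid:
        # Assumption: anything outside ACGU/ACGT is treated as an error here.
        raise ValueError("unexpected characters: " + ", ".join(invalid))
    return seq
```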

Output format: bracket notation

The bracket notation is a character string whose length equals the number of residues in the sequence. Each character indicates the base pairing status of the corresponding residue (for example, the first character of the bracket notation describes the base pairing of the first residue of the sequence). Base pairs are indicated by opening and closing parentheses. Residues that do not participate in base pairing are denoted by a hyphen. Pairs of base pairing residues that are part of a helix considered to be a pseudoknot interaction are indicated by matching characters, starting with 'A'.
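
The following Python sketch shows one way to read such a string. It is an illustration only, with a hypothetical function name. Parentheses are matched with a stack; because the text above does not specify how the two strands of a pseudoknot helix are told apart, the sketch only groups the positions carrying each letter and does not try to pair them.

```python
def parse_bracket_notation(structure):
    """Return (pairs, unpaired, labelled) for a bracket-notation string.

    pairs    -- sorted list of (i, j) residue numbers joined by ( and )
    unpaired -- residue numbers marked with a hyphen
    labelled -- dict mapping each letter to the residue numbers carrying it
    Residue numbers are 1-based, as in the CT output.
    """
    pairs, unpaired, labelled = [], [], {}
    stack = []  # positions of '(' that have not been closed yet
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch == "-":
            unpaired.append(pos)
        elif ch.isalpha():
            # Pseudoknot helix label; positions are grouped, not paired.
            labelled.setdefault(ch, []).append(pos)
        else:
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs), unpaired, labelled
```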

Output format: CT format

The CT format lists, one residue per row, the nucleotide sequence together with the base pairing status of each residue. The first column contains the residue number and the second column contains the residue symbol (A, C, G or U). The fifth column contains either zero, if the residue is unpaired, or the residue number of the residue it pairs with.
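
A minimal Python sketch for reading these three columns is given below. The function names are hypothetical, and the sketch makes two assumptions: columns are separated by whitespace, and any line that does not look like a residue row (for example a title line) can be skipped. The remaining columns are ignored.

```python
def parse_ct(text):
    """Return a list of (residue_number, symbol, partner) from CT-format text."""
    valid_symbols = {"A", "C", "G", "U"}
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # too short to be a residue row
        number, symbol, partner = fields[0], fields[1].upper(), fields[4]
        if not (number.isdigit() and partner.isdigit() and symbol in valid_symbols):
            continue  # assumption: non-residue lines are skipped silently
        records.append((int(number), symbol, int(partner)))
    return records


def ct_pairs(records):
    """Return the sorted list of base pairs (i, j) with i < j."""
    return sorted({(min(n, p), max(n, p)) for n, _, p in records if p != 0})
```

Each base pair appears twice in the rows (once for each partner), so ct_pairs collects the pairs in a set before sorting them.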

Approach

The CyloFold program was developed to facilitate the prediction of RNA secondary structures without restricting the complexity of the pseudoknot interactions considered. Allowing all pseudoknots can be accomplished by making a list of complementary sequence fragments (corresponding to potential helices) and having an algorithm decide which helices to choose for the predicted structure. Importantly, we decided to incorporate a way to assess the steric feasibility of the partial and final predicted structures that are generated. This is necessary in order to avoid non-physical solutions that might arise when sequence fragments are matched based on their length and sequence composition alone, without regard to their location in the sequence. The steric feasibility is estimated by generating a highly coarse-grained 3D model. For more information, consult the authors of the program.
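
To make the first step more concrete, the Python sketch below lists complementary sequence fragments that could form helices. It is not CyloFold's implementation: the function name, the inclusion of G-U wobble pairs, and the minimum helix and loop lengths are all assumptions chosen for illustration, and the sketch neither selects helices nor checks steric feasibility.

```python
# Assumption: Watson-Crick pairs plus G-U wobble pairs count as complementary.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def candidate_helices(seq, min_length=3, min_loop=3):
    """Return maximal potential helices as (start5, start3, length), 0-based.

    Base pair k of a helix joins positions start5 + k and start3 - k.
    min_loop is the smallest number of residues allowed between the two
    strands of a helix.
    """
    n = len(seq)
    helices = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that only continue a helix starting one step further out.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inwards while the bases pair and the loop stays long enough.
            while (i + length < j - length - min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_length:
                helices.append((i, j, length))
    return helices
```

For the sequence GGGAAAACCC, for example, this returns a single candidate, (0, 9, 3): three G-C pairs enclosing a loop of four residues.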

Acknowledgments

This server was developed in the research group of Dr. Bruce A. Shapiro. This server is hosted by the Advanced Biomedical Computational Science (ABCS) of the National Cancer Institute (Frederick Campus).