Identification of Novel Small RNA Molecules in the Escherichia coli Genome: from in silico to in vivo

Hershberg R, Argaman L, Vogel J, Bejerano G, Wagner EGH, Altuvia S, Margalit H.
Hebrew University of Jerusalem

Small, untranslated RNA molecules exist in all kingdoms of life. These RNAs carry out diverse functions, and many of them are regulators of gene expression. Genes encoding small RNAs (sRNAs) are difficult to detect experimentally or to predict by traditional sequence analysis approaches. Thus, in spite of the importance of these molecules, many of the sRNAs known to date were discovered fortuitously. We developed a computational strategy to search the Escherichia coli genome for genes encoding small RNAs. Our method was based on the transcription signals and genomic features, such as location and conservation, that characterize the 10 known sRNAs in E. coli. The search was limited to regions of the genome in which no gene existed on either strand. These regions were searched for transcriptional signals: promoter sequences recognized by the major sigma factor of E. coli RNA polymerase (σ70), and Rho-independent terminators. Sequences for which the distance between the predicted promoter and terminator was 50-400 bases were compared to genome sequences of other bacteria. Sequences with good conservation were predicted as sRNAs. Of the predicted genes, 23 were tested experimentally, and 17 of these were shown to be expressed in E. coli. The newly discovered sRNAs showed diverse expression patterns, and most of them were abundant.
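
The distance filter in the search described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that promoter and terminator predictions are already available as integer coordinates on the forward strand, that "distance" means the simple coordinate difference, and that an intergenic region is given as an inclusive (start, end) pair. The names Candidate and pair_signals are hypothetical. Promoter and terminator prediction and the conservation comparison against other bacterial genomes are not shown.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

MIN_DISTANCE = 50   # lower bound on promoter-terminator distance (from the text)
MAX_DISTANCE = 400  # upper bound on promoter-terminator distance (from the text)


@dataclass(frozen=True)
class Candidate:
    """A putative sRNA gene (hypothetical record, forward strand only)."""
    promoter: int    # coordinate of the predicted promoter
    terminator: int  # coordinate of the predicted Rho-independent terminator


def pair_signals(
    promoters: Sequence[int],
    terminators: Sequence[int],
    region: Tuple[int, int],
) -> List[Candidate]:
    """Pair predicted promoters with downstream terminators in one region.

    Only signals that fall inside the intergenic region (inclusive bounds)
    are considered, and a pair is kept only if the terminator lies
    50-400 bases downstream of the promoter.
    """
    start, end = region
    candidates = []
    for p in sorted(promoters):
        if not start <= p <= end:
            continue
        for t in sorted(terminators):
            if not start <= t <= end:
                continue
            if MIN_DISTANCE <= t - p <= MAX_DISTANCE:
                candidates.append(Candidate(promoter=p, terminator=t))
    return candidates


if __name__ == "__main__":
    # Toy coordinates, invented purely for illustration.
    hits = pair_signals([100, 900], [120, 300, 700], region=(0, 1000))
    for hit in hits:
        print(hit)
```

In a fuller pipeline, each surviving candidate would then be compared with the genome sequences of other bacteria, and only well-conserved candidates would be retained as predictions.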