Long noncoding RNAs (lncRNAs) are noncoding transcripts longer than 200 nucleotides. Although poorly conserved, lncRNAs are expressed across diverse species, including plants and animals, and are known to regulate various biological processes. Understanding their biological significance first requires identifying them accurately. However, distinguishing lncRNAs from coding transcripts remains a challenging task.
Researchers at Jawaharlal Nehru University have developed a machine learning-based approach to accurately identify plant lncRNAs. They describe the use of PLncPRO (plant long noncoding RNA prediction by random forests), which employs a random forest algorithm to recognize lncRNAs in a given set of transcript sequences. Stepwise instructions are provided for using PLncPRO to annotate lncRNA sequences.
Availability – http://ccbb.jnu.ac.in/plncpro/
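
To make the classification problem concrete, here is a minimal Python sketch of the kind of per-transcript features a classifier such as a random forest could be trained on. It is not PLncPRO's code, and the summary above does not say which features PLncPRO uses; the two features shown (the 200-nucleotide length cutoff from the lncRNA definition and the fraction of the transcript covered by its longest forward-strand open reading frame) are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Illustrative only: these features are assumptions, not PLncPRO's feature set.
MIN_LNCRNA_LENGTH = 200  # lncRNAs are defined as longer than 200 nucleotides
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq):
    """Length in nucleotides of the longest ATG-to-stop ORF on the forward strand."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
    return best


def candidate_features(seq):
    """Return simple features that could feed a coding/noncoding classifier."""
    length = len(seq)
    orf = longest_orf_length(seq)
    return {
        "length": length,
        "passes_length_filter": length > MIN_LNCRNA_LENGTH,
        "orf_coverage": orf / length if length else 0.0,
    }
```

Transcripts with a long ORF relative to their length are more likely to be protein-coding, so a low ORF coverage combined with a length above the cutoff marks a transcript as an lncRNA candidate. A real tool would combine many more signals and learn the decision boundary from labeled coding and noncoding sequences rather than rely on fixed thresholds.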