Artificial spike-ins could improve accuracy of plant RNA-seq analysis

Researchers at NC State have published a simple trick that improves the accuracy of techniques that help us understand how external variables – such as temperature – affect gene activity in plants.

“There are really two contributions here,” says Colleen Doherty, corresponding author of a paper on the work and an associate professor of molecular and structural biochemistry at North Carolina State University. “First, we’re raising the visibility of a problem that many of us in the plant research community were unfamiliar with, as well as highlighting the solution. Second, we’ve demonstrated that addressing this problem can make a significant difference in our understanding of gene activity in plants.”

At issue is a technique called RNA-seq analysis, which is used to measure changes in gene activity – i.e., when genes are being actively transcribed into RNA, the first step in producing proteins.

“We use RNA-seq analysis to assess how plants respond to various stimuli, or changes in their environment,” Doherty says. “It’s used widely because it’s a relatively easy and inexpensive way to monitor plant responses.”

For example, researchers can use RNA-seq analysis to see which genes are turned on when a plant is experiencing drought conditions, which then informs the development of new plant varieties that are drought resistant.

But there’s a specific challenge related to RNA-seq analysis, which Doherty and her collaborators ran into by accident.

“We were monitoring how plants respond to different temperatures at multiple times of day, and the results we got were wildly divergent,” Doherty says. “We initially thought we might be doing something wrong. But when we began looking into it, we learned that animals and yeasts are known to have global changes in transcription based on variables such as the time of day or nitrogen deprivation.”

In other words, researchers want to see how specific variables – such as increased temperature – affect transcription in specific genes. But there are some variables – like time of day – that can increase or decrease transcription in all the genes. This can throw off researchers’ ability to draw conclusions about the specific variables they want to study.

Challenges in RNA-Seq analysis

(a) Commonly used normalization methods assume that only a small proportion of transcripts are differentially expressed between conditions (small, dashed inner circle in Conditions A and B). It is assumed that most transcripts do not change in expression across experimental conditions (as indicated by the solid outer circle in Conditions A and B), resulting in a stable transcript pool. However, some experiments do affect global transcription, either increasing (dashed red circle) or decreasing (dashed blue circle) the size of the RNA pool. (b) A change in the proportional share of mRNA in the pool could also affect the identification of differentially expressed genes (DEGs). When some genes substantially increase in expression (e.g., Gene group 4), commonly used normalization methods that assume no change in global expression would artificially reduce the apparent expression of other genes (e.g., groups 1–3) whose expression does not change.
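The effect described in panel (b) can be illustrated with a few lines of arithmetic. The sketch below is not the method from the paper: it uses plain total-count scaling as a simplified stand-in for the normalization methods the caption refers to, and the gene groups and transcript counts are made-up numbers chosen only to show the effect.

```python
# Hypothetical transcript counts per gene group (illustrative numbers only).
# Only group4 changes between conditions; groups 1-3 stay the same.
condition_a = {"group1": 100, "group2": 100, "group3": 100, "group4": 100}
condition_b = {"group1": 100, "group2": 100, "group3": 100, "group4": 700}


def proportional_share(counts):
    """Scale counts so each sample sums to 1 (simple total-count scaling)."""
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


share_a = proportional_share(condition_a)
share_b = proportional_share(condition_b)

for gene in condition_a:
    print(f"{gene}: share in A = {share_a[gene]:.2f}, share in B = {share_b[gene]:.2f}")
```

Under these assumed numbers, groups 1–3 fall from a 0.25 share to a 0.10 share even though their transcript counts are identical in both conditions. That apparent drop is the kind of artifact the caption describes.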

“Luckily, we found that this problem is sufficiently well-established among researchers who work on non-plant species that they have developed a method to account for it, called an artificial spike-in,” Doherty says. “These and similar techniques have been used in plant science in other contexts and when using older techniques and technologies. But for whatever reason, our field didn’t incorporate artificial spike-ins into our methodology when we adopted RNA-seq analysis.”

Artificial spike-ins make use of pieces of foreign RNA that are unlike anything in the plant’s genome, meaning that the foreign RNA will not be confused with anything the plant itself produces. Researchers introduce the foreign RNA into the analysis process at the beginning of the experiment. Because global changes in transcription will not affect the foreign RNA, it can be used as a fixed benchmark that allows researchers to determine the extent to which there is an overall increase or decrease in RNA that the plant itself is producing.
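As a rough illustration of how a fixed benchmark helps, the sketch below scales each gene's read count by the read count of the spike-in RNA in the same sample. It is a minimal example rather than the authors' pipeline. It assumes the same amount of spike-in was added to each sample and that both libraries were sequenced to the same depth, and the gene names and read counts are hypothetical.

```python
def spike_in_normalize(gene_reads, spike_reads):
    """Express each gene's reads relative to the spike-in reads in that sample."""
    return {gene: n / spike_reads for gene, n in gene_reads.items()}


# Hypothetical libraries, each with 1,000 reads in total.
# The night sample contains more plant RNA overall, so the fixed amount of
# spike-in RNA takes up a smaller share of its reads.
day_genes, day_spike = {"geneA": 400, "geneB": 400}, 200
night_genes, night_spike = {"geneA": 450, "geneB": 450}, 100

day_norm = spike_in_normalize(day_genes, day_spike)
night_norm = spike_in_normalize(night_genes, night_spike)

# Fold change (night vs. day) judged from raw reads and from spike-in scaling.
raw_fold = night_genes["geneA"] / day_genes["geneA"]
spike_fold = night_norm["geneA"] / day_norm["geneA"]
print(f"raw fold change: {raw_fold:.3f}, spike-in fold change: {spike_fold:.3f}")
```

With these made-up numbers, the raw reads suggest only a small increase at night, while scaling against the spike-in points to a more than twofold increase, because the drop in spike-in reads reveals that the overall pool of plant RNA grew.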

“When we used artificial spike-ins to account for global changes in transcription, we found that the differences in plants exposed to temperature changes at different times of day were actually even greater than we anticipated,” Doherty says.

“The artificial spike-in gave us more accurate information and greater insight into how plants are behaving at night – since we found that global transcription was higher at night. Before we adopted the use of artificial spike-ins, we were missing a lot of what was happening at night.

“Artificial spike-ins are an elegant solution to a challenge many of us in the plant research community didn’t even know was there,” Doherty says. “We’re optimistic this technique will improve the accuracy of transcriptional analysis in the wide variety of conditions that can affect global transcription in plant species. And that, in turn, may help our research community garner new insights into the species we study.

“We didn’t develop this solution – artificial spike-ins – but we really hope it garners more widespread use in plant science.”

Source: NC State University

Laosuntisuk K, Vennapusa A, Somayanda IM, Leman AR, Jagadish SK, Doherty CJ. (2024) A normalization method that controls for total RNA abundance affects the identification of differentially expressed genes, revealing bias toward morning-expressed responses. Plant J [Epub ahead of print].
