Usability of ERCC spike-ins in standard RNAseq experiments
mbeckste ▴ 30
@mbeckste-11696
Last seen 8.1 years ago

For a manuscript revision we were asked to redo the data analysis of a standard RNAseq experiment (two conditions, triplicates, no transcriptional amplification, ~50M reads/library) with a DESeq2 normalization based on ERCC spike-ins. So far we have only used the ERCC spike-in controls to assess the technical performance of the experiment, and we removed the ERCC probe counts before feeding the counts into DESeq2. The resulting logFC values look pretty accurate and also fit quite well with the qPCR results. Following the referee's suggestion, we repeated the analysis and used only the counts for the 92 ERCC probes to estimate size factors in DESeq2. This gives substantially different (and worse) results, which in particular do not fit the qPCR results as well as before. Unfortunately, we only have qPCR data for a handful of genes, so the correlation with these would not be the strongest argument. I have already read that a normalization based on 92 genes/probes is simply not 'broad' enough and tends to give worse results than using the several thousand gene counts in the biological data. I wonder whether there are also other arguments against using technical spike-ins for normalization in such a scenario?

 

 

RNAseq deseq2 ercc • 16k views
Simon Anders ★ 3.8k
@simon-anders-3855
Last seen 4.3 years ago
Zentrum für Molekularbiologie, Universi…

Short answer: You don't use spike-ins in RNA-Seq normalization for the same reason that you use an internal control gene when doing qPCR, rather than adding a spike-in and using that as the control.

 

Long answer:

The idea that spike-ins are more reliable for normalisation is based on a misunderstanding of the purpose of RNA-Seq normalisation. As it is such a common question, I had better answer at some length.

First, note that in a typical experiment, the total amount of RNA extracted from each sample is usually of little interest. If we extract a bit more RNA from one sample than from another, this just means that there might have been a few more cells in it, which may have been caused by the treatment but may also be because we seeded a few more cells initially, or pipetted a bit differently, or whatever. Even if it was the treatment that caused more cells to grow and hence more RNA to be yielded, this is not what we want to measure in an RNA-Seq experiment. There are other assays to measure growth.

Furthermore, the amount of RNA yielded by a sample has little to do with the number of reads obtained from the library.

The ratio of technical spike-ins to biological genes, however, does depend on the sample's total RNA yield, because we always spike in the exact same amount of the spike-in mix, while the biological amount varies from sample to sample.

In a typical experiment comparing treated and control samples, one hopes to find a number of genes which respond to the treatment, while one assumes that a large number of genes, especially the so-called house-keeping genes, stay at the same expression level. If we know that a given gene is at 10x the expression of the house-keeping gene in treated samples and only 4x in control samples, it is differentially expressed. If, however, we know that the gene's transcripts have a total amount of 8 femtomoles in one sample and only 6 fmol in another, this could just as well be because there were more cells in the first sample.

Normalization by comparing to the bulk of other genes removes differences in initial total material or total number of reads, and this is usually what we want.

Normalization with technical spike-ins, however, preserves differences in starting amount, and usually, this is not what we want!
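A toy calculation may make the contrast concrete. The sketch below is plain Python with made-up counts, not DESeq2 code (DESeq2 is an R package; there, restricting the estimate to a subset of rows is what the `controlGenes` argument of `estimateSizeFactors` is for). It assumes two samples with identical relative expression, where sample B simply yielded twice as much biological RNA and both samples received the same amount of spike-in. Size factors are computed with a simple median-of-ratios rule, once from the biological genes and once from the spike-ins only. All names and numbers are hypothetical.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.
    counts: dict mapping feature name -> list of positive counts, one per sample."""
    n = len(next(iter(counts.values())))
    ratios = [[] for _ in range(n)]
    for row in counts.values():
        # geometric mean of this feature across samples
        geo_mean = math.exp(sum(math.log(c) for c in row) / n)
        for j, c in enumerate(row):
            ratios[j].append(c / geo_mean)
    return [median(r) for r in ratios]

def normalize(counts, sf):
    """Divide each sample's counts by that sample's size factor."""
    return {k: [c / s for c, s in zip(row, sf)] for k, row in counts.items()}

# Hypothetical counts for [sample A, sample B]. B has twice the biological
# RNA relative to the (constant) spike-in amount, nothing else differs.
genes = {"g1": [100, 200], "g2": [200, 400], "g3": [400, 800]}
spikes = {"s1": [50, 50], "s2": [100, 100]}

sf_genes = size_factors(genes)    # estimated from the biological genes
sf_spikes = size_factors(spikes)  # estimated from the spike-ins only

norm_by_genes = normalize(genes, sf_genes)
norm_by_spikes = normalize(genes, sf_spikes)

print("gene-based size factors: ", sf_genes)
print("spike-based size factors:", sf_spikes)
print("g1 normalized by genes:  ", norm_by_genes["g1"])
print("g1 normalized by spikes: ", norm_by_spikes["g1"])
```

With gene-based size factors the two samples come out equal for every gene; with spike-in-based size factors every gene keeps the twofold difference in starting material, which then looks like a fold change.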

There are cases where we want to preserve information on the exact starting amount, namely, if we have ensured (e.g. by flow cytometry) that each sample contains exactly the same number of cells, and we are expressly interested not in relative but in absolute changes of transcript material. For example, if the treatment is expected to affect transcription globally, i.e., to reduce the expression of all genes simultaneously, and we want to know how strongly overall mRNA abundance goes down. (However, in this case, RNA-Seq might not be the best assay.)

 

Exercise question: Why did the RNA-Seq data agree well with the qPCR measurements after normalising conventionally but not after normalising with spike-ins?

Answer: Because qPCR curves are also always compared to a biological control gene (maybe actin or GAPDH or the like). When comparing two qPCR samples, we do not compare the Ct values directly, but their respective differences to this housekeeping gene. If OP had used one of the spike-ins rather than one of the housekeeping genes as internal qPCR control, the spike-in-normalized RNA-Seq data would have matched the qPCR results better than the conventionally normalized data.
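The comparison to a housekeeping gene can be written out as the usual delta-delta-Ct arithmetic. This is a minimal sketch in Python, assuming perfect amplification efficiency (the product doubles every cycle); the Ct values are invented for illustration, with the treated sample assumed to be one cycle more dilute than the control.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Fold change (treated vs control) of a target gene relative to a
    reference gene, assuming a doubling of product per cycle."""
    d_treated = ct_target_treated - ct_ref_treated   # delta Ct, treated
    d_control = ct_target_control - ct_ref_control   # delta Ct, control
    return 2.0 ** -(d_treated - d_control)           # 2^(-delta delta Ct)

# Hypothetical Ct values. The treated sample is more dilute, so its
# housekeeping gene shows up one cycle later (19 instead of 18).
relative = fold_change_ddct(ct_target_treated=24, ct_ref_treated=19,
                            ct_target_control=25, ct_ref_control=18)

# Comparing target Ct values directly ignores the dilution.
naive = 2.0 ** (25 - 24)

print("fold change relative to housekeeping gene:", relative)
print("fold change from raw Ct difference:       ", naive)
```

The raw Ct difference mixes the dilution into the result, whereas the difference to the housekeeping gene cancels it out. This is the same role that normalizing against the bulk of genes plays in RNA-Seq.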

There is, of course, a reason why nobody uses spike-ins in that way for qPCR: one would mainly measure the dilution of the sample rather than the expression of the target gene.

 


I very much agree, but I do think there are two other uses for (ERCC) spike-ins:
1) (Extreme) cases where there are large changes in the RNA composition (such as knock-down/knock-out of decay factors etc.)
2) For selecting expression cutoffs: since we know the exact concentration of the spike-ins, it is quite easy to see at which approximate level our expression estimates become unreliable.
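One way the second point could be put into practice is sketched below in Python. It is only an illustration under assumptions: the concentrations (arbitrary units), the replicate counts, the use of the coefficient of variation as the reliability measure, and the 0.25 threshold are all hypothetical choices, and in a real analysis the counts would be normalized first.

```python
from statistics import mean, stdev

# Hypothetical data: known spike-in concentration (arbitrary units)
# -> counts observed in three replicate libraries.
spikes = {
    0.5: [0, 3, 1],
    2.0: [4, 9, 2],
    8.0: [20, 26, 23],
    32.0: [95, 102, 99],
    128.0: [410, 395, 402],
}

def cv(values):
    """Coefficient of variation (sample standard deviation / mean)."""
    m = mean(values)
    return float("inf") if m == 0 else stdev(values) / m

def reliable_from(spikes, max_cv=0.25):
    """Lowest concentration from which this and every higher level
    has a CV of at most max_cv; None if even the top level fails."""
    cutoff = None
    for conc in sorted(spikes, reverse=True):
        if cv(spikes[conc]) > max_cv:
            break
        cutoff = conc
    return cutoff

for conc in sorted(spikes):
    print(conc, round(cv(spikes[conc]), 3))
print("estimates look stable from concentration:", reliable_from(spikes))
```

The counts observed at the returned concentration then suggest a rough expression level below which the biological genes' estimates should be treated with caution.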


You are saying that "For example, if the treatment is expected to affect transcription globally, i.e., to reduce the expression of all genes simultaneously, and we want to know how strongly overall mRNA abundance goes down. (However, in this case, RNA-Seq might not be the best assay.)" Why is RNAseq not the best assay? Which assays are better?


If you're cheap, you could just measure the RNA concentration with a Nanodrop (or Bioanalyzer, or whatever your tool of choice is). Divide the concentration by the number of cells in your sample to get the RNA content per cell. This allows you to quantify changes in RNA content between conditions - job done.

Or you could spike-in RNA proportional to the number of cells in your sample, and do standard RNA-seq on the resulting mixture of spike-in and endogenous RNA. Normalization based on the spike-in coverage will preserve differences in total RNA content that would be lost with standard methods. Or, if you can't be bothered getting an accurate measure of the number of cells, you could do single-cell RNA-seq and just add the same amount of spike-in RNA to each cell. Bit more expensive but sexier.

Of course, all of this is discussing changes in RNA content. If you want specifically changes in global transcription (i.e., creation of new transcripts), then the whole thing becomes harder. I guess you'd have to use a variant of the protocols used to capture nascent RNAs, e.g., GRO-seq.

@ryan-c-thompson-5618
Last seen 6 weeks ago
Icahn School of Medicine at Mount Sinai…

My main issue with normalizing based on any spike-in is not the fact that a spike-in has few probes available for normalization. Much worse, in my opinion, is that a spike-in normalization does not account for composition bias. The DESeq2 normalization is designed not only to account for sequencing depth, but also composition bias, by normalizing the samples such that the median log ratio is approximately zero between all samples (or something to that effect; I may be mis-remembering the precise details). Since spike-ins have no relation to the composition of the sample, a normalization based only on spike-ins cannot possibly account for composition bias, and so the results will be similar to running DESeq with all the size factors set to 1 (depending on how precise your spike-in was).
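To illustrate what composition bias means here, the following Python sketch compares scaling by total counts with a simple median-of-ratios rule of the kind described above. It is a toy reimplementation with invented counts, not DESeq2's code: both libraries have the same depth, but in sample B one highly expressed gene is assumed to be ninefold up, so it takes up most of the reads and pushes the counts of all other genes down.

```python
import math
from statistics import median

def median_of_ratios(counts):
    """Size factors: per-sample median of count / geometric mean per gene.
    counts: list of rows, each a list of positive counts (one per sample)."""
    n = len(counts[0])
    ratios = [[] for _ in range(n)]
    for row in counts:
        geo_mean = math.exp(sum(math.log(c) for c in row) / n)
        for j, c in enumerate(row):
            ratios[j].append(c / geo_mean)
    return [median(r) for r in ratios]

def total_count_factors(counts):
    """Size factors proportional to library size (mean factor is 1)."""
    totals = [sum(col) for col in zip(*counts)]
    avg = sum(totals) / len(totals)
    return [t / avg for t in totals]

# Hypothetical counts for [sample A, sample B]; both libraries total 1000.
# Rows 0-4 are unchanged genes, row 5 is truly ninefold up in B.
counts = [[100, 20]] * 5 + [[500, 900]]

sf_total = total_count_factors(counts)
sf_median = median_of_ratios(counts)

def fold_change(row, sf):
    """B over A after dividing each count by its sample's size factor."""
    return (row[1] / sf[1]) / (row[0] / sf[0])

print("unchanged gene, total-count scaling:", fold_change(counts[0], sf_total))
print("unchanged gene, median of ratios:   ", fold_change(counts[0], sf_median))
print("changed gene, median of ratios:     ", fold_change(counts[5], sf_median))
```

Total-count scaling reports the unchanged genes as fivefold down, while the median-of-ratios factors bring them back to a fold change of one and recover the ninefold change of the one gene that really moved.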

ADD COMMENT
1
Entering edit mode

Also check out http://dx.doi.org/10.1038/nbt.2931, which has something to say about the reliability of spike-ins.
