Question: Usability of ERCC spike-ins in standard RNAseq experiments
Asked by mbeckste, written 2.0 years ago:

For a manuscript revision we were asked to redo the data analysis of a standard RNA-seq experiment (two conditions, triplicates, no transcriptional amplification, ~50M reads/library) with a DESeq2 normalization based on ERCC spike-ins. So far we had only used the ERCC spike-in controls to assess the technical performance of the experiment, and removed the ERCC probe counts before feeding the counts into DESeq2. The logFC values obtained this way look quite accurate and also fit well with qPCR results. Following the referee's suggestion, we repeated the analysis using only the counts for the 92 ERCC probes to estimate size factors in DESeq2. This gives significantly different (and worse) results, which in particular do not fit the qPCR results as well as before. Unfortunately, we only have qPCRs for a handful of genes, so correlation with these would not be the strongest argument. I have already read that a normalization based on only 92 genes/probes is simply not 'broad' enough and tends to give worse results than using the several thousand gene counts in the biological data. I wonder whether there are other arguments against using technical spike-ins for normalization in such a scenario?



Answer by Simon Anders (Zentrum für Molekularbiologie, Universität Heidelberg), written 2.0 years ago:

Short answer: You don't use spike-ins for RNA-Seq normalization for the same reason that, when doing qPCR, you use an internal control gene rather than adding a spike-in and using that as the control.


Long answer:

The idea that spike-ins are more reliable for normalisation is based on a misunderstanding of the purpose of RNA-Seq normalisation. As this is such a common question, I had better answer at some length.

First, note that in a typical experiment, the total amount of RNA extracted from each sample is usually of little interest. If we extract a bit more RNA from one sample than from another, this just means that there might have been a few more cells in it, which may have been caused by the treatment, but may also be because we seeded a few more cells initially, or pipetted a bit differently, or whatever. Even if it was the treatment that caused more cells to grow, and hence more RNA to be yielded, this is not what we want to measure in an RNA-Seq experiment. There are other assays to measure growth.

Furthermore, the amount of RNA yielded by a sample has little to do with the number of reads obtained from the library.

The ratio of technical spike-ins to biological genes, however, does depend on the sample's total RNA yield, because we always spike in the exact same amount of the spike-in mix, while the biological amount varies from sample to sample.

In a typical experiment comparing treated and control samples, one hopes to find a number of genes which respond to the treatment, while one assumes that a large number of genes, especially the so-called house-keeping genes, stay at the same expression level. If we know that a given gene is expressed at 10x the level of a house-keeping gene in treated samples but only at 4x in control samples, it is differentially expressed. If, however, we know that the gene's transcripts amount to a total of 8 femtomoles in one sample and only 6 fmol in another, this could just as well be because there were more cells in the second sample.

Normalization by comparing to the bulk of other genes removes differences in initial total material or total number of reads, and this is usually what we want.

Normalization with technical spike-ins, however, preserves differences in starting amount, and usually, this is not what we want!
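This contrast can be illustrated with a toy calculation (all numbers are made up, not from the OP's data; "ERCC" here stands for the pooled spike-in counts):

```python
# Toy example: two samples with identical per-cell expression, but
# sample B was prepared from twice as many cells. The same fixed
# amount of ERCC spike-in mix was added to both libraries.
sample_a = {"geneX": 100, "geneY": 500, "geneZ": 1000, "ERCC": 200}
sample_b = {"geneX": 200, "geneY": 1000, "geneZ": 2000, "ERCC": 200}

genes = ("geneX", "geneY", "geneZ")

# Spike-in normalization: divide each gene by the spike-in counts.
# Every gene now looks 2x "up-regulated" in B, even though per-cell
# expression is unchanged.
spike_fc = {g: (sample_b[g] / sample_b["ERCC"]) /
               (sample_a[g] / sample_a["ERCC"]) for g in genes}
print(spike_fc)   # {'geneX': 2.0, 'geneY': 2.0, 'geneZ': 2.0}

# Bulk normalization: divide by the total biological counts instead.
# The 2x difference in starting material cancels out.
tot_a = sum(sample_a[g] for g in genes)
tot_b = sum(sample_b[g] for g in genes)
bulk_fc = {g: (sample_b[g] / tot_b) /
              (sample_a[g] / tot_a) for g in genes}
print(bulk_fc)    # {'geneX': 1.0, 'geneY': 1.0, 'geneZ': 1.0}
```

Spike-in normalization reports the 2x difference in input material as apparent differential expression of every gene; bulk normalization divides it out.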

There are cases where we want to preserve information on the exact starting amount, namely, if we have ensured (e.g. by flow cytometry) that each sample contains exactly the same number of cells, and we are expressly interested not in relative but in absolute changes of transcript material. For example, if the treatment is expected to affect transcription globally, i.e., to reduce the expression of all genes simultaneously, and we want to know how strongly overall mRNA abundance goes down. (However, in this case, RNA-Seq might not be the best assay.)


Exercise question: Why did the RNA-Seq data agree well with the qPCR measurements after normalising conventionally but not after normalising with spike-ins?

Answer: Because qPCR measurements are also always compared to a biological control gene (actin or GAPDH or the like). When comparing two qPCR samples, we do not compare the Ct values directly, but their respective differences to this housekeeping gene. If the OP had used one of the spike-ins rather than a housekeeping gene as the internal qPCR control, the spike-in-normalized RNA-Seq data would have matched the qPCR better than the conventionally normalized data.
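As a concrete sketch of the qPCR side (hypothetical Ct values; this assumes 100% amplification efficiency, i.e. one doubling per cycle, the standard 2^-ΔΔCt calculation):

```python
# Made-up Ct values for a target gene and a GAPDH housekeeping control.
ct = {
    "control": {"target": 24.0, "gapdh": 18.0},
    "treated": {"target": 22.0, "gapdh": 18.0},
}

# ΔCt: target relative to the housekeeping gene, within each sample.
d_ct_control = ct["control"]["target"] - ct["control"]["gapdh"]  # 6.0
d_ct_treated = ct["treated"]["target"] - ct["treated"]["gapdh"]  # 4.0

# ΔΔCt: compare the two relative measurements between samples.
dd_ct = d_ct_treated - d_ct_control   # -2.0
fold_change = 2 ** (-dd_ct)
print(fold_change)                    # 4.0: target is 4x up in treated
```

The housekeeping gene plays exactly the role that the bulk of unchanged genes plays in RNA-Seq normalization, which is why the two agree.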

There is, of course, a reason why nobody uses spike-ins in that way for qPCR: one would mainly measure the dilution of the sample rather than the expression of the target gene.



I very much agree, but I do think there are two other uses for (ERCC) spike-ins:
1) (Extreme) cases where there are large changes in the RNA composition (such as knock-down/knock-out of decay factors, etc.)
2) For selecting expression cutoffs: since we know the exact concentration of the spike-ins, it is quite easy to see at which approximate level our expression estimates become unreliable.

Reply by kristoffer.vittingseerup, written 14 months ago.
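Point 2) above might be sketched like this (all probe names, concentrations, and counts are hypothetical; a real analysis would use the known concentrations of the ERCC mix and counts averaged over replicates):

```python
# Known input concentrations of a few spike-in probes (made-up values)
# and the mean counts observed for them across replicates.
known_conc = {"ERCC-A": 30000.0, "ERCC-B": 468.75,
              "ERCC-C": 7.32,    "ERCC-D": 0.11}
mean_counts = {"ERCC-A": 5400.0, "ERCC-B": 85.0,
               "ERCC-C": 1.3,    "ERCC-D": 0.0}

# Treat probes below some count floor as unreliably detected, and take
# the lowest input concentration that is still reliably detected as an
# approximate expression cutoff.
floor = 10.0
reliable = [p for p, c in mean_counts.items() if c >= floor]
cutoff_conc = min(known_conc[p] for p in reliable)
print(cutoff_conc)   # 468.75: lowest input still reliably detected
```

In practice one would rather plot observed counts against known concentrations and look for where the relationship breaks down, but the idea is the same.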
Answer by Ryan C. Thompson (The Scripps Research Institute, La Jolla, CA), written 2.0 years ago:

My main issue with normalizing based on any spike-in is not the fact that a spike-in offers only a few probes for normalization. Much worse, in my opinion, is that a spike-in normalization does not account for composition bias. The DESeq2 normalization is designed to account not only for sequencing depth but also for composition bias, by normalizing the samples such that the median log ratio between each sample and a reference is approximately zero (or something to that effect; I may be mis-remembering the precise details). Since spike-ins have no relation to the composition of the sample, a normalization based only on spike-ins cannot possibly account for composition bias, and so the results will be similar to running DESeq2 with all the size factors set to 1 (depending on how precise your spike-in was).
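For reference, the median-of-ratios idea described here can be sketched in a few lines (made-up counts; this mirrors the size-factor logic of Anders & Huber's DESeq, not DESeq2's exact implementation, which among other things handles zero counts):

```python
import numpy as np

# Rows are genes, columns are samples. Sample 2 has 2x the depth/material
# of sample 1; the last gene is genuinely differential.
counts = np.array([
    [100.0,  200.0],
    [500.0, 1000.0],
    [1000.0, 2000.0],
    [50.0,   120.0],
])

# Pseudo-reference: per-gene geometric mean across samples (in log space).
log_counts = np.log(counts)
pseudo_ref = log_counts.mean(axis=1)

# Size factor per sample: median ratio of that sample to the reference.
# The median makes the estimate robust to the differential gene.
ratios = log_counts - pseudo_ref[:, None]
size_factors = np.exp(np.median(ratios, axis=0))
print(size_factors)   # ≈ [0.707, 1.414]: the 2x difference, split evenly
```

Because the median is taken over all genes, a handful of truly differential genes (or a composition shift) does not drag the size factors along, which is exactly what a 92-probe spike-in set cannot guarantee.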


Also check out, which has something to say about the reliability of spike-ins.

Reply by Aaron Lun, written 2.0 years ago.


Powered by Biostar version 2.2.0