Question

RNA-seq: what to do if you have no replicates in voom-limma?

Johnny H ▴ 80

@johnny-h-3952

Last seen 8.8 years ago

United Kingdom

Dear limma/edgeR users,

I have 2 treatment groups, with 3 biological replicates each. I also have 2 extra samples: one pool for each treatment group.

I am comparing a "vanilla" analysis of the biological replicates with an analysis of the pooled samples, i.e. 1 vs. 1 sample.

The edgeR manual gives 4 clear ways/examples of how to do an analysis without biological replicates. This has been very useful, and there is no problem there; great.

However, my reading about different methods for RNA-seq differential gene expression suggests that voom-limma is a more robust approach, e.g. less susceptible to the mean-variance relationship.

http://peterhickey.org/blog/2011/11/23/bioinf-seminar-gordon-smyth.html

In addition, a recent publication also favours voom-limma over other methods on the basis of false positive rates.

http://biorxiv.org/content/early/2015/06/11/020784

Bearing that in mind, I want to compare the analysis of the biological replicates with an analysis of the pooled samples alone using voom-limma, as I am able to do with edgeR.

Is there a way for voom-limma to "learn" the variance/dispersion/weights etc. from the biological replicates I have, and then use them with the pooled samples alone?

Thank you very much.

limma voom edger • 5.5k views

ADD COMMENT • link updated 7.6 years ago by Konika Chawla ▴ 20 • written 8.8 years ago by Johnny H ▴ 80

Thank you for your answer and a solution to this.

Yes, you are obviously right: in this situation we don't need to look at the pooled samples.

The reason we are looking at this approach, as is often the case, is money. We have many conditions to look at, and doing 3 biological replicates for all of them would be very costly (our budget is limited). So one idea has been to learn the dispersions and variation in the data (build an error model) from the replicated conditions, and then to pool 3 biological replicates for each subsequent condition (so we can look at more conditions).

I would be interested in your view on this approach.

ADD REPLY • link updated 8.8 years ago by Gordon Smyth 50k • written 8.8 years ago by Johnny H ▴ 80

Well, you can do that using the method I have described.

In your situation however, I always advise my collaborators to barcode the biological samples before pooling them, and multiplex them onto the same Illumina sequencing lane. That costs almost the same as just sequencing the pool, but gives a separate FastQ file for each biological replicate.

See the related discussion on an earlier post: EdgeR: replicated pools, yes or not?

ADD REPLY • link 8.8 years ago Gordon Smyth 50k

Thank you very much for that.

It really is a good idea, which strikes a balance between saving money and still having biological replicates, albeit with fewer reads per replicate. I have passed this idea on to the people who make the decisions.

ADD REPLY • link 8.8 years ago Johnny H ▴ 80

I also have a similar situation: only one replicate per sample. However, I can run voom if I do not specify a design, keeping plot=TRUE (optional). This plots the variance across samples against the mean expression value of each miRNA (or similar).

I am now trying to calculate the variation separately; if a miRNA's variation differs by more than 2-fold (or a similar cutoff) from the weight given by voom, I select that miRNA as a candidate. Of course, without replicates no significance testing can be done.

Any suggestions on improving my approach are welcome.

ADD REPLY • link 7.6 years ago Konika Chawla ▴ 20

My suggestion to improve your approach is to use replicates ;-P

ADD REPLY • link 7.6 years ago b.nota ▴ 360

score 2 · Answer 1 · 2015-07-10

You can't use voom or limma without replicates.

You do have replicates, however, so you can do a normal analysis. If you want to compare the two pooled samples, you can analyse all 8 samples together. Create four treatment groups, with the two pooled samples forming treatment groups 3 and 4 on their own. Then construct a limma contrast between groups 3 and 4 to compare the two pooled samples.

Not sure why you would want to do that though. All the useful information is contained in the 3 vs 3 comparison of the replicated treatment groups. On the face of it, there doesn't seem to be anything extra to be gained from a 1 vs 1 comparison of the pooled samples.
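The four-group layout described in this answer can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not limma code: the sample layout, the function names and the expression values are all made up, and it uses an ordinary t-statistic where limma would add voom precision weights and empirical Bayes moderation. It assumes a common per-gene variance across all four groups, which is what fitting a single linear model to all 8 samples implies. The point it shows is that the residual variance comes entirely from the replicated groups 1 and 2, while a 1 vs 1 design on its own leaves zero residual degrees of freedom.

```python
from math import sqrt

# Hypothetical layout: groups 1 and 2 are the replicated treatments,
# groups 3 and 4 are the single pooled samples.
groups = [1, 1, 1, 2, 2, 2, 3, 4]


def design_matrix(groups):
    """One indicator column per group (group-means parameterisation)."""
    levels = sorted(set(groups))
    return [[1 if g == lvl else 0 for lvl in levels] for g in groups]


def residual_df(groups):
    """Number of samples minus number of fitted group means."""
    return len(groups) - len(set(groups))


def pooled_contrast(values, groups, contrast):
    """Contrast of group means for one gene, with an ordinary t-statistic.

    values:   log-expression values for one gene, one per sample (made up here)
    contrast: dict mapping group label -> contrast coefficient
    """
    levels = sorted(set(groups))
    members = {lvl: [v for v, g in zip(values, groups) if g == lvl]
               for lvl in levels}
    means = {lvl: sum(m) / len(m) for lvl, m in members.items()}

    # Groups with a single sample fit their own mean exactly, so only the
    # replicated groups contribute to the residual sum of squares.
    rss = sum((v - means[g]) ** 2 for v, g in zip(values, groups))
    df = residual_df(groups)
    if df == 0:
        raise ValueError("no residual degrees of freedom: "
                         "the variance cannot be estimated")

    s2 = rss / df  # pooled residual variance
    estimate = sum(c * means[lvl] for lvl, c in contrast.items())
    se = sqrt(s2 * sum(c ** 2 / len(members[lvl])
                       for lvl, c in contrast.items()))
    return estimate, se, estimate / se, df


# Made-up log-expression values for one gene across the 8 samples.
gene = [5.0, 5.2, 4.8, 7.0, 7.2, 6.8, 5.1, 7.1]

# Pool of treatment 2 (group 4) minus pool of treatment 1 (group 3).
estimate, se, t_stat, df = pooled_contrast(gene, groups, {4: 1, 3: -1})
print("residual df:", df)
print("estimate:", round(estimate, 3), "se:", round(se, 3), "t:", round(t_stat, 3))

# With only the two pooled samples there is nothing left to estimate variance from.
print("residual df, 1 vs 1:", residual_df([3, 4]))
```

Calling pooled_contrast with only the two pooled samples raises an error, which mirrors the statement that voom and limma cannot be used without replicates.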