Question: tximport recommendation for limma-trend downstream analysis?
United States
Jenny Drnevich (1.9k) wrote:

Hi there,

I was looking through the vignette for tximport, and it has recommendations for how to import data for downstream analysis in edgeR, DESeq2 and limma-voom, but it does not mention the lesser-used limma-trend. The edgeR method stores the length corrections in y$offset, but the voom() function does not use the y$offset so tximport recommends importing either "scaledTPM" or "lengthScaledTPM". The limmaUsersGuide() suggests doing logCPM <- cpm(y, log = TRUE, prior.count = 3). I thought that since cpm() is an edgeR function it would use the y$offset, but looking at the code of cpm.DGEList, it doesn't use y$offset either. So am I correct in assuming that I should use the tximport method for limma-voom, but then should use cpm() instead of voom()?

Thanks,

Jenny

modified 7 weeks ago by Aaron Lun (17k) • written 7 weeks ago by Jenny Drnevich (1.9k)
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson (6.1k) wrote:

You can force cpm to use the offset matrix by passing exp(offset) as the library size. For cpm, I don't think this should make any difference relative to using lengthScaledTPM without offsets, so the only advantage is being able to use the same tximport run for edgeR, DESeq2, and limma.

For voom, I've written a custom version that uses an offset matrix in place of the normalized library sizes: https://github.com/DarwinAwardWinner/CD4-csaw/blob/master/scripts/utilities.R#L254-L390 (Although now that I think about it, it should be possible to modify voom to accept a matrix-like lib.size argument just like cpm, instead of having a separate function for it.) I don't know that it's optimal or handles every edge case, but it has been working for me.

modified 7 weeks ago • written 7 weeks ago by Ryan C. Thompson (6.1k)

Looks like everyone answered all at once. Nice to see that the Americans are bright-eyed and bushy-tailed!

modified 7 weeks ago • written 7 weeks ago by Aaron Lun (17k)
Cambridge, United Kingdom
Aaron Lun (17k) wrote:

One way of computing log-CPMs with offsets is to do something like this:

cpm(y$counts, lib.size=exp(y$offset), log=TRUE, prior.count=3)

... assuming your offsets are on a scale that is interpretable as the log-library size. This is what edgeR assumes the offsets to be; see ?scaleOffset.

Also, I assume that the length corrections occur between samples, rather than between genes. edgeR will mostly ignore systematic differences in the sizes of the offsets between genes.
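
For readers who want to see the arithmetic, here is a minimal base-R sketch of what a per-entry library size does in a log-CPM calculation. The toy counts, the offset values, and the helper name log_cpm_with_offset are all made up for illustration. The prior-count handling only mimics the general shape of edgeR's calculation, so it should not be expected to reproduce cpm() output exactly.

```r
# Toy data: 3 genes x 2 samples (made-up numbers).
counts <- matrix(c(10, 0, 200,
                   20, 5, 180),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

# Hypothetical offsets, one per gene and sample, on the log-library-size scale.
offset <- log(matrix(c(1.0e6, 1.2e6, 0.9e6,
                       2.0e6, 2.1e6, 1.8e6),
                     nrow = 3, dimnames = dimnames(counts)))

# Hypothetical helper, not an edgeR function.
log_cpm_with_offset <- function(counts, offset, prior.count = 3) {
  eff_lib <- exp(offset)                          # effective library size per entry
  prior <- prior.count * eff_lib / mean(eff_lib)  # prior scaled to each library size
  log2((counts + prior) / (eff_lib + 2 * prior) * 1e6)
}

logCPM <- log_cpm_with_offset(counts, offset)
print(round(logCPM, 2))
```

Because the effective library size is a matrix, each gene in each sample is divided by its own value, which is how a between-sample length correction can enter the log-CPM without changing the counts themselves.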

modified 7 weeks ago • written 7 weeks ago by Aaron Lun (17k)

Thanks Aaron, yes the correction is just across samples (per gene).

written 7 weeks ago by Michael Love (14k)

Thanks, everyone! I like the idea of directly giving the counts and exp(y$offset) as the lib.size in cpm() rather than using lengthScaledTPM, because my next question was going to be whether prior.count = 3 was too large for lengthScaledTPM values, which sum to 1 million, as opposed to normalized library sizes, which are ~20-50 million.

Best,

Jenny
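
The arithmetic behind this concern can be sketched in a couple of lines of base R. This takes the post's working assumption (values summing to 1 million versus a library of roughly 30 million) at face value, and prior_as_cpm is a hypothetical helper, not a function from edgeR or limma.

```r
# Hypothetical helper: express a prior count in CPM units for a given library size.
prior_as_cpm <- function(prior.count, lib.size) prior.count / lib.size * 1e6

prior_as_cpm(3, 1e6)   # 3 CPM when counts sum to 1 million
prior_as_cpm(3, 30e6)  # 0.1 CPM for a 30-million-read library
```

The same prior.count therefore shrinks low counts much more strongly when the column totals are small, which is why the scale of the imported values matters for the choice of prior.count.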

written 7 weeks ago by Jenny Drnevich (1.9k)

Check out the reference for tximport. Counts from abundance are on the count scale, and add up to the original library size, not 1e6. They can be thought of as counts from which changes in average transcript length across samples have been divided out.
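
A rough base-R sketch of the lengthScaledTPM idea, as I understand it: scale each gene's abundance by its average length across samples, then rescale every sample so its column total matches the original estimated counts. All inputs are made-up toy values and length_scaled_tpm is a hypothetical helper name. It illustrates the idea only and is not the tximport source.

```r
# Made-up inputs: 3 genes x 2 samples.
dn <- list(paste0("gene", 1:3), c("s1", "s2"))
est_counts <- matrix(c(100, 50, 850, 300, 100, 1600), nrow = 3, dimnames = dn)
tpm        <- matrix(c(2e5, 3e5, 5e5, 2.5e5, 2.5e5, 5e5), nrow = 3, dimnames = dn)
eff_length <- matrix(c(1000, 500, 2000, 1500, 500, 2000), nrow = 3, dimnames = dn)

# Hypothetical helper sketching the lengthScaledTPM idea.
length_scaled_tpm <- function(est_counts, tpm, eff_length) {
  scaled <- tpm * rowMeans(eff_length)                   # abundance x average gene length
  scale_factor <- colSums(est_counts) / colSums(scaled)  # one factor per sample
  sweep(scaled, 2, scale_factor, "*")                    # rescale to original library size
}

new_counts <- length_scaled_tpm(est_counts, tpm, eff_length)
colSums(new_counts)  # equals colSums(est_counts) up to floating-point error
```

The column sums equal the original library sizes rather than 1e6, so the values behave like counts for the purposes of cpm() and a prior count.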

written 7 weeks ago by Michael Love (14k)

Good to know! I probably would have figured that out once I had data in hand. I’ve got 4-6 transcriptome assemblies + Salmon counts coming in soon, so I’ll get lots of practice with tximport. Thanks for a great package!!

Jenny

written 7 weeks ago by Jenny Drnevich (1.9k)
United States
Michael Love (14k) wrote:

Note that for limma-voom the tximport vignette does not pass counts plus an offset; instead, the hand-off is to import with countsFromAbundance="lengthScaledTPM", which gets around the offset.

written 7 weeks ago by Michael Love (14k)

Sorry, to be more explicit, countsFromAbundance avoids the need to offset for the average transcript length bias altogether. See the tximport reference for more details.

written 7 weeks ago by Michael Love (14k)