Question: Bias correction in single end experiment for DEG and DET
yohann.nedelec wrote (2.9 years ago):

Hello,

I'd like some advice on improving my analyses.

I'm concerned about bias (GC content in particular) when comparing transcript and gene expression between groups of samples.

My objectives are:

  1. Identify DE genes and DE transcripts
  2. Eliminate some bias before doing eQTL and sQTL mapping

About my data: 80 libraries in each of the two groups, with ~30M single-end reads.

Currently, I pass the RSEM output directly to voom and correct for known batch effects between samples (mainly flowcell effects).

Could you please point me in a better direction?
Should I apply tximport first?
Would your method, alpine, work in my case (does it support single-end data)?

Thank you for your help,
Regards,

Tags: tximport, alpine
Answer: Bias correction in single end experiment for DEG and DET
Michael Love (United States) wrote (2.9 years ago):

hi Yohann,

For GC content bias on the gene level, you can use the Bioconductor packages cqn or EDASeq and then any of the downstream statistical packages (DESeq2, edgeR, limma, etc). I believe for both packages, you can obtain the offset matrix for statistical analysis (don't know if your eQTL pipelines can accept offsets, but this is a simple thing for a linear model to accommodate), or you can get a normalized bias-corrected matrix for EDA.
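To illustrate why an offset is easy for a linear model to accommodate, here is a minimal sketch in plain Python. It assumes the offset is on the same log scale as the expression values and enters additively with a fixed coefficient of 1, so it can simply be subtracted before fitting. The function name, the toy genotype dosages, and the offset values are all hypothetical. This is not the API of cqn, EDASeq, or any eQTL tool, and sign and scale conventions for offsets differ between packages, so check the package documentation before subtracting or adding.

```python
from statistics import mean

def fit_with_offset(y, x, offset):
    """Least-squares fit of (y - offset) = a + b * x; returns (a, b).

    Assumes the offset is additive on the scale of y (e.g. log expression).
    """
    # Remove the known per-sample offset, then fit an ordinary regression.
    z = [yi - oi for yi, oi in zip(y, offset)]
    xbar, zbar = mean(x), mean(z)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z)) / sxx
    a = zbar - b * xbar
    return a, b

# Hypothetical toy data: genotype dosage (0/1/2) for six samples and a
# made-up per-sample bias offset for one gene.
genotype = [0, 1, 2, 0, 1, 2]
offset = [0.5, -0.5, 0.0, 0.2, -0.2, 0.0]
# Simulated log expression: intercept 5, genotype effect 1, plus the offset.
log_expr = [5.0 + 1.0 * g + o for g, o in zip(genotype, offset)]

a, b = fit_with_offset(log_expr, genotype, offset)
print(f"intercept={a:.3f}, genotype effect={b:.3f}")
```

If a pipeline cannot take offsets at all, fitting on the bias-corrected values (expression minus offset, under the assumptions above) amounts to the same model.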

I believe you could also use cqn and EDASeq with estimated transcript counts.

Now, your RSEM to limma-voom pipeline may be perfectly fine as is and you don't have to use the above tools, if it is the case that the GC dependence is explained mostly by batch terms. You can figure this out by running cqn or EDASeq, making the GC dependence plot, and coloring lines by batch. If nearly all the variation is across batch and not within batch, then I wouldn't change your current pipeline.
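As a rough numeric companion to the plot described above, the following plain-Python sketch summarizes each sample's GC dependence as a single slope (log count regressed on GC content across genes) and then asks what fraction of the variation in those slopes lies between batches. This is an assumption-laden simplification: cqn and EDASeq fit smooth curves rather than straight lines, and the function names and toy data here are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def gc_slope(gc, log_counts):
    """Least-squares slope of log counts on GC content across genes (one sample)."""
    gbar, ybar = mean(gc), mean(log_counts)
    sxx = sum((g - gbar) ** 2 for g in gc)
    sxy = sum((g - gbar) * (y - ybar) for g, y in zip(gc, log_counts))
    return sxy / sxx

def between_batch_fraction(slopes, batches):
    """Fraction of the total sum of squares of per-sample slopes explained by batch means."""
    grand = mean(slopes)
    by_batch = defaultdict(list)
    for s, b in zip(slopes, batches):
        by_batch[b].append(s)
    ss_total = sum((s - grand) ** 2 for s in slopes)
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in by_batch.values())
    return ss_between / ss_total if ss_total > 0 else float("nan")

# Hypothetical toy data: GC content of four genes, log counts for four samples.
gc = [0.3, 0.4, 0.5, 0.6]
samples = {
    "s1": [1.0, 2.0, 3.0, 4.0],  # counts rise with GC
    "s2": [1.0, 2.0, 3.0, 4.0],
    "s3": [4.0, 3.0, 2.0, 1.0],  # counts fall with GC
    "s4": [4.0, 3.0, 2.0, 1.0],
}
batches = ["A", "A", "B", "B"]  # e.g. flowcell of s1..s4

slopes = [gc_slope(gc, y) for y in samples.values()]
frac = between_batch_fraction(slopes, batches)
print(f"fraction of GC-slope variation between batches: {frac:.3f}")
```

A fraction near 1 corresponds to the case where nearly all the GC dependence differs across batch rather than within batch; a clearly lower value suggests sample-specific GC effects that batch terms alone would not absorb.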

You can use tximport, but this is really a convenience function for reading in transcript quantifications and summarizing to the gene level. RSEM does this itself already.

alpine doesn't support single end yet. I hope to spend more time expanding the features and adding more documentation later this year (and adding to Bioconductor).

Answer: Bias correction in single end experiment for DEG and DET
yohann.nedelec wrote (2.9 years ago):

Thanks a lot for your answer Michael,

About correcting for length and GC content biases at the transcript level: my understanding is that I first have to calculate the GC content and length of each transcript and feed that information to EDASeq.
Is this approach correct, or are there caveats that I'm missing?


hi Yohann, 

(Quick note about the site: you can add Comments/Replies to thread a conversation, rather than Answers, which are for answering the originally posted question.)

Yes, you would calculate GC content and length and feed these to EDASeq or cqn. Pointers for doing this: extractTranscriptSeqs in the GenomicFeatures package, and sum(width(grl)) if you have a GRangesList of the exons per transcript. If you have further package-specific questions, you can make a new post and tag it to get the advice of the package authors.
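For readers who want to see what the two quantities amount to, here is a language-agnostic sketch in plain Python rather than the Bioconductor calls named above. It assumes you already have each transcript's exon sequences in transcript order; the function name and the toy sequences are hypothetical. Length is the summed exon width and GC content is the fraction of G and C bases in the spliced sequence.

```python
def transcript_length_and_gc(exon_seqs):
    """Return (length, gc_fraction) for one transcript from its ordered exon sequences."""
    # Splice the exons together; introns are assumed to be excluded already.
    seq = "".join(exon_seqs).upper()
    length = len(seq)
    if length == 0:
        return 0, float("nan")
    gc = sum(1 for base in seq if base in "GC")
    return length, gc / length

# Hypothetical toy transcripts, each given as a list of exon sequences.
transcripts = {
    "tx1": ["ATGC", "GGCC"],
    "tx2": ["ATAT", "AT"],
}

features = {tx: transcript_length_and_gc(exons) for tx, exons in transcripts.items()}
for tx, (length, gc_frac) in features.items():
    print(tx, length, gc_frac)
```

The resulting per-transcript length and GC fraction are the covariates you would then hand to the bias-correction package, in whatever format it expects.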

Reply written 2.9 years ago by Michael Love