RNA-Seq, generate batch-free count matrix
2
0
Entering edit mode
@steve-lianoglou-2771
Last seen 21 months ago
United States
Sorry to resurrect a somehow old thread, but I wanted to kick the tires on this and wanted to clarify the google-record for posterity ... so: On Sun, Jul 20, 2014 at 5:18 PM, Gordon K Smyth <smyth at="" wehi.edu.au=""> wrote: > > On Mon, 21 Jul 2014, Gordon K Smyth wrote: > >> logCPMc <- removeBatchEffect(y, batch) > > > Should have written > > logCPMc <- removeBatchEffect(y2, batch) Shouldn't the correction actually be this? R> lobCPMc <- removeBatchEffect(logCPM, batch) Because in the code snippet `y2` is a DGEList dataset (of counts). So, removing batch effects from soup to nuts: ## ------------------------------------------------ library(edgeR) y <- DGEList(counts=counts) ## Filter non-expressed genes: A <- aveLogCPM(y) y2 <- y2[A>1,] ## Then normalize and compute log2 counts-per-million with an offset y2 <- calcNormFactors(y2) logCPM <- cpm(y2, log=TRUE, prior.count=5) ## Then remove batch correct: logCPMc <- removeBatchEffect(logCPM, batch) ## ------------------------------------------------ Thanks, -steve -- Steve Lianoglou Computational Biologist Genentech
• 1.7k views
ADD COMMENT
0
Entering edit mode
Brian Haas ▴ 90
@brian-haas-6657
Last seen 6.1 years ago
Hi Steve, I think that's right. I put an example of what I was experimenting with here: https://github.com/brianjohnhaas/singleCellRNASeq note, it turned out that I really shouldn't have been trying to remove the 'batch' effect here, because my two plates of cells turned out to be quite different from each other beyond any batch issue (true biological differences). In any case, I'm still looking to identify various code snippets that are available through bioconductor to remove nuisance variation from data, and will fill out the TBD sections when I find some time, and contributions are always welcome. best, ~brian On Fri, Aug 1, 2014 at 6:02 PM, Steve Lianoglou <lianoglou.steve@gene.com> wrote: > Sorry to resurrect a somehow old thread, but I wanted to kick the > tires on this and wanted to clarify the google-record for posterity > ... so: > > On Sun, Jul 20, 2014 at 5:18 PM, Gordon K Smyth <smyth@wehi.edu.au> wrote: > > > > On Mon, 21 Jul 2014, Gordon K Smyth wrote: > > > >> logCPMc <- removeBatchEffect(y, batch) > > > > > > Should have written > > > > logCPMc <- removeBatchEffect(y2, batch) > > Shouldn't the correction actually be this? > > R> lobCPMc <- removeBatchEffect(logCPM, batch) > > Because in the code snippet `y2` is a DGEList dataset (of counts). > > So, removing batch effects from soup to nuts: > > ## ------------------------------------------------ > library(edgeR) > y <- DGEList(counts=counts) > > ## Filter non-expressed genes: > A <- aveLogCPM(y) > y2 <- y2[A>1,] > > ## Then normalize and compute log2 counts-per-million with an offset > y2 <- calcNormFactors(y2) > logCPM <- cpm(y2, log=TRUE, prior.count=5) > > ## Then remove batch correct: > logCPMc <- removeBatchEffect(logCPM, batch) > ## ------------------------------------------------ > > Thanks, > -steve > > -- > Steve Lianoglou > Computational Biologist > Genentech > -- -- Brian J. Haas The Broad Institute http://broad.mit.edu/~bhaas [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 13 hours ago
WEHI, Melbourne, Australia
Thanks Steve, that is what I had meant to write. Gordon On Fri, 1 Aug 2014, Steve Lianoglou wrote: > So, removing batch effects from soup to nuts: > > ## ------------------------------------------------ > library(edgeR) > y <- DGEList(counts=counts) > > ## Filter non-expressed genes: > A <- aveLogCPM(y) > y2 <- y2[A>1,] > > ## Then normalize and compute log2 counts-per-million with an offset > y2 <- calcNormFactors(y2) > logCPM <- cpm(y2, log=TRUE, prior.count=5) > > ## Then remove batch correct: > logCPMc <- removeBatchEffect(logCPM, batch) > ## ------------------------------------------------ > > Thanks, > -steve > > -- > Steve Lianoglou > Computational Biologist > Genentech > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT

Login before adding your answer.

Traffic: 875 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6