Is normalization in edgeR required for small RNA sequencing data?


Danie ▴ 40


Dear All,

I am a PhD student, currently working on differential expression analysis of my small RNA library deep sequencing data and trying to identify differentially expressed miRNAs using the edgeR package. I have 24 different samples, each with 2 biological replicates (48 libraries). I am performing multiple group comparisons using the GLM method, and also an ANOVA-like test, to identify DE miRNAs among the different groups of my samples.

My question is: do I need to normalize my input data using calcNormFactors() once I have set up my DGEList, or can I proceed without any normalization? I assume in this case that edgeR performs a default normalization when it is "calculating library sizes from column totals"?

I would really appreciate any suggestion on this!

Thanks in advance,

Daniela

Sequencing SmallRNA edgeR • 2.4k views

updated 13.3 years ago by Mark Robinson ▴ 880 • written 13.3 years ago by Danie ▴ 40


Mark Robinson ▴ 880


Hi Daniela,

On 22.09.2012, at 00:23, Daniela Lopes Paim Pinto wrote:

> Do I need to normalize my input data using calcNormFactors() once I set
> my DGE list or I could proceed without any normalization? I assume in this
> case that edgeR performs a default normalization when it is "calculating
> library sizes from column totals"?

Yes, by default edgeR will use column totals to "normalize". You don't strictly *need* to do additional normalization (e.g. by calling calcNormFactors()), but generally it does no harm and it often helps. That is, if there are no additional biases (beyond library size) to correct for, these additional correction factors will be near 1 anyway.

As a trivial (uninteresting) example:

    > y <- matrix( rnbinom(300, mu=5, size=2), nrow=150 )
    > d <- DGEList(y)
    Calculating library sizes from column totals.
    > d$samples
            group lib.size norm.factors
    Sample1     1      720            1
    Sample2     1      635            1
    > d <- calcNormFactors(d)
    > d$samples
            group lib.size norm.factors
    Sample1     1      720    0.9663861
    Sample2     1      635    1.0347831

Of course, it doesn't hurt to look through a few MA-style plots for your data to check that your samples are comparable and that normalization is operating well.

Best,
Mark

Prof. Dr. Mark Robinson
Bioinformatics, Institute of Molecular Life Sciences
University of Zurich
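For reference, here is a minimal sketch of where calcNormFactors() could sit in a GLM analysis like the one described in the question. It is not taken from the thread: the counts are simulated, the four groups with two replicates each are a hypothetical stand-in for the real design, and the names `counts`, `group` and `eff.lib` are made up for illustration. It relies on the normalization factors being stored in `d$samples$norm.factors` and applied downstream as an effective library size (lib.size * norm.factors).

```r
library(edgeR)

set.seed(1)

# Simulated counts standing in for a real miRNA count matrix (assumption):
# 200 features, 4 hypothetical groups with 2 replicates each.
counts <- matrix(rnbinom(200 * 8, mu = 50, size = 2), nrow = 200)
rownames(counts) <- paste0("miR", seq_len(200))
group <- factor(rep(c("A", "B", "C", "D"), each = 2))

# Library sizes are taken from the column totals here.
d <- DGEList(counts = counts, group = group)

# Additional scaling factors (TMM by default); these stay near 1
# when there is no bias beyond library size.
d <- calcNormFactors(d)

# Effective library sizes that the downstream model fitting works with.
eff.lib <- d$samples$lib.size * d$samples$norm.factors

# GLM fit with one coefficient per non-reference group.
design <- model.matrix(~ group)
d <- estimateDisp(d, design)
fit <- glmFit(d, design)

# ANOVA-like test: any difference among the groups
# (tests all non-intercept coefficients together).
lrt <- glmLRT(fit, coef = 2:ncol(design))
topTags(lrt)

# MA-style plot of one sample against the average of the others;
# the bulk of the points should be centred on zero after normalization.
plotMD(d, column = 1)
abline(h = 0, col = "red", lty = 2)
```

Repeating the plotMD() call over a few columns is one way to do the visual check on sample comparability suggested above.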

13.3 years ago by Mark Robinson ▴ 880
