Question

edgeR for large sample size

Guest User ★ 13k

@guest-user-4897

Hi edgeR experts,

I have a problem using edgeR for RNA-Seq data analysis. The manual recommends using getPriorN to get an appropriate estimate for the prior.n used in the algorithm, and it also warns that prior.n should never be less than 1. Based on the recommended estimation formula prior.n * df = 20 ~ 30, where df = # of samples - # of conditions, it seems that edgeR will never work for a sample size > 15 for two conditions if 30 is used in the formula, and > 11 if 20 is used.

Am I misunderstanding something about the package? I want to make sure before I use it, because I have more than 100 samples per group in my RNA-seq dataset and the estimate of prior.n is << 1. Please note that my case is 100 SAMPLES (or LIBRARIES), NOT 100 GENES or TAGS.

I would appreciate any comments and help from edgeR experts.

Best
Richard Hu

-- output of sessionInfo():

R version 2.14.2 (2012-02-29)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.14.2

--
Sent via the guest posting facility at bioconductor.org.
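To make the arithmetic in the question concrete, here is a minimal sketch of the rule of thumb prior.n * df = 20 ~ 30 applied to 100 samples in each of two groups. It is plain arithmetic and does not call edgeR; the function name `prior_n_range` and its parameters are hypothetical, introduced only for illustration.

```python
def prior_n_range(n_samples, n_groups, target=(20, 30)):
    """Return the (low, high) prior.n implied by prior.n * df = 20 ~ 30.

    Hypothetical helper: it only restates the rule of thumb quoted in the
    question and is not part of edgeR.
    """
    # df = number of samples minus number of conditions, as in the question
    df = n_samples - n_groups
    if df <= 0:
        raise ValueError("need more samples than groups")
    low, high = target
    return low / df, high / df


# 100 samples in each of two groups: df = 200 - 2 = 198
low, high = prior_n_range(n_samples=200, n_groups=2)
print(f"prior.n between {low:.3f} and {high:.3f}")
```

Under these assumptions the implied prior.n is roughly 0.10 to 0.15, which is the "prior.n << 1" situation the question describes.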

edgeR • 1.1k views

updated 13.7 years ago by Gordon Smyth 53k • written 13.8 years ago by Guest User ★ 13k

score 0 · Answer 1 · 2012-04-03

Dear Richard,

There's actually no problem in edgeR with prior.n less than one. The remark in the User's Guide about keeping prior.n not less than one was my attempt to dissuade people from choosing prior.n too small when analysing small data sets. It should probably be removed.

Best wishes
Gordon

> Date: Mon, 2 Apr 2012 14:02:47 -0700 (PDT)
> From: "Richard Hu [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, xuguang.hu at seattlebiomed.org
> Subject: Re: [BioC] edgeR for large sample size
>
> Hi edgeR expert:
> I got a problem in using edgeR for RNA Seq data analysis. The manual
> recommends the use of getPriorN to get an appropriate estimate for
> prior.n used in the algorithm and also warn that prior.n should never be
> less than 1. Based on the recommended estimation formula prior.n * df =
> 20 ~ 30 where df = #of samples - #of conditions, it seems that the edgeR
> package will never work for sample size > 15 for two conditions if 30 is
> used in the formula, and > 11 if 20 is used in the formula.
> Am I wrong somewhere in understanding the package? I want to make
> thing sure before I use it because I got more than 100 samples per group
> in my RNA-seq dataset and the estimate of prior.n << 1. Please be clear
> that my case is of 100 SAMPLES (or LIBRARIES) NOT 100 GENES or TAGS
> I would appreciate any comments and helps from experts of edgeR
>
> Best
> Richard Hu
>
> -- output of sessionInfo():
>
> R version 2.14.2 (2012-02-29)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.14.2
>
> --
> Sent via the guest posting facility at bioconductor.org.