Hi edgeR experts,
I have a problem using edgeR for RNA-seq data analysis. The manual
recommends using getPriorN to get an appropriate estimate of prior.n
for the algorithm, and also warns that prior.n should never be less
than 1. Based on the recommended formula prior.n * df = 20 to 30,
where df = (number of samples) - (number of conditions), it seems that
edgeR will never work with two conditions once there are more than 16
samples per group if 30 is used in the formula, or more than 11 per
group if 20 is used.
Am I misunderstanding something about the package? I want to make
sure before I use it, because I have more than 100 samples per group
in my RNA-seq dataset, so the estimated prior.n is << 1. To be clear,
my case is 100 SAMPLES (or LIBRARIES), NOT 100 GENES or TAGS.
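To make the arithmetic concrete, here is a minimal sketch in Python. It
is plain arithmetic only, not a call to edgeR's getPriorN; the helper
name prior_n and its arguments are made up for illustration. It assumes
equal group sizes and df = (total samples) - (number of groups), as in
the formula quoted above.

```python
def prior_n(samples_per_group, n_groups=2, target_df=20):
    """Rule-of-thumb prior.n = target_df / residual df.

    Hypothetical helper for illustration; this is NOT edgeR's getPriorN.
    Assumes every group has the same number of samples.
    """
    # Residual degrees of freedom: total samples minus number of groups.
    df = samples_per_group * n_groups - n_groups
    if df <= 0:
        raise ValueError("need at least two samples per group")
    return target_df / df


if __name__ == "__main__":
    # Show how prior.n shrinks as the per-group sample size grows.
    for target in (20, 30):
        for n in (3, 11, 16, 100):
            value = prior_n(n, target_df=target)
            print(f"target={target} samples/group={n} prior.n={value:.3f}")
```

Under these assumptions prior.n equals 1 at 11 samples per group for a
target of 20 and at 16 per group for a target of 30, and it falls well
below 1 at 100 samples per group.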
I would appreciate any comments and help from edgeR experts.
Best
Richard Hu
-- output of sessionInfo():
R version 2.14.2 (2012-02-29)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.14.2
--
Sent via the guest posting facility at bioconductor.org.
Dear Richard,
There's actually no problem in edgeR with prior.n less than one. The
remark in the User's Guide about keeping prior.n not less than one was
my attempt to dissuade people from choosing prior.n too small when
analysing small data sets. It should probably be removed.
Best wishes
Gordon
> Date: Mon, 2 Apr 2012 14:02:47 -0700 (PDT)
> From: "Richard Hu [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, xuguang.hu at seattlebiomed.org
> Subject: Re: [BioC] edgeR for large sample size