Hello everyone!
I'm using edgeR to detect DE genes in my data. I have 2 control samples and
4 mutated samples.
I understand why I should filter the genes, and I know the command to use
(the tutorials by Drs. Mark Robinson, Davis McCarthy, Yunshun Chen and
Gordon K. Smyth are pretty self-explanatory). What I don't understand,
however, is the filtering criterion.
I named my DGE object "d", so the command I'm using is:
d <- d[rowSums(1e+06 * d$counts/expandAsMatrix(d$samples$lib.size, dim(d)) > 1) >= ?, ]
That is, I'm filtering out genes that don't have more than one count per
million in at least "?" samples. What value should I use for "?", given
that I have 2 control and 4 mutated samples?
Thank you in advance for your help!
C
Hi Catarina,
Comments in line:
On Mon, Jul 22, 2013 at 8:07 AM, Catarina Almeida <catarina.fa at gmail.com> wrote:
> Hello everyone!
>
> I'm using edgeR to detect DE genes in my data. I have 2 control samples and
> 4 mutated samples.
> I understand why I should filter the genes, and I know the command to use
> (the tutorials by Drs. Mark Robinson, Davis McCarthy, Yunshun Chen and
> Gordon K. Smyth are pretty self-explanatory). What I don't understand,
> however, is the filtering criterion.
>
> I named my DGE object "d", so the command I'm using is:
> d <- d[rowSums(1e+06 * d$counts/expandAsMatrix(d$samples$lib.size, dim(d)) > 1) >= ?, ]
Perhaps you'd like to simplify that to the more intuitive:
d <- d[rowSums(cpm(d) >= 1) >= ?, ]
> That is, I'm filtering out genes that don't have more than one count per
> million in at least "?" samples. What value should I use for "?", given
> that I have 2 control and 4 mutated samples?
I believe the rule of thumb (if there is one) with this strategy is to
use the size of your smallest group of replicates. You have one condition
with 2 replicates and another with 4, so you'd pick 2. That way, a gene
that is expressed only in the controls can still pass the filter.
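To make the rule concrete, here is a minimal sketch in base R on made-up data. The count matrix, the library sizes, and the names counts, group, cpm_mat and min_n are all hypothetical, chosen only for illustration. The CPM values are computed by hand so the example runs without edgeR; with a real DGEList you would presumably use cpm(d) and take the groups from d$samples$group instead.

```r
# Toy count matrix: 5 genes x 6 samples (hypothetical values).
counts <- matrix(
  c( 0,  0,  0,  0,  0,  0,   # gene1: never expressed
    10, 12,  0,  0,  0,  0,   # gene2: expressed in the 2 controls only
     0,  0,  9, 11,  8, 10,   # gene3: expressed in the 4 mutants only
     5,  0,  0,  0,  0,  0,   # gene4: expressed in a single sample
    20, 25, 30, 28, 22, 27),  # gene5: expressed everywhere
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("s", 1:6))
)

# 2 controls followed by 4 mutants, as in the question.
group <- factor(c("control", "control", rep("mutant", 4)))

# Assumed library sizes (hypothetical; real ones come from the data).
lib.size <- c(2e6, 2e6, 1e6, 1e6, 1e6, 1e6)

# Counts per million: divide each column by its library size, scale by 1e6.
cpm_mat <- t(t(counts) / lib.size) * 1e6

# Threshold for "?": the size of the smallest group (2 here).
min_n <- min(table(group))

# Keep genes with CPM >= 1 in at least min_n samples.
keep <- rowSums(cpm_mat >= 1) >= min_n
filtered <- counts[keep, , drop = FALSE]

print(rownames(filtered))
```

In this toy case gene2 survives even though it is expressed only in the two controls, which is the point of using the smallest group size; gene4, seen in a single sample, is dropped.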
HTH,
-steve
--
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech