Dear all,
First of all, sorry for the cross-post (this was previously posted on tuxedo-tools-users, seqanswers and ResearchGate), but no solution has come up so far.
I'm wondering whether anyone has already implemented a filter based on repFpkm in cummeRbund, i.e. one that selects all genes whose FPKM is above a certain threshold in ALL replicates of one condition OR of the other.
For example, a gene should be kept if it has an FPKM of at least 1 in all control replicates (#1, #2 and #3), even if that is not the case for the treatment, and vice versa.
I'm using:
gene.diff <- diffData(genes(cuff))
gene.diff.q1.filtered <- gene.diff[gene.diff$value_1 > 1,]
gene.diff.q2.filtered <- gene.diff[gene.diff$value_2 > 1,]
but this only selects genes whose condition-level ('global') FPKM is above 1, i.e. the value estimated from all replicates together.
I need to get rid of cases like the following (not a particularly interesting gene, but it illustrates the problem):
>gene.diff.filtered[gene.diff.filtered$gene_id == "comp13340",]
gene_id sample_1 sample_2 status value_1 value_2 log2_fold_change test_stat p_value q_value
2102 comp13340 q1 q2 OK 0 1.01699 Inf NA 5e-05 0.00135144
significant
2102 yes
> gene.repFpkm[gene.repFpkm$gene_id == "comp13340",]
gene_id sample_name replicate rep_name raw_frags internal_scaled_frags external_scaled_frags fpkm
2102 comp13340 q1 0 q1_0 0 0.00000 0.00000 0.00000
35771 comp13340 q1 1 q1_1 0 0.00000 0.00000 0.00000
69440 comp13340 q1 2 q1_2 0 0.00000 0.00000 0.00000
103109 comp13340 q2 0 q2_0 2 2.47685 2.47685 1.26962
136778 comp13340 q2 1 q2_1 0 0.00000 0.00000 0.00000
170447 comp13340 q2 2 q2_2 8 8.39216 8.39216 1.78133
effective_length status
2102 NA OK
35771 NA OK
69440 NA OK
103109 NA OK
136778 NA OK
170447 NA OK
It's clear that not all q2 replicates have FPKM > 1 (q2_1 is 0), yet the gene passes the filter on value_2.
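To make the request concrete, here is a minimal base-R sketch of the filter I have in mind. It is only an illustration: the toy data frame stands in for the output of repFpkm(genes(cuff)) and reuses the column names shown above, geneA and geneB are made-up genes, and the names threshold, min.fpkm and keep.ids are my own. The idea is to take the minimum replicate FPKM per gene and condition, then keep genes for which that minimum exceeds the threshold in at least one condition.

```r
# Toy stand-in for the real table; with a cuffdiff database one would use
#   gene.repFpkm <- repFpkm(genes(cuff))
gene.repFpkm <- data.frame(
  gene_id     = rep(c("comp13340", "geneA", "geneB"), each = 6),
  sample_name = rep(rep(c("q1", "q2"), each = 3), times = 3),
  replicate   = rep(0:2, times = 6),
  fpkm        = c(0.0, 0.0, 0.0, 1.26962, 0.0, 1.78133,  # comp13340 (values from above)
                  2.1, 3.4, 1.5, 0.2,     0.0, 0.4,      # geneA (hypothetical)
                  0.5, 2.0, 3.0, 0.9,     1.2, 4.0),     # geneB (hypothetical)
  stringsAsFactors = FALSE
)

threshold <- 1  # same cutoff as in the value_1 / value_2 filter

# Lowest replicate FPKM for each gene within each condition
min.fpkm <- aggregate(fpkm ~ gene_id + sample_name,
                      data = gene.repFpkm, FUN = min)

# A gene is kept if ALL replicates of at least one condition exceed the
# threshold, i.e. if the per-condition minimum does
keep.ids <- unique(min.fpkm$gene_id[min.fpkm$fpkm > threshold])

# The differential expression table could then be subset with, e.g.:
#   gene.diff.filtered <- gene.diff[gene.diff$gene_id %in% keep.ids, ]
```

With this toy input only geneA should survive: comp13340 is dropped because q2_1 is 0, and geneB is dropped because neither condition has all three replicates above 1. I have not checked this against the full data set, so corrections are welcome.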
Thanks for any help or hint!
(and sorry for cross post)
stefano