Dear Alessandro,
I haven't seen the MDS plots (because attachments are not distributed to
the list), but don't see anything surprising in what you have reported.
If you compare one group (all C) vs only those members of the other group
that are most different to it (1R+3R), naturally you will find lots of DE
genes.
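
To make the point concrete, here is a minimal Python sketch for a single gene. Everything in it is an assumption for illustration: the log-expression values are invented, the names `controls`, `strong` and `weak` are hypothetical, and a Welch t-statistic stands in for edgeR's negative binomial test, which works differently.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic: mean difference divided by its standard error."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Invented log2 expression values for ONE gene (not real data).
controls = [5.0, 5.1, 4.9, 5.0]  # stands in for the C samples
strong = [7.0, 7.2]              # stands in for {1R, 3R}
weak = [5.2, 5.1, 5.3]           # stands in for {2R, 4R, 5R}
all_r = strong + weak            # pooled R group, as in comparison A

t_all = welch_t(all_r, controls)
t_strong = welch_t(strong, controls)
t_weak = welch_t(weak, controls)

print(f"all R vs C:      t = {t_all:.2f}")
print(f"{{1R,3R}} vs C:    t = {t_strong:.2f}")
print(f"{{2R,4R,5R}} vs C: t = {t_weak:.2f}")
```

In this toy case, pooling the two R subgroups inflates the within-group variance, so the pooled comparison gives the smallest statistic even though its mean difference is larger than that of the weak subgroup alone. The extreme subgroup on its own gives by far the largest statistic.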
Best wishes
Gordon
> Date: Thu, 12 Jul 2012 10:48:01 +0200
> From: "alessandro.guffanti at genomnia.com"
> <alessandro.guffanti at genomnia.com>
> To: Bioconductor mailing list <bioconductor at r-project.org>
> Subject: Re: [BioC] edgeR and tagwise dispersion: overcorrection for
> multiple tests?
>
> Dear colleagues, good morning. I am back to an old issue because I am
> now much more certain of what I see, and I begin to wonder whether this
> is due to biology rather than to analytical tools or strategies.
>
> => Here is my sessionInfo() to begin with:
>
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] edgeR_2.6.7 limma_3.12.1 R.utils_1.12.1 R.oo_1.9.8
> [5] R.methodsS3_1.4.2
>
> => the experiment description: RNA from five samples and five controls,
> mice, homogeneous stimulus, brain tissue, SAGE with SOLiD with good
> mapping in the UTR (also checked with genome-wide mapping). Tags have
> been selected with the following, quite stringent, parameters: only in
> UTR; unique mapping; only one mismatch; begin with CATG. The samples are
> therefore labelled {1 to 5}R for the stimulus and {1 to 5}C for the
> control.
>
> => MDS plot and simple pairwise regression analysis of the tag counts
> (R vs C, R vs R and C vs C) reveal a clear division of the R samples
> into two groups: {1R, 3R} and {2R,4R,5R}. In addition, one C sample (3C)
> overlaps with two R samples and is removed from the comparisons.
>
> => three DEG calculations were performed:
> (A) all C vs all R;
> (B) all C minus 3C vs {1R,3R};
> (C) all C minus 3C vs {2R,4R,5R}
>
> => tagwise dispersion; normalization factors calculated for the
> libraries; filtering by minimal CPM in samples leaves between 6000 and
> 7000 genes for each comparison.
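
For readers unfamiliar with the CPM filter mentioned here, the idea can be sketched in plain Python as follows. This is not edgeR code. The function name `filter_by_cpm`, the threshold `min_cpm`, the requirement `min_samples` and the toy counts are all assumptions, since the post does not say which cutoffs were used.

```python
def filter_by_cpm(counts, min_cpm, min_samples):
    """Keep tags whose counts-per-million reach min_cpm in >= min_samples libraries.

    counts maps a tag name to a list of raw counts, one per library.
    Library sizes are the column totals over all tags, before filtering.
    """
    lib_sizes = [sum(column) for column in zip(*counts.values())]
    kept = {}
    for tag, row in counts.items():
        cpms = [c * 1e6 / n for c, n in zip(row, lib_sizes)]
        if sum(v >= min_cpm for v in cpms) >= min_samples:
            kept[tag] = row
    return kept


# Invented counts for three tags in three libraries (one million reads each).
toy_counts = {
    "tagA": [500000, 600000, 550000],
    "tagB": [499999, 399999, 449999],
    "tagC": [1, 1, 1],  # 1 CPM in every library
}

# Hypothetical cutoffs: at least 2 CPM in at least 2 libraries.
kept = filter_by_cpm(toy_counts, min_cpm=2.0, min_samples=2)
print(sorted(kept))
```

Note that the set of tags surviving such a filter depends on which samples are included, which is one reason the gene count can differ between comparisons.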
>
> => results which make me wonder about what is happening in the R
> (experiment) samples:
>
> Comparison A (ALL vs ALL): TWO genes with significant FDR (BH-corrected
> p-value, as I understand it)
> Comparison B (ALL-3C vs 1R,3R): 2099 genes with significant FDR (!)
> Comparison C (ALL-3C vs 2R,4R,5R): 20 genes with significant FDR
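
On the "overcorrection" question, it may help to see how a Benjamini-Hochberg adjusted p-value depends on the whole set of p-values, not only on the gene's own raw p-value. The following is a small pure-Python sketch of the standard BH step-up adjustment. The function name `bh_adjust` and the two artificial p-value lists are assumptions for illustration, and this is not taken from edgeR's source.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Artificial example: 1000 tests, the gene of interest has raw p = 0.001.
sparse = [0.001] + [0.5] * 999       # it is the only small p-value
dense = [0.001] * 300 + [0.5] * 700  # 300 genes share that small p-value

fdr_sparse = bh_adjust(sparse)[0]
fdr_dense = bh_adjust(dense)[0]

print(f"only small p-value: adjusted = {fdr_sparse:.4f}")
print(f"one of 300:         adjusted = {fdr_dense:.4f}")
```

The same raw p-value passes a 5% FDR cutoff when many other genes also have small p-values and fails when it stands alone, so a large swing in the number of significant genes between comparisons is what the procedure is expected to produce, not an overcorrection.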
>
> Now, excuse my ignorance, but this is a rather strong effect of
> subsetting one of the two comparison datasets on the FDR, which I did
> not find in many other similar analyses. In fact, when I first mailed
> the list, I was talking about 'overcorrection for multiple tests'.
>
> Is there any reasonable explanation for this (apart from {1R,3R} and
> {2R,4R,5R} being totally different samples, which I exclude)? Maybe a
> strong dependency between the genes involved in the response to the
> stimulus in the two R subgroups?
>
> I include below the three MDS plots. Thanks for any answer and, again,
> excuse me: maybe there is a trivial reason for this (such as the number
> of samples), but it is a unique situation among the many SAGE
> experiments I have analyzed with edgeR.
>
> Kind regards,
>
> Alessandro
>
> --
>
> Alessandro Guffanti - Head, Bioinformatics, Genomnia srl
> Via Nerviano, 31 - 20020 Lainate, Milano, Italy
> Ph: +39-0293305.702 Fax: +39-0293305.777
> http://www.genomnia.com
> "When you're curious, you find lots of interesting things to do."
> (Walt Disney)
>