Dear all,
I use edgeR for differential analysis of ChIP-seq densities.
I would like to know how to export the results from 'exactTest',
including p-values and FDR.
Using 'topTags', I only have the 10 first results, with p-values and
FDR.
But when I do 'write.table(mytest$table,
)', it gives me only the complete set of p-values, but not the FDR.
Thanks for your help,
Best regards,
helen
[[alternative HTML version deleted]]
Hi helen-
After calling exactTest, you can call topTags with the result. topTags
takes a parameter, n, which is the number of results to return -- if
you
set it to the total number of sites you are comparing, you will get
them
all back with their associated stats. I end up doing something like:
> results = topTags(de$db,nrow(db$counts))$table
to get everything back.
If you are using edgeR to do differential binding analysis of ChIP-seq
data, you may also want to check out the DiffBind package, which add
functions to make this sort of thing easier.
Cheers-
Rory
On 25/01/2012 15:14, "NEIL-BERNET Helen" <helen.neil-bernet at="" cea.fr="">
wrote:
>Dear all,
>
>I use edgeR for differential analysis of ChIP-seq densities.
>I would like to know how to export the results from 'exactTest',
>including p-values and FDR.
>Using 'topTags', I only have the 10 first results, with p-values and
FDR.
>But when I do 'write.table(mytest$table, ?)', it gives me only the
>complete set of p-values, but not the FDR.
>
>Thanks for your help,
>
>Best regards,
>
>helen
>
>
>
> [[alternative HTML version deleted]]
>
NOTICE AND DISCLAIMER
This e-mail (including any attachments) is intended for
...{{dropped:16}}
Rory,
Thanks for your answer.
I've already tried DiffBind, it gave me roughly the same results as
edgeR,
hopefully!
Indeed, it's easier for ChIP-seq, but I couldn't address 2 things:
1) I need to use quantile normalization (with limma package) prior to
differential analysis, I'm not sure that's possible to use directly my
own
normalized counts. Moreover, I don't use peak calling, because I study
specific regions, for example promoter proximal regions, so I don't
have
any affinity score.
2) I couldn't find a way to analyze my data without replicates, and
that's
a critical point for ChIP-seq, most of the times, we don't have any
replicates!
If you can answer these questions, I would really appreciate to work
with
DiffBind instead of edgeR!!
Thank you very much,
helen
Le 25/01/12 16:33, << Rory Stark >> <rory.stark at="" cancer.org.uk=""> a
?crit :
>Hi helen-
>
>After calling exactTest, you can call topTags with the result.
topTags
>takes a parameter, n, which is the number of results to return -- if
you
>set it to the total number of sites you are comparing, you will get
them
>all back with their associated stats. I end up doing something like:
>
>> results = topTags(de$db,nrow(db$counts))$table
>
>
>to get everything back.
>
>If you are using edgeR to do differential binding analysis of ChIP-
seq
>data, you may also want to check out the DiffBind package, which add
>functions to make this sort of thing easier.
>
>Cheers-
>Rory
>
>On 25/01/2012 15:14, "NEIL-BERNET Helen" <helen.neil-bernet at="" cea.fr=""> wrote:
>
>>Dear all,
>>
>>I use edgeR for differential analysis of ChIP-seq densities.
>>I would like to know how to export the results from 'exactTest',
>>including p-values and FDR.
>>Using 'topTags', I only have the 10 first results, with p-values and
FDR.
>>But when I do 'write.table(mytest$table, ?)', it gives me only the
>>complete set of p-values, but not the FDR.
>>
>>Thanks for your help,
>>
>>Best regards,
>>
>>helen
>>
>>
>>
>> [[alternative HTML version deleted]]
>>
>
>
>NOTICE AND DISCLAIMER
>This e-mail (including any attachments) is intended for the above-
named
>person(s). If you are not the intended recipient, notify the sender
>immediately, delete this email from your system and do not disclose
or
>use for any purpose.
>
>We may monitor all incoming and outgoing emails in line with current
>legislation. We have taken steps to ensure that this email and
>attachments are free from any virus, but it remains your
responsibility
>to ensure that viruses do not adversely affect you.
>Cancer Research UK
>Registered in England and Wales
>Company Registered Number: 4325234.
>Registered Charity Number: 1089464 and Scotland SC041666
>Registered Office Address: Angel Building, 407 St John Street, London
>EC1V 4AD.
Hi helen-
Addressing first the use of pre-defined regions instead of peak
callers,
this is possible with DiffBind. The easiest way to to write out the
regions in a single file (chr, start, end, 1) and specify the same
file
for all the samples. Alternatively, if you are creating the region
file
dynamically, you can create a single "dummy" peak file and specify it
for
each sample, e.g.
chr1 1 5 1
chr1 10 15 1
Then, in the call to dba.counts, specify your dynamically computed set
of
regions as the "peaks" parameter, and DiffBind will use those regions.
We
do this all the time, e.g. get all the transcription start sites, then
take windows 1000bp upstream and 500bp downstream of each TSS, and
pass
that in to dba.count -- no peak calling needed.
The alternative normalization issue is more difficult. I assume you
are
quantile normalizing the read counts for all your regions prior to
sending
them to edgeR (by creating a DGEList). Do you do the TMM normalization
after that (calcNormFactors) or somehow fake out the normalization
factors? Or use another normalization method? edgeR includes an
"upperquartile" normalization option, is that not sufficient? I'd
really
like to understand what you do, as we are working on the next version
of
DiffBind and providing more flexibility in normalization is an area we
are
looking at.
Finally, replicates. While DiffBind does let you run with no
replicates
(it just gives you a nasty warning), you really, really, really need
replicates for this kind of analysis. Really. edgeR needs replicates
to
compute dispersion for the negative binomial -- without replicates it
just
sets the dispersion to zero. Even assuming that technical (sequencing)
effects are negligible, the biological variation and experimental
variance
in ChIP efficiency is such that you can have no confidence that any
differential binding you find is meaningful without replicates. ChIP,
thanks in part to its use of antibodies, has greater experimental
variation than RNA expression assays, and a differential expression
analysis without replicates would be laughed out of any peer-reviewed
journal.
We need to get away from it being acceptable to consider a comparative
ChIP-seq analysis without replicates -- even "just to see if there's
anything there". We need to encourage the lab scientists to price this
in;
here at CRI we basically refuse to do anything but the most simple
analysis on non-replicated ChIPs.
Cheers-
Rory
On 26/01/2012 10:44, "NEIL-BERNET Helen" <helen.neil-bernet at="" cea.fr="">
wrote:
>Rory,
>
>Thanks for your answer.
>I've already tried DiffBind, it gave me roughly the same results as
edgeR,
>hopefully!
>Indeed, it's easier for ChIP-seq, but I couldn't address 2 things:
>
>1) I need to use quantile normalization (with limma package) prior to
>differential analysis, I'm not sure that's possible to use directly
my own
>normalized counts. Moreover, I don't use peak calling, because I
study
>specific regions, for example promoter proximal regions, so I don't
have
>any affinity score.
>
>2) I couldn't find a way to analyze my data without replicates, and
that's
>a critical point for ChIP-seq, most of the times, we don't have any
>replicates!
>If you can answer these questions, I would really appreciate to work
with
>DiffBind instead of edgeR!!
>
>Thank you very much,
>
>helen
>
>
>
>Le 25/01/12 16:33, << Rory Stark >> <rory.stark at="" cancer.org.uk=""> a
?crit :
>
>>Hi helen-
>>
>>After calling exactTest, you can call topTags with the result.
topTags
>>takes a parameter, n, which is the number of results to return -- if
you
>>set it to the total number of sites you are comparing, you will get
them
>>all back with their associated stats. I end up doing something like:
>>
>>> results = topTags(de$db,nrow(db$counts))$table
>>
>>
>>to get everything back.
>>
>>If you are using edgeR to do differential binding analysis of ChIP-
seq
>>data, you may also want to check out the DiffBind package, which add
>>functions to make this sort of thing easier.
>>
>>Cheers-
>>Rory
>>
>>On 25/01/2012 15:14, "NEIL-BERNET Helen" <helen.neil-bernet at="" cea.fr="">
>>wrote:
>>
>>>Dear all,
>>>
>>>I use edgeR for differential analysis of ChIP-seq densities.
>>>I would like to know how to export the results from 'exactTest',
>>>including p-values and FDR.
>>>Using 'topTags', I only have the 10 first results, with p-values
and
>>>FDR.
>>>But when I do 'write.table(mytest$table, ?)', it gives me only the
>>>complete set of p-values, but not the FDR.
>>>
>>>Thanks for your help,
>>>
>>>Best regards,
>>>
>>>helen
>>>
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>
>>
>>NOTICE AND DISCLAIMER
>>This e-mail (including any attachments) is intended for the above-
named
>>person(s). If you are not the intended recipient, notify the sender
>>immediately, delete this email from your system and do not disclose
or
>>use for any purpose.
>>
>>We may monitor all incoming and outgoing emails in line with current
>>legislation. We have taken steps to ensure that this email and
>>attachments are free from any virus, but it remains your
responsibility
>>to ensure that viruses do not adversely affect you.
>>Cancer Research UK
>>Registered in England and Wales
>>Company Registered Number: 4325234.
>>Registered Charity Number: 1089464 and Scotland SC041666
>>Registered Office Address: Angel Building, 407 St John Street,
London
>>EC1V 4AD.
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at r-project.org
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor
NOTICE AND DISCLAIMER
This e-mail (including any attachments) is intended for
...{{dropped:16}}
Hi Helen,
You can do this manually. Using the example from the exactTest()
docs:
-----
# generate raw counts from NB, create list object
y <- matrix(rnbinom(80,size=1/0.2,mu=10),nrow=20,ncol=4)
rownames(y) <- paste("Gene",1:nrow(y),sep=".")
group <- factor(c(1,1,2,2))
d <- DGEList(counts=y,group=group,lib.size=rep(1000,4))
# estimate dispersions and find differences in expression
d <- estimateCommonDisp(d)
d <- estimateTagwiseDisp(d)
de <- exactTest(d)
-----
Then, something like:
outtab <- cbind(de$table, FDR=p.adjust(de$table$p.value,method="BH"))
write.table(outtab, "my.txt" )
or:
tt <- topTags(de, n=Inf)
write.table(tt, "my.txt" )
The latter is sorted, the former isn't. Hope that helps.
Regards,
Mark
On 25.01.2012, at 16:14, NEIL-BERNET Helen wrote:
> Dear all,
>
> I use edgeR for differential analysis of ChIP-seq densities.
> I would like to know how to export the results from 'exactTest',
including p-values and FDR.
> Using 'topTags', I only have the 10 first results, with p-values and
FDR.
> But when I do 'write.table(mytest$table, ?)', it gives me only the
complete set of p-values, but not the FDR.
>
> Thanks for your help,
>
> Best regards,
>
> helen
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland
v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y11-J-16
w: http://tiny.cc/mrobin
Mark,
topTags(de, n=Inf) is perfect! I could obtain the full table.
Thanks a lot!
helen
Le 25/01/12 16:28, ? Mark Robinson ? <mark.robinson at="" imls.uzh.ch=""> a
?crit :
>Hi Helen,
>
>You can do this manually. Using the example from the exactTest()
docs:
>
>-----
> # generate raw counts from NB, create list object
> y <- matrix(rnbinom(80,size=1/0.2,mu=10),nrow=20,ncol=4)
> rownames(y) <- paste("Gene",1:nrow(y),sep=".")
> group <- factor(c(1,1,2,2))
> d <- DGEList(counts=y,group=group,lib.size=rep(1000,4))
>
> # estimate dispersions and find differences in expression
> d <- estimateCommonDisp(d)
> d <- estimateTagwiseDisp(d)
> de <- exactTest(d)
>-----
>
>Then, something like:
>
>outtab <- cbind(de$table, FDR=p.adjust(de$table$p.value,method="BH"))
>write.table(outtab, "my.txt" )
>
>or:
>
>tt <- topTags(de, n=Inf)
>write.table(tt, "my.txt" )
>
>The latter is sorted, the former isn't. Hope that helps.
>
>Regards,
>Mark
>
>
>On 25.01.2012, at 16:14, NEIL-BERNET Helen wrote:
>
>> Dear all,
>>
>> I use edgeR for differential analysis of ChIP-seq densities.
>> I would like to know how to export the results from 'exactTest',
>>including p-values and FDR.
>> Using 'topTags', I only have the 10 first results, with p-values
and
>>FDR.
>> But when I do 'write.table(mytest$table, ?)', it gives me only the
>>complete set of p-values, but not the FDR.
>>
>> Thanks for your help,
>>
>> Best regards,
>>
>> helen
>>
>>
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>----------
>Prof. Dr. Mark Robinson
>Bioinformatics
>Institute of Molecular Life Sciences
>University of Zurich
>Winterthurerstrasse 190
>8057 Zurich
>Switzerland
>
>v: +41 44 635 4848
>f: +41 44 635 6898
>e: mark.robinson at imls.uzh.ch
>o: Y11-J-16
>w: http://tiny.cc/mrobin
>