DESeq vs current version of Cuffdiff


Richard Friedman ★ 2.0k

@richard-friedman-513

Last seen 9.6 years ago

Dear Bioconductor list,

Some time ago Simon Anders explained the difference between DESeq and Cuffdiff as follows:

"If you have two samples, cuffdiff tests, for each transcript, whether there is evidence that the concentration of this transcript is not the same in the two samples.

If you have two different experimental conditions, with replicates for each condition, DESeq tests, whether, for a given gene, the change in expression strength between the two conditions is large as compared to the variation within each replicate group."

Current language on the Cuffdiff site suggests that the current version of that program tests whether the change is significant compared to the changes within each condition.

http://cufflinks.cbcb.umd.edu/howitworks.html#hdif
http://cufflinks.cbcb.umd.edu/howitworks.html#reps

Can someone please comment on the relative merits of Cuffdiff and DESeq? I ask here because our sequencing core delivers results based on Cuffdiff and I want to know whether I should redo the analysis using DESeq. I would greatly appreciate any guidance in this matter.

Thanks and best wishes,
Rich

------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist, Biomedical Informatics Shared Resource
Herbert Irving Comprehensive Cancer Center (HICCC)
Lecturer, Department of Biomedical Informatics (DBMI)
Educational Coordinator, Center for Computational Biology and Bioinformatics (C2B2) / National Center for Multiscale Analysis of Genomic Networks (MAGNet)
Room 824, Irving Cancer Research Center
Columbia University
1130 St. Nicholas Ave
New York, NY 10032
(212)851-4765 (voice)
friedman at cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/

I am a Bayesian. When I see a multiple-choice question on a test and I don't know the answer I say "eeney-meaney-miney-moe".
Rose Friedman, Age 14
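To make the quoted distinction concrete, here is a toy base-R sketch of "change between conditions compared to variation within each replicate group". It is not DESeq's negative binomial test and not what Cuffdiff does; it is a plain ratio of the log fold change to the pooled replicate spread, and the counts and variable names are made up for illustration.

```r
# Toy counts for one gene: three replicates per condition (made-up numbers).
cond_a <- c(100, 120, 90)
cond_b <- c(210, 190, 240)

# Work on the log2 scale so that fold changes become differences.
log_a <- log2(cond_a)
log_b <- log2(cond_b)

# Between-condition change: difference of the group means (log2 fold change).
log2_fc <- mean(log_b) - mean(log_a)

# Within-group variation: pooled standard deviation of the replicates.
pooled_sd <- sqrt((var(log_a) + var(log_b)) / 2)

# The change is only convincing when it is large relative to replicate spread.
signal_to_noise <- log2_fc / pooled_sd
```

With a single sample per condition, `var()` has nothing to work with, which is why a replicate-based test needs replicates in each group.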

Sequencing Bayesian Cancer DESeq • 6.9k views

ADD COMMENT • link updated 12.2 years ago by Stephen Turner ▴ 290 • written 12.2 years ago by Richard Friedman ★ 2.0k


Tim Triche ★ 4.2k

@tim-triche-3561

Last seen 3.6 years ago

United States

Not directly relevant to gene-level RNA-seq DE calls, but rather to exon-level DE: I found it useful to read this:

http://precedings.nature.com/documents/6837/version/1

In particular, section 4.3 on page 11, and supplementary figures S7 and S8 on page 19.

I was informed by a coworker that since everyone uses Bowtie-TopHat-Cufflinks-Cuffdiff, it is the sensible thing to do. Conversations with people who know what they are doing (Terry Speed & BCGSC) suggest the matter is not yet settled. So I retrieved ~1TB of BAMs, extracted the reads, and started looking into how that compares to DEXSeq and/or subread.

It would be incredibly informative if the Cufflinks and DEXSeq authors had time to weigh in on their strengths/weaknesses. DEXSeq & cummeRbund both offer nice tools for exploring the results; I am curious which pipeline fits best for my needs.

Thanks for bringing this up.

--
A model is a lie that helps you see the truth.
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>

ADD COMMENT • link 12.2 years ago Tim Triche ★ 4.2k


Stephen Turner ▴ 290

@stephen-turner-4916

Last seen 5.7 years ago

United States

I'm also in the same boat as Rich. I run a new bioinformatics core here and I'm building a pipeline for RNA-seq. Cufflinks has supported biological replicates for some time, and I'm also curious about the relative merits of using bowtie/tophat-cufflinks-cuffmerge-cuffdiff-cummeRbund versus using tophat-HTSeq?-customScriptForCreatingMatrix?-DESeq. Cufflinks also gives me a host of other tests (differential splicing load, differential TSS usage, differential coding output, etc.), which also seem useful for certain applications.

On a related note, does anyone have a workflow for taking multiple BAM files, running HTSeq-count (or another program), plus some other program or custom script, to produce a matrix of counts as input to DESeq?

Stephen

-----------------------------------------
Stephen D. Turner, Ph.D.
bioinformatics at virginia.edu
Bioinformatics Core Director
University of Virginia School of Medicine
bioinformatics.virginia.edu
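One possible shape for the "custom script" step asked about above, as a minimal base-R sketch rather than an established workflow. It assumes each htseq-count run wrote a two-column, tab-separated file (feature ID, count) and that all files list the same features in the same order; the helper name `read_htseq_counts` and the toy files are hypothetical.

```r
# Hypothetical helper: merge per-sample htseq-count files into one matrix.
# `files` is a named character vector; the names become the column names.
read_htseq_counts <- function(files) {
  tabs <- lapply(files, read.delim, header = FALSE,
                 col.names = c("feature", "count"),
                 stringsAsFactors = FALSE)
  # All files must list the same features in the same order.
  features <- tabs[[1]]$feature
  same <- vapply(tabs, function(x) identical(x$feature, features), logical(1))
  stopifnot(all(same))
  counts <- do.call(cbind, lapply(tabs, function(x) x$count))
  rownames(counts) <- features
  # Drop htseq-count's summary rows; newer versions prefix them with "__".
  special <- c("no_feature", "ambiguous", "too_low_aQual",
               "not_aligned", "alignment_not_unique")
  keep <- !(sub("^__", "", features) %in% special)
  counts[keep, , drop = FALSE]
}

# Toy input: two tiny files standing in for real htseq-count output.
tmp <- tempfile(c("ctrl_", "treat_"), fileext = ".txt")
writeLines(c("geneA\t10", "geneB\t0", "no_feature\t5"), tmp[1])
writeLines(c("geneA\t25", "geneB\t3", "no_feature\t7"), tmp[2])

count_matrix <- read_htseq_counts(c(ctrl = tmp[1], treat = tmp[2]))
```

The result is an integer matrix with one row per feature and one column per sample, which is the general layout a count-based DE package expects as input.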

ADD COMMENT • link 12.2 years ago Stephen Turner ▴ 290


Dear Stephen,

To your related note, you could have a look at the easyRNASeq package (BioC 2.10) for R (2.15). It reads in your annotation and your BAM files and generates a count table for DESeq, all in R. It can actually do the first step of DESeq (estimating library size factors and dispersion) and give you back a normalized CountDataSet object plus some validation plots as described in the DESeq vignettes (the same is true for edgeR). I'm about to push some changes to SVN to correct an issue that prevented the package vignette from being built. The package should be available in a couple of days as a binary, or you could install it directly from SVN. See http://wiki.fhcrc.org/bioc/SvnHowTo. The package URL is: https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/easyRNASeq

Getting the proper set of annotations is definitely the most important step in the whole process and the one that requires the most attention; the rest is then pretty straightforward. What I mean by annotation is the description of your features of interest, be they genes, transcripts, exons, enhancers, etc., as genomic loci (chr, start, width, etc.).

The main issue in defining the annotation is to avoid counting reads multiple times, and the kind of annotation needed of course depends on your project. If you are interested in differential expression of isoforms, you probably want to define synthetic exons (to avoid double counting) and process the resulting count table with DEXSeq. If you're looking at gene expression, you would want to create gene models to avoid multiple counting and use these to create your count table. If you are interested in eRNAs, you can define enhancer loci as the count "feature". All this can be done relatively easily in R. Once you have the annotation that suits your needs, running easyRNASeq is very straightforward. easyRNASeq accepts both RangedData and GRangesList as annotation input, among other formats. easyRNASeq is in addition able to fetch annotations for you from different sources, but most of the time these would need to be post-processed. You can look at my post "using easyRNASeq examples" from 2 days ago for some examples and comments in addition to the vignette content.

Cheers,

Nico

---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
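The double-counting point can be illustrated with a small base-R sketch. This is not how easyRNASeq or DEXSeq build their annotation; it shows only one simple reading of a "gene model", namely the union of overlapping exons on a single chromosome and strand. The helper name `merge_intervals` and the coordinates are made up.

```r
# Hypothetical helper: collapse overlapping intervals (closed, 1-based) on one
# chromosome and strand into non-overlapping ranges.
merge_intervals <- function(start, end) {
  ord <- order(start, end)
  start <- start[ord]
  end <- end[ord]
  # A new block begins whenever an interval starts after every earlier end.
  running_end <- cummax(end)
  new_block <- c(TRUE, start[-1] > running_end[-length(end)])
  block <- cumsum(new_block)
  data.frame(start = as.vector(tapply(start, block, min)),
             end   = as.vector(tapply(end, block, max)))
}

# Exons from two overlapping transcripts of the same gene (toy coordinates).
exon_start <- c(100, 150, 400, 450)
exon_end   <- c(200, 300, 500, 480)

gene_model <- merge_intervals(exon_start, exon_end)
```

A read falling in 150-200 overlaps two of the original exons but only one merged range, so it is counted once. Other definitions of synthetic exons are possible, for example splitting at every exon boundary instead of taking the union.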

ADD REPLY • link 12.2 years ago delhomme@embl.de ★ 1.2k


Nicolas,

Simple question for you: where can I find Biobase 2.15.3? I've already installed genomeIntervals 1.11.0, but I can't seem to hunt this down (Mac).

Thanks,

Stephen

    > sessionInfo()
    R version 2.14.0 (2011-10-31)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    locale:
    [1] C/en_US.UTF-8/C/C/C/C

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] genomeIntervals_1.11.0 intervals_0.13.3       BiocInstaller_1.2.1

    loaded via a namespace (and not attached):
    [1] Biobase_2.14.0 tools_2.14.0

ADD REPLY • link 12.2 years ago Stephen Turner ▴ 290


Hi,

On Fri, Feb 17, 2012 at 4:30 PM, Stephen Turner <vustephen at gmail.com> wrote:
> Simple question for you: where can I find Biobase 2.15.3? I've already
> installed genomeIntervals 1.11.0, but I can't seem to hunt this down
> (Mac).

You will need to be running the development version of R. Luckily for you, this is easy to install on a Mac via this package that Simon Urbanek kindly provides:

http://r.research.att.com/R-devel-leopard.pkg

After installation, run R again and make sure the "welcome banner" shown when you start R says something like:

    R Under development (unstable) (...)
    Copyright (C) 2011 The R Foundation for Statistical Computing
    ...

Then install Bioconductor packages as usual.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

ADD REPLY • link 12.2 years ago Steve Lianoglou ★ 13k


On 02/14/2012 06:02 AM, Stephen Turner wrote:
> I'm also in the same boat as Rich. I run a new bioinformatics core
> here and I'm building a pipeline for RNA-seq. Cufflinks for some time
> has supported biological replicates, and I'm also curious about the
> relative merits of using
> bowtie/tophat-cufflinks-cuffmerge-cuffdiff-cummeRbund versus using
> tophat-HTSeq?-customScriptForCreatingMatrix?-DESeq.

The cuffdiff route provides analysis of two-group experiments, whereas a counts table provides the raw material for richer experimental designs, clustering / classification, exploratory analysis, methods development, etc. The counts table route is more challenging in the initial stages of de novo discovery.

> Cufflinks also gives me a host of other tests (differential splicing
> load, differential TSS usage, differential coding output, etc), which
> also seem useful for certain applications.
>
> On a related note, does anyone have a workflow for taking multiple bam
> files, running HTSeq-count (or another program), plus some other
> program or custom script, to produce a matrix of counts as input to
> DESeq?

In addition to Nico's suggestion, the workflow for 'counting' likely involves annotation input (via rtracklayer::import, TxDb.* packages, GenomicFeatures::make*, biomaRt, etc.); BAM input (via Rsamtools, GenomicRanges::readGappedAlignments, etc.); and simple counting (via countOverlaps or GenomicRanges::summarizeOverlaps). A simple(istic?) counter for unique hits to single-end reads is

    library(GenomicRanges)
    ## read annotations into a GRanges or GRangesList, 'annotations'
    ## point to BAM files as a (named) character vector, 'files'
    counter <- function(file, anno) {
        ## read the alignments from one BAM file
        ga <- readGappedAlignments(file)
        ## number of annotated features each read overlaps
        hits <- countOverlaps(ga, anno)
        ## count, per feature, only the reads that hit exactly one feature
        countOverlaps(anno, ga[hits == 1])
    }
    counts <- sapply(files, counter, annotations)

If rows of counts represent genes in a DESeq / edgeR analysis, then perhaps the 'hits' part of the counter can be omitted. Preceding this with

    library(parallel)
    sapply <- function(...) simplify2array(parallel::mclapply(...))

makes this run in parallel (though memory consumption might become an issue).

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024
Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
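For readers without a BAM file at hand, the two-step logic of the counter above can be traced on toy data in base R. This sketch does not use GenomicRanges; `count_overlaps` is a hypothetical stand-in for `countOverlaps()` on closed intervals from a single chromosome, and all coordinates are made up.

```r
# Toy stand-in for countOverlaps(): for each query interval, the number of
# subject intervals it overlaps (closed intervals, single chromosome).
count_overlaps <- function(q_start, q_end, s_start, s_end) {
  vapply(seq_along(q_start), function(i) {
    sum(q_start[i] <= s_end & q_end[i] >= s_start)
  }, integer(1))
}

# Two overlapping features and four reads (made-up coordinates).
feat_start <- c(100, 180)
feat_end   <- c(200, 300)
read_start <- c(110, 185, 250, 400)
read_end   <- c(145, 220, 285, 435)

# Step 1: how many features does each read hit?
hits <- count_overlaps(read_start, read_end, feat_start, feat_end)

# Step 2: keep reads hitting exactly one feature, then count per feature.
unique_reads <- hits == 1
counts <- count_overlaps(feat_start, feat_end,
                         read_start[unique_reads], read_end[unique_reads])
```

Here the second read overlaps both features and the fourth overlaps none, so both are dropped and each feature ends up with a count of 1. Without the `hits == 1` filter, the ambiguous read would be counted for both features.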

ADD REPLY • link 12.2 years ago Martin Morgan 25k
