variation between cells compared to samples


Pete Shepard ▴ 240

@pete-shepard-3324

Last seen 9.6 years ago

Hi All,

I am comparing four RNA-seq experiments. Experiments 1 and 2 were done using protocol A, and experiments 3 and 4 were done using protocol B. Experiments 1 and 3 used stem cells, and experiments 2 and 4 used neural cells. I would like to see whether there is more variation between the two protocols than between the two cell types. I have used the DESeq package to plot the squared coefficient of variation (SCV) against the base mean, but I am wondering whether there is a single metric I can use to compare the variations.

RNASeq • 1.2k views

ADD COMMENT • link updated 14.2 years ago by Wolfgang Huber ★ 13k • written 14.2 years ago by Pete Shepard ▴ 240


Steve Lianoglou ★ 13k

@steve-lianoglou-2771

Last seen 13 months ago

United States

Hi,

On Fri, Feb 19, 2010 at 1:23 PM, Pete Shepard wrote:
> I have used the DESEQ package to plot the squared coefficient of variation
> against the base mean but I am wondering if there is a single metric I can
> use to compare the variations?

Here's a simple/gross test: what if you calculate the gene expression (let's say RPKM) of the genes in each dataset, then cluster the samples? Do the samples cluster/pair off by cell line, or by protocol?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

ADD COMMENT • link 14.2 years ago Steve Lianoglou ★ 13k
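The clustering check suggested above can be sketched in base R. This is only an illustration on simulated data: the sample names, the effect sizes, and the choice of Euclidean distance with average linkage are all assumptions, not something taken from the thread. With real data, the simulated matrix would be replaced by log-transformed expression values (for example log2 RPKM) with one column per experiment.

```r
# Toy 2x2 design: four samples, two protocols, two cell types.
# All names and effect sizes below are hypothetical.
set.seed(1)
n_genes <- 500
samples <- data.frame(
  protocol  = c("A", "A", "B", "B"),
  cell      = c("stem", "neural", "stem", "neural"),
  row.names = c("exp1", "exp2", "exp3", "exp4")
)

base     <- rnorm(n_genes, mean = 6, sd = 2)  # log2 baseline expression
cell_eff <- rnorm(n_genes, sd = 1.5)          # assumed strong cell-type effect
prot_eff <- rnorm(n_genes, sd = 0.5)          # assumed weaker protocol effect

# One column of simulated log2 expression per sample
log_expr <- sapply(seq_len(nrow(samples)), function(i) {
  base +
    (samples$cell[i] == "neural") * cell_eff +
    (samples$protocol[i] == "B") * prot_eff +
    rnorm(n_genes, sd = 0.2)                  # residual noise
})
colnames(log_expr) <- rownames(samples)

# Cluster the samples (columns), then cut the tree into two groups
d      <- dist(t(log_expr))
hc     <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)

print(split(names(groups), groups))
plot(hc, main = "Sample clustering (simulated data)")
```

If the two groups correspond to cell type, the cell-type difference dominates; if they correspond to protocol, the protocol difference does. In this simulation the cell-type effect was made larger on purpose, so the samples pair off by cell type.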


Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 9 days ago

EMBL European Molecular Biology Laborat…

Hi Pete,

1. What do the two SCV curves look like if you plot them in one panel?
2. As Steve suggests, you could compute the 4x4 distance matrix between all pairs of experiments, and perhaps visualise the distances with multidimensional scaling or a dendrogram / hierarchical clustering. For this, I'd use the variance stabilising transformation as described in Section 7 ("Sample Clustering") of the vignette, or the man page of the "getVarianceStabilizedData" function.

Best wishes
Wolfgang

ADD COMMENT • link 14.2 years ago Wolfgang Huber ★ 13k
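As a rough illustration of the distance-matrix idea, the sketch below uses base R only. The matrix `vsd` is a simulated stand-in for variance-stabilised data (in practice it would come from the DESeq function named above), and the ratio computed at the end is one possible single-number summary, an assumption of this sketch rather than a metric proposed in the thread.

```r
# Simulated stand-in for a genes x samples matrix of variance-stabilised values.
# Sample layout follows the question: exp1/exp2 = protocol A, exp3/exp4 = protocol B,
# exp1/exp3 = stem cells, exp2/exp4 = neural cells. Effect sizes are hypothetical.
set.seed(42)
n_genes  <- 1000
protocol <- c("A", "A", "B", "B")
cell     <- c("stem", "neural", "stem", "neural")

vsd <- matrix(rnorm(n_genes * 4, sd = 0.3), ncol = 4,
              dimnames = list(NULL, paste0("exp", 1:4)))
cell_shift <- rnorm(n_genes, sd = 1.0)   # assumed cell-type effect per gene
prot_shift <- rnorm(n_genes, sd = 0.4)   # assumed protocol effect per gene
vsd[, cell == "neural"] <- vsd[, cell == "neural"] + cell_shift
vsd[, protocol == "B"]  <- vsd[, protocol == "B"]  + prot_shift

# 4x4 Euclidean distance matrix between experiments
dmat <- as.matrix(dist(t(vsd)))

# Pairs differing only in protocol (same cell type): exp1-exp3, exp2-exp4
d_protocol <- mean(c(dmat["exp1", "exp3"], dmat["exp2", "exp4"]))
# Pairs differing only in cell type (same protocol): exp1-exp2, exp3-exp4
d_cell     <- mean(c(dmat["exp1", "exp2"], dmat["exp3", "exp4"]))

# Ratio > 1: protocol separates samples more; ratio < 1: cell type does
ratio <- d_protocol / d_cell
print(round(dmat, 1))
print(ratio)

# Classical multidimensional scaling for a 2-D view of the same distances
mds <- cmdscale(dmat, k = 2)
plot(mds, type = "n", xlab = "MDS 1", ylab = "MDS 2")
text(mds, labels = rownames(mds))
```

With only one sample per protocol and cell-type combination, such a ratio is descriptive only; it carries no estimate of its own uncertainty.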


Thanks Wolfgang,

The distance matrix works nicely; however, I am having trouble plotting the two SCV curves in one panel. Any suggestions? I can build a count data set based on one set of conditions:

    conds <- c( "Old", "Old", "New", "New" )
    cds <- newCountDataSet( countsTable, conds )

obtain the variance functions:

    cds <- estimateVarianceFunctions( cds )

and plot them:

    scvPlot( cds )

I can then change the conditions to conds <- c( "Stem", "Stem", "Neuron", "Neuron" ) and repeat the same steps. But I am having trouble plotting the two sets of variances against each other on the same graph.

P

ADD REPLY • link 14.2 years ago Pete Shepard ▴ 240


Hi Pete

On Wed, 24 Feb 2010 07:19:55 -0800, Pete Shepard wrote:
> The distance matrix works nicely however, I am having trouble plotting the
> two scv curves to one panel, any suggestions.

If you are only interested in the raw variance (i.e., the solid lines in the SCV plot, which show the variance from sample differences, without the shot noise coming from the counting), you can easily make the plot without using scvPlot and then customize it to your liking. Just use 'rawVarFunc' to get raw variance estimates for a condition, as in this example:

    library( DESeq )
    cds <- makeExampleCountDataSet( )
    cds <- estimateSizeFactors( cds )
    cds <- estimateVarianceFunctions( cds )
    # grid of mean values, evenly spaced on a log scale
    xg <- 10^seq( 0, 3, length.out=100 )
    # raw SCV = raw variance / mean^2, here for condition "A"
    plot( xg, rawVarFunc( cds, "A" )( xg ) / xg^2, log="x", type='l', ylim=c(0,.5) )
    # overlay condition "B" on the same axes
    lines( xg, rawVarFunc( cds, "B" )( xg ) / xg^2, col="red" )

BTW, when you made the distance matrix, did you estimate the variances with 'pool=TRUE'? This is crucial, as otherwise you have a bias towards the sample pairing that you specified for the conditions. (I hope I mention this fact in the vignette and the help page.)

Cheers
Simon

ADD REPLY • link 14.2 years ago Simon Anders ★ 3.7k
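To overlay curves from two different groupings (protocol versus cell type) rather than two conditions of one data set, the same plot/lines pattern applies to any pair of variance functions. The sketch below uses base R with two made-up closures, `raw_var_protocol` and `raw_var_cell`, standing in for the functions that 'rawVarFunc' would return from two separately fitted count data sets; their shapes and coefficients are purely hypothetical.

```r
# Hypothetical raw variance functions of the mean (stand-ins, not DESeq output).
# Each has a quadratic term (constant SCV at high counts) plus a linear term.
raw_var_protocol <- function(mu) 0.05 * mu^2 + 0.5 * mu
raw_var_cell     <- function(mu) 0.15 * mu^2 + 0.5 * mu

# Grid of mean values, evenly spaced on a log scale
xg <- 10^seq(0, 3, length.out = 100)

# Squared coefficient of variation: variance divided by squared mean
scv_protocol <- raw_var_protocol(xg) / xg^2
scv_cell     <- raw_var_cell(xg) / xg^2

# First curve sets up the axes; the second is added with lines()
plot(xg, scv_protocol, log = "x", type = "l",
     ylim = c(0, max(scv_protocol, scv_cell)),
     xlab = "base mean", ylab = "raw SCV")
lines(xg, scv_cell, col = "red")
legend("topright", legend = c("grouped by protocol", "grouped by cell type"),
       col = c("black", "red"), lty = 1)
```

Because the axes are fixed by the first `plot` call, `ylim` is chosen from both curves so that neither is clipped.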
