No significant p-values

Guest User ★ 13k

@guest-user-4897

Last seen 9.6 years ago

Hello,

I have constructed the following dataset for analysis using DESeq2:

    class: DESeqDataSet
    dim: 57396 10
    exptData(0):
    assays(1): counts
    rownames(57396): ENSG00000223972 ENSG00000227232 ... ENSG00000210195 ENSG00000210196
    rowData metadata column names(0):
    colnames(10): 1 2 ... 10 11
    colData names(1): condition

    > colData(ddsHTSeq)
    DataFrame with 10 rows and 1 column
       condition
        <factor>
    1         na
    2         na
    3  Resistant
    4         na
    5  Resistant
    6  Resistant
    7         na
    8         na
    10 Sensitive
    11 Sensitive

I am interested in the differential expression between the drug resistant and sensitive samples ('na' are control samples). I've clustered the samples and plotted a PCA as described in the vignette. However, in each of these plots the samples do not cluster by their drug sensitivity but are distributed across the plot. I don't have any more information about the samples with which to model any potential covariates.

I was wondering if there were any pointers as to how I could extract some useful meanings from these data please? As might be expected, when I try a DESeq on these data I get no significant p-values.

Thanks in advance,
Dave

-- output of sessionInfo():

    R version 3.1.0 (2014-04-10)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] pasilla_0.4.0           matrixStats_0.8.14      gplots_2.13.0
     [4] vsn_3.32.0              Biobase_2.24.0          DESeq2_1.4.5
     [7] RcppArmadillo_0.4.300.0 Rcpp_0.11.1             GenomicRanges_1.16.3
    [10] GenomeInfoDb_1.0.2      IRanges_1.22.7          BiocGenerics_0.10.0

    loaded via a namespace (and not attached):
     [1] affy_1.42.2           affyio_1.32.0         annotate_1.42.0
     [4] AnnotationDbi_1.26.0  BiocInstaller_1.14.2  bitops_1.0-6
     [7] caTools_1.17          DBI_0.2-7             DESeq_1.16.0
    [10] gdata_2.13.3          genefilter_1.46.1     geneplotter_1.42.0
    [13] grid_3.1.0            gtools_3.4.0          KernSmooth_2.23-12
    [16] lattice_0.20-29       limma_3.20.4          locfit_1.5-9.1
    [19] preprocessCore_1.26.1 RColorBrewer_1.0-5    R.methodsS3_1.6.1
    [22] RSQLite_0.11.4        splines_3.1.0         stats4_3.1.0
    [25] survival_2.37-7       tcltk_3.1.0           tools_3.1.0
    [28] XML_3.98-1.1          xtable_1.7-3          XVector_0.4.0
    [31] zlibbioc_1.10.0

DESeq DESeq • 1.8k views

ADD COMMENT • link updated 9.8 years ago by Lucia Peixoto ▴ 330 • written 9.8 years ago by Guest User ★ 13k

Michael Love 41k

@mikelove

Last seen 22 hours ago

United States

Hi Dave,

If you don't find a set of genes with low FDR, then the experiment may have been underpowered to detect small differences, i.e. the sample size was too small.

Did you compare sensitive vs resistant using the contrast argument to results()? The default comparison is the last level versus the first level of the last variable in the design, but there are three possible pairs of the three groups.

Mike
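A minimal sketch of the contrast call mentioned above, for readers following along. It assumes a current DESeq2 installation; the count matrix is simulated and the gene count, means and dispersion are made-up values, so only the condition labels come from the question.

```r
library(DESeq2)

set.seed(1)

# Condition labels as in the question; the level order is set explicitly
# so that it does not depend on the locale's sort order.
condition <- factor(
  c("na", "na", "Resistant", "na", "Resistant",
    "Resistant", "na", "na", "Sensitive", "Sensitive"),
  levels = c("na", "Resistant", "Sensitive")
)

# Simulated counts (assumption): 1000 genes, no true differences.
n_genes <- 1000
gene_means <- 2^runif(n_genes, 4, 10)
counts <- matrix(
  rnbinom(n_genes * length(condition), mu = gene_means, size = 5),
  nrow = n_genes
)

dds <- DESeqDataSetFromMatrix(
  countData = counts,
  colData   = data.frame(condition = condition),
  design    = ~ condition
)
dds <- DESeq(dds)

# Explicitly request Sensitive vs Resistant:
# c(variable name, numerator level, denominator level).
res <- results(dds, contrast = c("condition", "Sensitive", "Resistant"))
```

With this level order, calling results(dds) without a contrast would compare Sensitive against na instead, which is not the comparison of interest here.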

ADD COMMENT • link 9.8 years ago Michael Love 41k

Thanks Mike - yes I used the sensitive vs resistant contrast argument to results()

ADD REPLY • link 9.8 years ago Dave Wettmann ▴ 40

Sean Davis 21k

@sean-davis-490

Last seen 3 months ago

United States

Hi, Dave.

With an n of only 5, you might simply be underpowered to find significant genes, so increasing your sample size might be warranted. You could try using gene set analysis to look for coordinately regulated sets of genes, each with small effects. Alternatively, you could use the p-values for ranking the genes and try to validate a few genes of interest on a larger set of samples using PCR or some other technology.

Sean
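As a rough illustration of ranking by p-value, here is a base R sketch. The p-values are simulated placeholders (mostly null, plus a few weak signals), and the cutoff of 20 candidates is an arbitrary assumption; in a real analysis the p-values would come from the results table.

```r
set.seed(42)

# Hypothetical per-gene p-values: 50 weak signals, the rest uniform (null).
n_genes <- 2000
pvalue <- c(rbeta(50, 0.5, 1), runif(n_genes - 50))
names(pvalue) <- paste0("gene", seq_len(n_genes))

# Benjamini-Hochberg adjustment, the kind of FDR control behind "padj".
padj <- p.adjust(pvalue, method = "BH")

# Number of genes passing a 10% FDR threshold (can easily be zero
# when the effects are small and the sample size is low).
n_significant <- sum(padj < 0.1)

# Even with nothing significant, the raw p-values still give a ranking
# from which a short list can be taken forward for validation.
ranked <- sort(pvalue)
candidates <- names(ranked)[1:20]
```

The ranking does not make the top genes significant; it only prioritises which ones to test on an independent, larger set of samples.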

ADD COMMENT • link 9.8 years ago Sean Davis 21k

Lucia Peixoto ▴ 330

@lucia-peixoto-4203

Last seen 9.6 years ago

Hi Dave,

If in your PCA your samples do not cluster by treatment, you likely have some sort of unwanted variation or batch effect masking the effect of the treatment in your data. I am not sure more samples will help.

Have you taken a look at the PC loadings past 1 and 2 to see if there is any PC that captures your treatment? Do you have any positive controls? Are you sure your treatment actually causes measurable differences in gene expression?

The only thing I believe will help is RUVSeq:

http://www.bioconductor.org/packages/devel/bioc/html/RUVSeq.html

Lucia

--
Lucia Peixoto PhD
Postdoctoral Research Fellow
Laboratory of Dr. Ted Abel
Department of Biology
School of Arts and Sciences
University of Pennsylvania

"Think boldly, don't be afraid of making mistakes, don't miss small details, keep your eyes open, and be modest in everything except your aims."
Albert Szent-Gyorgyi

ADD COMMENT • link 9.8 years ago Lucia Peixoto ▴ 330

Hi,

On Fri, Jun 27, 2014 at 7:06 AM, Lucia Peixoto wrote:

> The only thing I believe will help is RUVSeq:
>
> http://www.bioconductor.org/packages/devel/bioc/html/RUVSeq.html

Not the only thing ... this is slightly different, but also something to keep an eye on "in this context" (i.e. removing nuisance effects):

svaseq: removing batch effects and other unwanted noise from sequencing data
http://biorxiv.org/content/early/2014/06/25/006585

Thank you for bringing my attention to RUVSeq, though, as I haven't seen it before.

HTH,
-steve

--
Steve Lianoglou
Computational Biologist
Genentech

ADD REPLY • link 9.8 years ago Steve Lianoglou ★ 13k

Thanks Lucia; I've checked that PCs 1 and 2 capture 55% and 15% of the total variance, respectively. Could you explain how, if I did find that the treatment effect was present in another PC, that would help me please? I don't have any positive control because it's an experiment to characterise a response to a drug treatment.

Thanks,
Dave

ADD REPLY • link 9.8 years ago Dave Wettmann ▴ 40

Hi Dave,

I assume you have only plotted PC1 vs PC2. You can make the same type of plot for PC1 vs PC3, PC1 vs PC4 and so on, to see whether any PC captures the grouping by treatment, regardless of how much variance each PC explains. I usually don't use DESeq to do the PCA plots, so I am not sure how you would do this within DESeq.

I do understand that you are "characterizing" a response to a drug, but your underlying assumption is that part of that response is a difference in gene expression that can be observed at the time point you are measuring. It could simply be that the differences between being drug resistant and drug sensitive have nothing to do with gene expression differences at steady state, and that is why you don't get any significant p-values. Positive controls assure you that there are differences you can measure.

Have you plotted the p-value distribution? You can find how to do it in the Nature Protocols tutorial:

http://www.nature.com/nprot/journal/v8/n9/full/nprot.2013.099.html

Lucia
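The two checks suggested above (looking at PCs beyond the first two, and plotting the p-value distribution) can be sketched in base R. Everything numeric here is a placeholder assumption: the expression matrix and p-values are random, and in practice they would be replaced by a log-scale transformed count matrix (for example assay(rlog(dds)) in recent DESeq2 versions) and the pvalue column of the results table.

```r
set.seed(7)

# Condition labels as in the question.
condition <- factor(
  c("na", "na", "Resistant", "na", "Resistant",
    "Resistant", "na", "na", "Sensitive", "Sensitive")
)

# Placeholder log-scale expression matrix: 500 genes x 10 samples.
logexpr <- matrix(rnorm(500 * 10), nrow = 500)

# PCA with samples as rows; prcomp centers each gene by default.
pca <- prcomp(t(logexpr))
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Scatterplot matrix of the first four PCs, coloured by condition,
# so that PC1 vs PC3, PC2 vs PC4 etc. are all visible at once.
pairs(
  pca$x[, 1:4],
  col    = as.integer(condition),
  pch    = 19,
  labels = sprintf("PC%d (%.0f%%)", 1:4, pct_var[1:4])
)

# Histogram of raw p-values (placeholder values here).
pvalue <- runif(500)
hist(pvalue, breaks = 20, main = "p-value distribution", xlab = "p-value")
```

A roughly flat histogram suggests little or no signal, whereas a peak near zero suggests real differences that the multiple-testing correction may be too conservative to call at this sample size.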

ADD REPLY • link 9.8 years ago Lucia Peixoto ▴ 330
