Search
Question: arrayQualityMetrics with DESeq2
0
gravatar for rachelwright8
3.6 years ago by
United States
rachelwright810 wrote:

Hi experts,

Does anyone have advice on how to use arrayQualityMetrics to identify sample outliers with DESeq data sets produced using DESeq2?  I'm having trouble generating a file that the "arrayQualityMetrics" function will accept using DESeq2 functions ("DESeq" or "varianceStabilizingTransformation" or "rlog"...).  

I've tried several things, including:

dds<-DESeqDataSetFromMatrix(countData=countData, colData=colData, design=design)

 

vsd=varianceStabilizingTransformation(dds, blind=T)

arrayQualityMetrics(vsd, intgroup=c("gen", "conditon"), force=T)

The error message tells me that arrayQualityMetrics can't "find an inherited method for function ‘platformspecific’ for signature ‘"SummarizedExperiment""... the standard output of the DESeq2 functions I'm calling.

My current work-around is to use DESeq(1) to generate the variance stabilized data, calling arrayQualityMetrics on that, and proceeding with DESeq2 omitting the sample outliers.  I worry about this strategy since DESeq2 estimates dispersions differently.

library(DESeq)
cds=newCountDataSet(countData,colData)
cds = estimateDispersions(estimateSizeFactors(cds), method="blind")
Vst = varianceStabilizingTransformation(cds)
arrayQualityMetrics(Vst, intgroup = c("gen", "condition"))

Any tips?

Thanks in advance!

Rachel

sessionInfo()

R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq_1.18.0               lattice_0.20-29            locfit_1.5-9.1            
 [4] Biobase_2.26.0             arrayQualityMetrics_3.22.0 DESeq2_1.6.3              
 [7] RcppArmadillo_0.4.600.0    Rcpp_0.11.3                GenomicRanges_1.18.3      
[10] GenomeInfoDb_1.2.4         IRanges_2.0.1              S4Vectors_0.4.0           
[13] BiocGenerics_0.12.1       

loaded via a namespace (and not attached):
 [1] acepack_1.3-3.3       affy_1.44.0           affyio_1.34.0         affyPLM_1.42.0       
 [5] annotate_1.44.0       AnnotationDbi_1.28.1  base64_1.1            base64enc_0.1-2      
 [9] BatchJobs_1.5         BBmisc_1.8            beadarray_2.16.0      BeadDataPackR_1.18.0 
[13] BiocInstaller_1.16.1  BiocParallel_1.0.0    Biostrings_2.34.1     brew_1.0-6           
[17] Cairo_1.5-5           checkmate_1.5.1       cluster_1.15.3        codetools_0.2-10     
[21] colorspace_1.2-4      DBI_0.3.1             digest_0.6.8          fail_1.2             
[25] foreach_1.4.2         foreign_0.8-62        Formula_1.1-2         gcrma_2.38.0         
[29] genefilter_1.48.1     geneplotter_1.44.0    ggplot2_1.0.0         grid_3.1.1           
[33] gridSVG_1.4-2         gtable_0.1.2          Hmisc_3.14-6          hwriter_1.3.2        
[37] illuminaio_0.8.0      iterators_1.0.7       latticeExtra_0.6-26   limma_3.22.1         
[41] MASS_7.3-37           munsell_0.4.2         nnet_7.3-8            plyr_1.8.1           
[45] preprocessCore_1.28.0 proto_0.3-10          RColorBrewer_1.1-2    reshape2_1.4.1       
[49] RJSONIO_1.3-0         rpart_4.1-8           RSQLite_1.0.0         scales_0.2.4         
[53] sendmailR_1.2-1       setRNG_2013.9-1       splines_3.1.1         stringr_0.6.2        
[57] survival_2.37-7       SVGAnnotation_0.93-1  tools_3.1.1           vsn_3.34.0           
[61] XML_3.98-1.1          xtable_1.7-4          XVector_0.6.0         zlibbioc_1.12.0   

 

ADD COMMENTlink modified 3.6 years ago • written 3.6 years ago by rachelwright810
2
gravatar for Michael Love
3.6 years ago by
Michael Love19k
United States
Michael Love19k wrote:

The technical details of why it's not working out of the box:

Because DESeq's CountDataSet extended eSet, the arrayQualityMetrics methods worked automatically. 

DESeq2's data object DESeqDataSet extends SummarizedExperiment, which is a different class, although it's a similar "shape" of object: a matrix of values (accessed with exprs/assay) with column info (pData/colData) and row info (fData/rowData).

There was some talk on the developer list about making a simple conversion function, which would help here: 

https://stat.ethz.ch/pipermail/bioc-devel/2014-September/006249.html

I think the line of code which will work for you is the following:

e = ExpressionSet(assay(vsd), AnnotatedDataFrame(colData(as.data.frame(vsd))), AnnotatedDataFrame(as.data.frame(rowData(vsd))))

Then use 'e' for arrayQualityMetrics.

ADD COMMENTlink modified 3.6 years ago • written 3.6 years ago by Michael Love19k
1
gravatar for rachelwright8
3.6 years ago by
United States
rachelwright810 wrote:

Thanks, Michael!  Though I still had an object class error:

#Dr. Love's way
e=ExpressionSet(assay(vsd), AnnotatedDataFrame(colData(vsd)), AnnotatedDataFrame(rowData(vsd)))

#Error in (function (classes, fdef, mtable)  : 
#            unable to find an inherited method for function ‘AnnotatedDataFrame’ for signature "DataFrame", "missing"’

but I think I fixed it by using just the colData of my vsd object (where samples, "gen", and "condition" are listed) and specifying it as a data.frame (for AnnotatedDataFrame()):

e=ExpressionSet(assay(vsd), AnnotatedDataFrame(as.data.frame(colData(vsd))))

arrayQualityMetrics(e, intgroup=c("gen", "condition"), force=T)

 

This worked and gave me the exact same results in arrayQualityMetrics as I was getting with DESeq, so that was a nice sanity check.  I didn't anticipate them to change much/at all.

Could you confirm that my solution wasn't a naughty thing to do?

Thanks again!

R

ADD COMMENTlink written 3.6 years ago by rachelwright810

Perfect. I forgot you would need to add as.data.frame. good save.

ADD REPLYlink written 3.6 years ago by Michael Love19k
0
gravatar for Wolfgang Huber
3.6 years ago by
EMBL European Molecular Biology Laboratory
Wolfgang Huber13k wrote:

Dear Rachel

this looks very reasonable. And arrayQualityMetrics does not need the featureData/rowData.

We really should provide something at the project / generic infrastructure level to perform these types of object conversions (esp. SummarizedExperiment -> ExpressionSet), as they can be confusing and arcane.

Wolfgang

 

ADD COMMENTlink modified 3.6 years ago • written 3.6 years ago by Wolfgang Huber13k
1

The 'multi assay' group has an implementation of this; I'll put moving this into the main stream on our tasks for the next sprint.

ADD REPLYlink written 3.6 years ago by Martin Morgan ♦♦ 22k
0
gravatar for rachelwright8
3.6 years ago by
United States
rachelwright810 wrote:

Thanks for your comments.  I'll keep an eye out for updates, but I appreciate all of your help developing a solution in the meantime.  

ADD COMMENTlink written 3.6 years ago by rachelwright810
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 183 users visited in the last hour