DEXSeqHTML Output Trouble
0
0
Entering edit mode
skasowitz • 0
@skasowitz-9938
Last seen 8.0 years ago

Hello,

I am having difficulty producing output files with DEXSeqHTML. Only <gene_name>expression.html and <gene_name>results.html files are created, and only the latter have any contents (a data table).

I can generate plots for individual genes by calling

plotDEXSeq(dxr1,"NM_007440",legend=TRUE,cex.axis=1.2,cex=1.3,lwd=2, displayTranscripts = TRUE)

I was able to generate the HTML output following the vignette with the Pasilla dataset. However, when calling DEXSeqHTML in a similar manner with my own data I am only finding the files noted above.

DEXSeqHTML(dxr1, FDR=0.05,color=c("#FF00080","#0000FF80"),fitExpToVar="condition",mart = mart,filter="refseq_mrna" , attributes="external_gene_name")

I receive the following traceback 

> traceback()
2: stop(sprintf("The value of the parameter fitExpToVar, '%s', is not a variable from the annotation of the samples. Please specify a column name of the colData slot of the DEXSeqDataSet object.", 
       fitExpToVar))
1: DEXSeqHTML(dxr1, FDR = 0.05, color = c("#FF00080", "#0000FF80"), 
       fitExpToVar = "treatment", mart = mart, filter = "refseq_mrna", 
       attributes = "external_gene_name")

and warnings (truncated, all 50 are similar)

Warning messages:
1: closing unused connection 13 (DEXSeqReport/files/NM_001002842expression.html)
2: closing unused connection 12 (DEXSeqReport/files/NM_001002764expression.html)
3: closing unused connection 11 (DEXSeqReport/files/NM_001002268expression.html)
4: closing unused connection 10 (DEXSeqReport/files/NM_001002241expression.html)
5: closing unused connection 9 (DEXSeqReport/files/NM_001002239expression.html)
6: closing unused connection 8 (DEXSeqReport/files/NM_001001883expression.html)
7: closing unused connection 7 (DEXSeqReport/files/NM_001001880expression.html)
...

The complete workflow I am following is essentially straight from the vignette as I am just beginning to use DEXSeq.

countFiles <- list.files(pattern="count.txt$",full.names=TRUE)
flattenedFile <- list.files(pattern="gff$",full.names=TRUE)
sampleTable <- data.frame(
row.names = c("KO1","KO2","KO3","WT1","WT2","WT3"),
condition = c("knockout","knockout","knockout","wildtype","wildtype","wildtype")
)
dxd <- DEXSeqDataSetFromHTSeq(
countFiles,
sampleData=sampleTable,
design = ~ sample + exon + condition:exon,
flattenedfile = flattenedFile
)
sampleAnnotation(dxd)
dxd <- estimateSizeFactors(dxd)
dxd <- estimateDispersions(dxd)
dxd <- testForDEU(dxd)
dxd <- estimateExonFoldChanges (dxd,fitExpToVar="condition")
dxr1 <- DEXSeqResults(dxd)

 

LRT p-value: full vs reduced

DataFrame with 205335 rows and 13 columns
                       groupID   featureID exonBaseMean dispersion      stat    pvalue
                   <character> <character>    <numeric>  <numeric> <numeric> <numeric>
NM_001001130:E001 NM_001001130        E001    45.843785  0.2252586 0.1458363 0.7025466
NM_001001130:E002 NM_001001130        E002     3.558073  0.2312652 0.1098326 0.7403348
NM_001001130:E003 NM_001001130        E003     2.616500  0.1867738 0.4723285 0.4919179
NM_001001130:E004 NM_001001130        E004     1.371563  0.1070000 0.9595710 0.3272950
NM_001001152:E001 NM_001001152        E001     9.450050  0.0568492 2.1828880 0.1395524
...                        ...         ...          ...        ...       ...       ...
NM_213733:E012       NM_213733        E012     30.43564 0.04353082  3.033824 0.0815456
NR_003278:E001       NR_003278        E001      0.00000         NA        NA        NA
NR_003279:E001       NR_003279        E001      0.00000         NA        NA        NA
NR_003280:E001       NR_003280        E001      0.00000         NA        NA        NA
NR_046233:E001       NR_046233        E001      0.00000         NA        NA        NA
                       padj  knockout  wildtype log2fold_wildtype_knockout
                  <numeric> <numeric> <numeric>                  <numeric>
NM_001001130:E001 0.9565526  9.400194 10.128272                  0.1076256
NM_001001130:E002 0.9643624  2.788459  3.298784                  0.2424661
NM_001001130:E003 0.8903618  3.026732  2.667826                 -0.1820963
NM_001001130:E004 0.7989704  2.465953  1.828341                 -0.4316096
NM_001001152:E001 0.5876139  4.748336  4.220563                 -0.1699867
...                     ...       ...       ...                        ...
NM_213733:E012    0.4639771  6.392214  7.311873                  0.1939251
NR_003278:E001           NA        NA        NA                         NA
NR_003279:E001           NA        NA        NA                         NA
NR_003280:E001           NA        NA        NA                         NA
NR_046233:E001           NA        NA        NA                         NA
                                       genomicData    countData transcripts
                                         <GRanges>     <matrix>      <list>
NM_001001130:E001     chr13:-:[67747800, 67749722] 62 17 11 ...    ########
NM_001001130:E002     chr13:-:[67751610, 67751705]    7 0 0 ...    ########
NM_001001130:E003     chr13:-:[67752388, 67752514]    7 1 0 ...    ########
NM_001001130:E004     chr13:-:[67755063, 67755134]    4 1 0 ...    ########
NM_001001152:E001     chr13:-:[67254918, 67258144]    9 2 2 ...    ########
...                                            ...          ...         ...
NM_213733:E012       chr2:+:[174122091, 174122702] 50 64 54 ...    ########
NR_003278:E001      Rn18s:+:[        1,      1870]    0 0 0 ...    ########
NR_003279:E001     Rn28s1:+:[        1,      4730]    0 0 0 ...    ########
NR_003280:E001    Rn5-8s1:+:[        1,       157]    0 0 0 ...    ########
NR_046233:E001      Rn45s:+:[        1,     13404]    0 0 0 ...    ########

My session info is as follows

R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.2 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Cairo_1.5-9                biomaRt_2.26.1             DEXSeq_1.16.10            
 [4] DESeq2_1.10.1              RcppArmadillo_0.6.500.4.0  Rcpp_0.12.3               
 [7] SummarizedExperiment_1.0.2 GenomicRanges_1.22.4       GenomeInfoDb_1.6.3        
[10] IRanges_2.4.8              S4Vectors_0.8.11           Biobase_2.30.0            
[13] BiocGenerics_0.16.1        BiocParallel_1.4.3        

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3           XVector_0.10.0      
 [5] bitops_1.0-6         futile.options_1.0.0 tools_3.2.3          zlibbioc_1.16.0     
 [9] statmod_1.4.24       rpart_4.1-10         RSQLite_1.0.0        annotate_1.48.0     
[13] gtable_0.2.0         lattice_0.20-33      DBI_0.3.1            gridExtra_2.2.1     
[17] stringr_1.0.0        hwriter_1.3.2        genefilter_1.52.1    cluster_2.0.3       
[21] Biostrings_2.38.4    locfit_1.5-9.1       grid_3.2.3           nnet_7.3-12         
[25] AnnotationDbi_1.32.3 XML_3.98-1.4         survival_2.38-3      foreign_0.8-66      
[29] latticeExtra_0.6-28  Formula_1.2-1        magrittr_1.5         geneplotter_1.48.0  
[33] ggplot2_2.1.0        lambda.r_1.1.7       Rsamtools_1.22.0     Hmisc_3.17-2        
[37] scales_0.4.0         splines_3.2.3        xtable_1.8-2         colorspace_1.2-6    
[41] stringi_1.0-1        acepack_1.3-3.3      RCurl_1.95-4.8       munsell_0.4.3     

 

Thanks for any advice

Seth

dexseq dexseqhtml • 2.0k views
ADD COMMENT
0
Entering edit mode

Hi Seth, 

Can you try

DEXSeqHTML(dxr1, FDR = 0.05, color = c("#FF00080", "#0000FF80"),
       fitExpToVar = "condition", mart = mart, filter = "refseq_mrna",
       attributes = "external_gene_name")



You are specifying "treatment" instead of "condition" for fitExpToVar, and "treatment" does not exist as a column of your colData. Will make sure you get an informative error, sorry about that!

ADD REPLY
0
Entering edit mode

Hi Alejandro,

Sorry about that, I copied over the incorrect line from that history. What you entered is what has produced the described error. I have also tried without entering the fitExpToVar parameter. When I tried with "treatment" it returned an error noting that the column does not exist. 

I did try it again anyways, and happened to have the output folder opened and visible. I noticed that an empty svg file was generated for each gene, but is deleted before the process finishes. I don't know if this is expected behaviour. The final output was the same as previously seen. There is no traceback, while the warnings listed are present.

testForDEU.html looks how I'd expect, with the experimental design and a table of genes. Each gene links to an expression.html which has nothing but links back and to multiple page which do not exist. It also links to the results.html page which has the expected table.

Sorry for the confusion, and thanks for the help.

ADD REPLY
0
Entering edit mode

After seeing the brief appearance of zero byte SVG files I checked dev.list() following the DEXSeqHTML call. There were 63 svg in the list. I cleared it with graphics.off() and reran DEXSeqHTML and the dev.list is repopulated. I'm guessing now that there is something funny about my installation or environment, but it isn't apparent to me just what is going on.

ADD REPLY
0
Entering edit mode

I have tried the same workflow on my notebook, using a fresh installation of R 3.2.4 and the various packages, and I am experiencing the same issue with DEXSeqHTML. I briefly see several hundred 0 byte svg files created and then removed, while the function is running. Afterwards dev.list() shows 63 svg devices attached, presumably the first 63 created, and then no further devices can be attached. 

> sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] hwriter_1.3.2              biomaRt_2.26.1             DEXSeq_1.16.10            
 [4] DESeq2_1.10.1              RcppArmadillo_0.6.600.4.0  Rcpp_0.12.3               
 [7] SummarizedExperiment_1.0.2 GenomicRanges_1.22.4       GenomeInfoDb_1.6.3        
[10] IRanges_2.4.8              S4Vectors_0.8.11           Biobase_2.30.0            
[13] BiocGenerics_0.16.1        BiocParallel_1.4.3         BiocInstaller_1.20.1      

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3           XVector_0.10.0      
 [5] bitops_1.0-6         futile.options_1.0.0 tools_3.2.4          zlibbioc_1.16.0     
 [9] statmod_1.4.24       rpart_4.1-10         RSQLite_1.0.0        annotate_1.48.0     
[13] gtable_0.2.0         lattice_0.20-33      DBI_0.3.1            gridExtra_2.2.1     
[17] stringr_1.0.0        genefilter_1.52.1    cluster_2.0.3        Biostrings_2.38.4   
[21] locfit_1.5-9.1       grid_3.2.4           nnet_7.3-12          AnnotationDbi_1.32.3
[25] XML_3.98-1.4         survival_2.38-3      foreign_0.8-66       latticeExtra_0.6-28 
[29] Formula_1.2-1        magrittr_1.5         geneplotter_1.48.0   ggplot2_2.1.0       
[33] lambda.r_1.1.7       Rsamtools_1.22.0     Hmisc_3.17-2         scales_0.4.0        
[37] splines_3.2.4        xtable_1.8-2         colorspace_1.2-6     stringi_1.0-1       
[41] acepack_1.3-3.3      RCurl_1.95-4.8       munsell_0.4.3    
ADD REPLY
0
Entering edit mode

Hi Skasowitz, 

Would you mind sending me your objects so I could have a closer look at the problem?

Alejandro

ADD REPLY
0
Entering edit mode

Hi Alejandro,

Yesterday I sent a link to some files via the email form at embl.de. Were you able to access the object?

 

Seth

ADD REPLY
0
Entering edit mode

Hi Seth, 

Thanks for checking! I got your dataset, but could not reproduce the behaviour that you were describing. I do get the report as expected. There are some glm fits that fail and a warning is issues, but these are only a couple of genes.  

How many empty files are you getting? Could you try the devel version with the same code?

Alejandro

ADD REPLY
0
Entering edit mode

Hi Alejandro,

Thanks for giving it a look. About how many genes are showing up in your results?

I see several hundred empty svg files created when I first start DEXSeqHTML. These files do not persist for more than a few seconds. When R is finished I have a directory with only expression.html and results.html files. There are 1929 pairs of files. Although only the results.html files contain anything besides links, the expression files do remain when the process is complete.

I installed R devel and found the same result. When I tried this on the lab computer I saw over 3000 pairs of html files are left when finished, so I am wondering if part of this is a resource issue. I will create some subsets of the full dataset to continue working this out. If nothing else I can test more quickly.

 

Seth

 

ADD REPLY

Login before adding your answer.

Traffic: 471 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6