Im working with DRIMSeq and stageR to analyse DTU from RNASeq transcript counts estimated by salmon.
I have been following DRIMSeq and stagR's vignette. Importing my reads with tximport, using scaledTPM as scaling method. Filtering transcripts according to recommendations and calculating precision , fitting and testing with drimseq.
I created my design matrix like it was suggested in the DRIMSeq vignette in section 5.3, to include a batch effect variable.
I had no trouble going through those steps, and when looking at the results i can see various adjusted pvalues below a significance level of e.g. 0.05, on gene as well as transcript level.
When I moved on to the 2 stage testing using stageR as suggested, I end up with only NA' s in the gene and transcript column after calling the getAdjustedPValues() function. In the vignette is stated, that this means that no genes have passed the first stage screening and are thus turned to NA. Now I am wondering about three things:
- did I maybe use the wrong design matrix and end up with wrong DRIMSeq results. This thought came up because when I use the accessor function DRIMSeq::design() on my DRIMSeq object it returns another design matrix compared to what I put in. When running DRIMSeq it reported VERY high precision (~400-600), does that make sens?
- Even if none of the genes passed the first screening test in stageR, when I call the function getAdjustedPValues() with the option of only returning significant genes, I get an error message, while it should probably just return an empty list?!
- Concerning the fact that I get a long list of genes with adjusted pvalues below some alpha from the DRIMSeq results, and none after applying stageR, I wonder if I might have misunderstood the function's intention. From the stageR vignette section 3.2, I understand that if the stageRTx() is generated with the option of defining pScreen as already adjusted, the function call of stageWiseAdjustment() would just check which of those already adjusted pvalues (qvalues) in pScreen pass the threshold, and then move on to the transcript level test/adjustment (FWER) including only those genes that passed the screening.
txi <- tximport(files_perCohort,type=quant_tool,txOut=TRUE,countsFromAbundance=scale_method) d <- dmDSdata(counts=countData[[i]],samples=info[[i]]) n.small<-min(table(DRIMSeq::samples(d)$condition)) d<-dmFilter(d, min_samps_feature_expr=n.small,min_feature_expr=5, min_samps_feature_prop=n.small,min_feature_prop=0.1, min_samps_gene_expr=n*0.7,min_gene_expr=10) design_full <- model.matrix(~condition+cov,data=DRIMSeq::samples(d)) head(design_full)
(Intercept) condition1 cov
Sid1 1 0 3.1
Sid2 1 1 5.6
Sid3 1 0 5.1
Sid4 1 1 5.9
Sid4 1 0 5.6
Sid5 1 1 6.0
Sid1 1 3.1
Sid2 1 5.6
Sid3 1 5.1
Sid4 1 5.9
Sid4 1 5.6
Sid5 1 6.0
res<-DRIMSeq::results(d) pScreen <- **res$pvalue** pConfirmation <- matrix(res$txp$pvalue,ncol=1) rownames(pConfirmation) <- res$txp$feature_id tx2gene <- res$txp[,c("feature_id","gene_id")] stageRObj <- stageRTx(pScreen=pScreen,pConfirmation=pConfirmation,**pScreenAdjusted=F ALSE**,tx2gene=tx2gene) stageRObj <- stageWiseAdjustment(stageRObj, method="dtu", alpha=0.05) drim.padj <- getAdjustedPValues(stageRObj,order=TRUE,onlySignificantGenes=TRUE) drim.padj
Error in alpha * 100 : non-numeric argument to binary operator
In addition: Warning message:
The returned adjusted p-values are based on a
stage-wise testing approach and are only valid for
the provided target OFDR level of 5%. If a different target OFDR level is of interest,
the entire adjustment should be re-run.
5: paste0("No genes were found to be significant on a ", alpha *
100, "% OFDR level.")
4: message(paste0("No genes were found to be significant on a ",
alpha * 100, "% OFDR level."))
3: .getAdjustedPTx(object = object, onlySignificantGenes, order,
2: getAdjustedPValues(Ss[], order = TRUE, onlySignificantGenes = TRUE)
1: getAdjustedPValues(Ss[], order = TRUE, onlySignificantGenes = TRUE)
nrow(drim.padj) == sum(is.na(drim.padj)[,"transcript"]))
Doing the same thing but passing the adjusted pvalues from DRIMSeq result to pScreen yields the same results and error message
pScreen <- **res$adj_pvalue** stageRTx(pScreen=pScreen,pConfirmation=pConfirmation,**pScreenAdjusted=TRUE**,tx2gene=tx2gene) stageRObj <- stageWiseAdjustment(stageRObj, method="dtu", alpha=0.05) drim.padj <- getAdjustedPValues(stageRObj,order=TRUE,onlySignificantGenes=TRUE)
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
Matrix products: default
 LC_CTYPE=en_DK.UTF-8 LC_NUMERIC=C LC_TIME=en_DK.UTF-8 LC_COLLATE=en_DK.UTF-8
 LC_MONETARY=en_DK.UTF-8 LC_MESSAGES=en_DK.UTF-8 LC_PAPER=en_DK.UTF-8 LC_NAME=C
 LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_DK.UTF-8 LC_IDENTIFICATION=C
attached base packages:
 parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
 GenomicFeatures_1.32.0 AnnotationDbi_1.42.1 tximport_1.8.0
 stageR_1.2.22 SummarizedExperiment_1.10.1 DelayedArray_0.6.1
 BiocParallel_1.14.2 matrixStats_0.53.1 Biobase_2.40.0
 GenomicRanges_1.32.4 GenomeInfoDb_1.16.0 IRanges_2.14.10
 S4Vectors_0.18.3 BiocGenerics_0.26.0 DRIMSeq_1.8.0
 forcats_0.3.0 stringr_1.2.0 dplyr_0.7.6
 purrr_0.2.5 tidyr_0.8.1 tibble_1.4.2
 ggplot2_2.2.1 tidyverse_1.2.1 readr_1.1.1