stageR NA in getAdjustedPValues() (method=dtu)
1
1
Entering edit mode
fiona.dick91 ▴ 60
@fionadick91-16521
Last seen 23 months ago
Norway

Hello,

Im working with DRIMSeq and stageR to analyse DTU from RNASeq transcript counts estimated by salmon.

I have been following DRIMSeq and stagR's vignette. Importing my reads with tximport, using scaledTPM as scaling method. Filtering transcripts according to recommendations and calculating precision , fitting and testing with drimseq.

I created my design matrix like it was suggested in the DRIMSeq vignette in section 5.3, to include a batch effect variable.

I had no trouble going through those steps, and when looking at the results i can see various adjusted pvalues below a significance level of e.g. 0.05, on gene as well as transcript level.

When I moved on to the 2 stage testing using stageR as suggested, I end up with only NA' s in the gene and transcript column after calling the getAdjustedPValues() function. In the vignette is stated, that this means that no genes have passed the first stage screening and are thus turned to NA. Now I am wondering about three things:

  • did I maybe use the wrong design matrix and end up with wrong DRIMSeq results. This thought came up because when I use the accessor function DRIMSeq::design() on my DRIMSeq object it returns another design matrix compared to what I put in. When running DRIMSeq it reported VERY high precision (~400-600), does that make sens?
  • Even if none of the genes passed the first screening test in stageR, when I call the function getAdjustedPValues() with the option of only returning significant genes, I get an error message, while it should probably just return an empty list?!
  • Concerning the fact that I get a long list of genes with adjusted pvalues below some alpha from the DRIMSeq results, and none after applying stageR, I wonder if I might have misunderstood the function's intention. From the stageR vignette section 3.2, I understand that if the stageRTx() is generated with the option of defining pScreen as already adjusted, the function call of stageWiseAdjustment() would just check which of those already adjusted pvalues (qvalues) in pScreen pass the threshold, and then move on to the transcript level test/adjustment (FWER) including only those genes that passed the screening.

Code:

txi <- tximport(files_perCohort,type=quant_tool,txOut=TRUE,countsFromAbundance=scale_method)
d <- dmDSdata(counts=countData[[i]],samples=info[[i]])
n.small<-min(table(DRIMSeq::samples(d)$condition))
d<-dmFilter(d,
                            min_samps_feature_expr=n.small,min_feature_expr=5,
                           min_samps_feature_prop=n.small,min_feature_prop=0.1,
                            min_samps_gene_expr=n*0.7,min_gene_expr=10)
design_full <- model.matrix(~condition+cov,data=DRIMSeq::samples(d))

head(design_full)

  (Intercept) condition1 cov

Sid1           1         0      3.1

Sid2           1         1      5.6

Sid3           1         0      5.1

Sid4           1         1      5.9

Sid4           1         0      5.6

Sid5          1          1      6.0

head(DRIMSeq::design(d))

 (Intercept)          cov

Sid1           1         3.1

Sid2           1         5.6

Sid3           1         5.1

Sid4           1         5.9

Sid4           1         5.6

Sid5          1          6.0

res<-DRIMSeq::results(d)

pScreen <- **res$pvalue**
pConfirmation <- matrix(res$txp$pvalue,ncol=1)
rownames(pConfirmation) <- res$txp$feature_id
tx2gene <- res$txp[,c("feature_id","gene_id")]
stageRObj <- stageRTx(pScreen=pScreen,pConfirmation=pConfirmation,**pScreenAdjusted=F
ALSE**,tx2gene=tx2gene)
stageRObj <- stageWiseAdjustment(stageRObj, method="dtu", alpha=0.05)
drim.padj <- getAdjustedPValues(stageRObj,order=TRUE,onlySignificantGenes=TRUE)
drim.padj

Error in alpha * 100 : non-numeric argument to binary operator                                                  

In addition: Warning message:                                                                                   

The returned adjusted p-values are based on a                                 

                 stage-wise testing approach and are only valid for                                             

                 the provided target OFDR level of 5%. If a different target OFDR level is of interest,         

                 the entire adjustment should be re-run.          

traceback()

5: paste0("No genes were found to be significant on a ", alpha *                                                

       100, "% OFDR level.")                                                                                    

4: message(paste0("No genes were found to be significant on a ",                                                

       alpha * 100, "% OFDR level."))                                         

3: .getAdjustedPTx(object = object, onlySignificantGenes, order,                                                

       ...)                                                                                                     

2: getAdjustedPValues(Ss[[1]], order = TRUE, onlySignificantGenes = TRUE)                                       

1: getAdjustedPValues(Ss[[1]], order = TRUE, onlySignificantGenes = TRUE)     

 

drim.padj<-getAdjustedPValues(stageRObj,order=TRUE,onlySignificantGenes=FALSE)
nrow(drim.padj) == sum(is.na(drim.padj)[,"transcript"]))

TRUE

nrow(drim.padj)== sumis.na(drim.padj)[,"gene"]))

TRUE

#

Doing the same thing but passing the adjusted pvalues from DRIMSeq result to pScreen yields the same results and error message

pScreen <- **res$adj_pvalue**
stageRTx(pScreen=pScreen,pConfirmation=pConfirmation,**pScreenAdjusted=TRUE**,tx2gene=tx2gene)
stageRObj <- stageWiseAdjustment(stageRObj, method="dtu", alpha=0.05)
drim.padj <- getAdjustedPValues(stageRObj,order=TRUE,onlySignificantGenes=TRUE)

#

sessionInfo()

R version 3.5.1 (2018-07-02)

Platform: x86_64-pc-linux-gnu (64-bit)

Running under: Ubuntu 14.04.5 LTS

Matrix products: default

BLAS: /usr/lib/libblas/libblas.so.3.0

LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:

 [1] LC_CTYPE=en_DK.UTF-8       LC_NUMERIC=C               LC_TIME=en_DK.UTF-8        LC_COLLATE=en_DK.UTF-8

 [5] LC_MONETARY=en_DK.UTF-8    LC_MESSAGES=en_DK.UTF-8    LC_PAPER=en_DK.UTF-8       LC_NAME=C

 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_DK.UTF-8 LC_IDENTIFICATION=C

attached base packages:

[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base

other attached packages:

 [1] GenomicFeatures_1.32.0      AnnotationDbi_1.42.1        tximport_1.8.0

 [4] stageR_1.2.22               SummarizedExperiment_1.10.1 DelayedArray_0.6.1

 [7] BiocParallel_1.14.2         matrixStats_0.53.1          Biobase_2.40.0

[10] GenomicRanges_1.32.4        GenomeInfoDb_1.16.0         IRanges_2.14.10

[13] S4Vectors_0.18.3            BiocGenerics_0.26.0         DRIMSeq_1.8.0

[16] forcats_0.3.0               stringr_1.2.0               dplyr_0.7.6

[19] purrr_0.2.5                 tidyr_0.8.1                 tibble_1.4.2

[22] ggplot2_2.2.1               tidyverse_1.2.1             readr_1.1.1

[25] nvimcom_0.9-58

drimseq stageR dtu salmon rna seq • 2.0k views
ADD COMMENT
0
Entering edit mode

Thank you for reporting this.

There used to be a problem with getAdjustedPValues in the older versions of stageR. It would be great if you could update your version with

devtools::install_github("statOmics/stageR")

and check if the error still occurs.

Would you mind sharing (some of) your data that reproduces the error such that I can look into this in detail?

ADD REPLY
0
Entering edit mode

thanks for the quick reply. I will update the package and try with the new version. How could I share a subset of my data to make as much sense as possible? would you like to have the imported and scaled abundances of a subset of transcripts and a subset of samples?

ADD REPLY
0
Entering edit mode

I am not yet very familiar with DRIMSeq, but it would be great if you could share your `d` object after running DRIMSeq. From browsing through your code it seems like that should contain most of what I need to run stageR and check where things went wrong. You could for example send it to koen.vandenberge@ugent.be through WeTransfer or share a Dropbox link here on the forum.

ADD REPLY
0
Entering edit mode

I sent you the result object from the DRIMSeq analysis.

the data for stageR would then be:

pScreen <- res$res$adj_pvalue

pConfirmation <- res$txp$adj_pvalue

tx2gene <- res$res[,c("feature_id","gene_id")]

 

ADD REPLY
0
Entering edit mode

the error message persists, after updating as you recommended.

 

ADD REPLY
0
Entering edit mode

Hello Fiona, I am very sorry for my late reply. You have mentioned that the design matrix from DRIMSeq output is different than what you feed in. Would you like that I have a look on that? If yes, would share with me the d object after running the analysis? You could send it to gosia.nowicka@uzh.ch or via Dropbox.

ADD REPLY
0
Entering edit mode

Hi! sorry for the late answer I totally forgot about this thread. It was my own mistake because I called the design matrix of the reduced model...So all is good. Thanks for the help

ADD REPLY
1
Entering edit mode
@koen-van-den-berge-6369
Last seen 5 months ago
Ghent University, Belgium

Hi Fiona,

I found out that the error stems back from the gene-level p-values that do not carry the names of the genes that they are representing. In your case, the error can be solved by setting

names(pScreen)=res[[2]]$gene_id

before inputting pScreen in stageR.

I will now make sure to give a clear error message when this does not happen in the construction of the stageRTx class to prevent this from happening in the future.

ADD COMMENT
0
Entering edit mode

perfect, I tried it right away and it works. Thanks a lot!

ADD REPLY

Login before adding your answer.

Traffic: 975 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6