Genes that have NOT changed
1
0
Entering edit mode
evocanres • 0
@evocanres-17914
Last seen 12 months ago
United States

Hi

I am trying to follow through this post as an exercise to find genes that are not differentially expressed. Few things in this post does not work [anymore]. Can somehow help ?

1

Specifically,

1. DESeqSummarizedExperimentFromMatrix. #does not work
#Instead I use the `DESeqDataSetFromMatrix` to load the data and I get this error. 
#This is the warning I get. 

Note: levels of factors in the design contain characters other than
  letters, numbers, '_' and '.'. It is recommended (but not required) to use
  only letters, numbers, and delimiters '_' or '.', as these are safe characters
  for column names in R. [This is a message, not an warning or error]

More Importantly, the following command yeilds null.

betaSE <- mcols(rowData(dse))$SE_conditionuntreated

 

deseq2 • 1.2k views
ADD COMMENT
0
Entering edit mode

Have you copied all code from the old post verbatim (without *any* changes), or have you adapted it to your data? If the latter, please post the *full* code.

ADD REPLY
0
Entering edit mode

I tried to run the exact code ```{r} > # Let's use the example data from the pasilla package > library( DESeq2 ) > library( pasilla ) > data( "pasillaGenes" ) > > # Create a DESeq2 data object from the pasilla data > dse <- DESeqSummarizedExperimentFromMatrix( + counts(pasillaGenes), pData(pasillaGenes)[,c("condition","type")], + ~ type + condition ) Error in DESeqSummarizedExperimentFromMatrix(counts(pasillaGenes), pData(pasillaGenes)[, : could not find function "DESeqSummarizedExperimentFromMatrix" > > # Perform a standard DESeq2 analysis > dse <- DESeq(dse) Error in is(object, "DESeqDataSet") : object 'dse' not found > > # The log2 fold changes are found here > beta <- results(dse)$log2FoldChange Error in is(object, "DESeqDataSet") : object 'dse' not found > > # Just to make the following clearer, I should point out that the > # "results" accessor is just a short-cut for this access to the rowData: > all( beta == mcols(rowData(dse))$conditionuntreated, na.rm=TRUE ) Error in rowData(dse) : object 'dse' not found > # (returns TRUE) > > # The log fold change estimates all come with standard error information > # which we find in the rowData (maybe we should copy this to the > # 'results', too) > betaSE <- mcols(rowData(dse))$SE_conditionuntreated Error in rowData(dse) : object 'dse' not found > > # Internally, the Wald test is implemented as a simple two-sided > # z test of beta/betaSE. Two demonstrate this, we to the test > # manually and compare > pvalDE <- 2 * pnorm( abs( beta ), sd = betaSE, lower.tail=FALSE ) Error in abs(beta) : non-numeric argument to mathematical function > all( abs( pvalDE - results(dse)$pvalue ) < 1e-15, na.rm=TRUE ) Error: object 'pvalDE' not found > # (returns TRUE) > > # This was the test for DE, of course, i.e., small pvalDE means that > # the gene's expression change (the true value of beta) is not zero > > # What we want is the opposite, namely find gene, for which abs(beta) > # is smaller than some threshold, theta > theta <- .3 > > # So, we do our two one-sided tests. For a one-sided z test, we > # simply use tail probabilities from the normal distribution. > > # First, the test of H0_A: true_beta > thr > pA <- pnorm( beta, thr, betaSE, lower.tail=TRUE ) Error in pnorm(beta, thr, betaSE, lower.tail = TRUE) : object 'thr' not found > > # Next, the test of H0_B: true_beta < -thr > pB <- pnorm( beta, -thr, betaSE, lower.tail=FALSE ) Error in pnorm(beta, -thr, betaSE, lower.tail = FALSE) : object 'thr' not found > > # The overall p value is the maximum, because we want to reject H0_A > # and H0_B simultaneously > pvalTOST <- pmax( pA, pB ) Error in eval(quote(list(...)), env) : object 'pA' not found > > > # Let's adjust our two p values with BH: > sigDE <- p.adjust( pvalDE, "BH" ) < .1 Error in p.adjust(pvalDE, "BH") : object 'pvalDE' not found > sigSmall <- p.adjust( pvalTOST, "BH" ) < .1 Error in p.adjust(pvalTOST, "BH") : object 'pvalTOST' not found > > # And make an MA plot, with sigDE in red and sigSmall in green > plot( + rowMeans( counts(dse,normalized=TRUE) ), beta, + log="x", pch=20, cex=.2, + col = 1 + sigDE + 2*sigSmall ) Error in counts(dse, normalized = TRUE) : object 'dse' not found > # Plot is attached. > ``` So I changed few things. 1. ```{r} > dse <- DESeqDataSetFromMatrix( + counts(pasillaGenes), pData(pasillaGenes)[,c("condition","type")], + ~ type + condition ) Note: levels of factors in the design contain characters other than letters, numbers, '_' and '.'. It is recommended (but not required) to use only letters, numbers, and delimiters '_' or '.', as these are safe characters for column names in R. [This is a message, not an warning or error] ``` This runs okay, till I get to `betaSE <- mcols(rowData(dse))$conditionuntreated` 

ADD REPLY
0
Entering edit mode

2. If I can replace that with

betaSE <- rowData(dse)$SE_condition_untreated_vs_treated` because

all.equal(mcols(dse), rowData(dse))

gives

>TRUE

> pvalDE <- 2 * pnorm( abs( beta ), sd = betaSE, lower.tail=FALSE )

> all( abs( pvalDE - results(dse)$pvalue ) < 1e-15, na.rm=TRUE ) [1] TRUE

 3. I belive the `thr` is a typo, so  theta <- .3 # So, we do our two one-sided tests. For a one-sided z test, we # simply use tail probabilities from the normal distribution. # First, the test of H0_A: true_beta > thr pA <- pnorm( beta, theta, betaSE, lower.tail=TRUE ) # Next, the test of H0_B: true_beta < -thr pB <- pnorm( beta, -theta, betaSE, lower.tail=FALSE ) # The overall p value is the maximum, because we want to reject H0_A # and H0_B simultaneously pvalTOST <- pmax( pA, pB ) # Let's adjust our two p values with BH: sigDE <- p.adjust( pvalDE, "BH" ) < .1 sigSmall <- p.adjust( pvalTOST, "BH" ) < .1

4. the number of genes with no difference

{r} > sum(!is.na(sigSmall)) [1] 11836 #However, sigSmallv <- p.adjust( pvalTOST, "BH" ) min(na.omit(sigSmallv) [1] 0.301

If the smallest value is 0.301 then, it would not give 11836 significantly equivalent genes. Am I understanding it wrong ?

ADD REPLY
0
Entering edit mode

I am trying to put the code that was run :) running into formatting issues

ADD REPLY
0
Entering edit mode
@mikelove
Last seen 2 hours ago
United States

This code goes back to a post from 6 years ago (2013), and since then we published the DESeq2 paper (2014) with a section about this type of analysis and also have an easy to use function which does this for you. It is highlighted in the vignette, look here:

https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#tests-of-log2-fold-change-above-or-below-a-threshold

For more details on what is going on, read the DESeq2 paper where it discusses LFC thresholds.

For the record, here is the 2012 post where the code comes from:

A: Testing for no change in RNA-seq data?

 

ADD COMMENT
0
Entering edit mode

Thank you Micheal, I did read it would be incorporated in the package, but didn't read the vignette carefully I guess. 

Am I correct in reading that "lessAbs - |β| < x - p values are the maximum of the upper and lower tests " is the replacement for the that code for TOST (test for equivalency)? 

 

ADD REPLY
0
Entering edit mode

This is answered in the DESeq2 paper, we describe the methods exactly (see Methods).

ADD REPLY
0
Entering edit mode

resLA <- results(dds, lfcThreshold=.5, altHypothesis="lessAbs")  sum(!is.na(resLA$padj) < 0.1)

can we say, this is the number of genes which are NOT different within logfoldchange of 0.5 ?

ADD REPLY
0
Entering edit mode

Yes. Less than in absolute value is finding the non DE genes.

ADD REPLY

Login before adding your answer.

Traffic: 1454 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6