Question: Adjusted P values
3.0 years ago, kaushal Raj Chaudhary (United States) wrote:

Hello Forum,

I am doing gene expression analysis using the limma package. After reducing the number of tests by filtering genes, I am getting higher adjusted p-values. I am curious what is causing this; is it probably that there are no significant differences in signal?

I appreciate any insight.

Thanks,

R code:

setwd("C:/Users/chaudhak/Desktop/Michelle_celfiels/Heart/CDCB vs HFDM")
getwd()
library(affy)
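Mydata<-ReadAffy()   ### reading in the CEL files
Mydata

### NOTE: the step that creates eset is not shown in the original post; rma() is an assumption
eset<-rma(Mydata)               ### assumed normalization step
phenoData(eset)                 ### phenotypic data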
pData(eset)$case=c("CD.ND","CD.ND","CD.ND","CD.ND","CD.ND","CD.ND","CD.ND","CD.ND",
                   "HF.DM","HF.DM","HF.DM","HF.DM","HF.DM")      ### phenotype dataframe
pData(eset)

library(limma)
Group<- as.factor(pData(eset)[,2])
design<-model.matrix(~0+Group)
colnames(design)<-c("CD.DM","HF.DM")
contrast.matrix<-makeContrasts(
                   CD.DMvsHF.DM=(CD.DM-HF.DM),   ### contrast names must be valid R names (no spaces)
                   levels=design)
fit <- lmFit(eset, design)  
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
colnames(fit2)

topTable(fit2,coef=1,adjust="fdr")  
3.0 years ago, James W. MacDonald (United States) wrote:

You don't show at which step you are filtering, nor how you are filtering, so you are neglecting to give us the only pieces of information that would be useful. If I were to hazard a guess, I would think you are filtering your eset object rather than filtering after the eBayes() step, and that this is the likely culprit. I would also bet that you are filtering using some method that results in the remaining genes having a higher variance than before you filtered.

Note that the eBayes() step estimates a prior for your variance estimate, using all the genes in your MArrayLM object. If you filter your genes first, and especially if you use a method that is based on selecting genes with higher variance, then the prior you compute will be inflated because you only have higher variance genes remaining. This will artificially inflate the denominator of your t-statistic, causing you to compute larger p-values, and hence larger adjusted p-values.
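
For reference, the effect on the denominator can be read off the usual form of limma's moderated t-statistic. The symbols below are notation chosen here for illustration: s_0^2 and d_0 correspond to s2.prior and df.prior, s_g^2 and d_g to the residual variance and residual degrees of freedom for gene g, and v_g to the squared stdev.unscaled for the coefficient.

```latex
\tilde{s}_g^2 = \frac{d_0\, s_0^2 + d_g\, s_g^2}{d_0 + d_g},
\qquad
\tilde{t}_g = \frac{\hat{\beta}_g}{\tilde{s}_g \sqrt{v_g}}
```

A larger s_0^2 increases every \tilde{s}_g and therefore shrinks every |\tilde{t}_g|, which is the inflation described above.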

You can check this by looking at fit2$s2.prior for both instances (with and without filtering).
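
A minimal sketch of that check on simulated data (not the poster's arrays). The data, the group sizes, and the IQR-based filter are assumptions made for illustration; the filter only roughly mimics what nsFilter() does with var.cutoff=0.5.

```r
library(limma)

set.seed(1)
n.genes   <- 2000
n.samples <- 13

# Hypothetical data: gene-wise variances from a scaled inverse chi-square, no true group effect
sigma2 <- 0.05 * 4 / rchisq(n.genes, df = 4)
y <- matrix(rnorm(n.genes * n.samples, sd = sqrt(sigma2)), n.genes, n.samples)
rownames(y) <- paste0("gene", seq_len(n.genes))

Group  <- factor(rep(c("A", "B"), c(8, 5)))
design <- model.matrix(~Group)

# Keep the more variable half of the genes (by IQR) before fitting
iqr  <- apply(y, 1, IQR)
keep <- iqr > quantile(iqr, 0.5)

fit.all  <- eBayes(lmFit(y, design))           # prior estimated from all genes
fit.filt <- eBayes(lmFit(y[keep, ], design))   # prior estimated from high-variance genes only

# Compare the estimated prior variances
c(all = fit.all$s2.prior, filtered = fit.filt$s2.prior)
```

In this simulation the filtered fit should show the larger s2.prior, since only the high-variance genes contribute to the estimate.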


Hi Jim,

I am filtering the genes at eset object.

source("http://bioconductor.org/biocLite.R")
biocLite("ragene10sttranscriptcluster.db")  
biocLite("ALL")
library(ragene10sttranscriptcluster.db)
library(genefilter)

annotation(eset) <- "ragene10sttranscriptcluster.db"

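filt<- nsFilter(eset,require.entrez=TRUE, var.cutoff =0.5)   ### keep the full result (eset plus filter.log)
celfiles.filt<- filt$eset

filt$filter.log   ### filter.log belongs to the nsFilter() result, not to the eset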

dim(celfiles.filt)

mat1<-exprs(celfiles.filt)
dim(mat1)
head(mat1)
sessionInfo()

I was wondering how to select genes after the eBayes() step?

Thanks for your help.

Reply written 3.0 years ago by kaushal Raj Chaudhary
3.0 years ago, James W. MacDonald (United States) wrote:

The MArrayLM object can be subsetted as if it were a data.frame (it's not, but there are methods for '[', so it acts as if it were).

filtered.probes <- featureNames(nsFilter(eset,require.entrez=TRUE, var.cutoff =0.5)$eset)
fit <- lmFit(eset, design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
fit2.filtered <- fit2[filtered.probes,]
topTable(fit2.filtered, 1)
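
A small sketch on simulated data of what this subsetting does and does not change. The matrix, the probe names, and the choice of probes to keep are all hypothetical; the point is that the prior stays as estimated from all probes, while topTable() adjusts p-values over the retained rows only.

```r
library(limma)

set.seed(2)
# Hypothetical expression matrix: 500 probes, 6 samples
y <- matrix(rnorm(500 * 6), 500, 6,
            dimnames = list(paste0("probe", 1:500), NULL))
design <- cbind(Intercept = 1, Group = rep(0:1, each = 3))

fit <- eBayes(lmFit(y, design))      # prior estimated from all 500 probes

keep    <- rownames(y)[1:100]        # hypothetical set of probes to retain
fit.sub <- fit[keep, ]               # subset rows by probe name

dim(fit.sub)                                      # rows and coefficients after subsetting
isTRUE(all.equal(fit.sub$s2.prior, fit$s2.prior)) # prior is unchanged by subsetting

# Adjusted p-values from topTable() are computed over the retained probes only
tt <- topTable(fit.sub, coef = "Group", number = Inf)
manual <- p.adjust(fit.sub$p.value[, "Group"], method = "BH")
isTRUE(all.equal(sort(tt$adj.P.Val), sort(unname(manual))))
```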

 


Thanks, Jim. 

Reply written 3.0 years ago by kaushal Raj Chaudhary
3.0 years ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

The default behaviour of nsFilter with var.filter=TRUE is very dangerous. As James has already pointed out, using this filter before running eBayes produces bogus results. However, it produces unpredictable results when applied after eBayes as well.

You would get more gain in power by using trend=TRUE when you run eBayes() than any potential gain from filtering.
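
A minimal sketch of the trend=TRUE option on simulated data. The intensity range and the mean-variance relationship below are assumptions made for illustration, not properties of the poster's arrays.

```r
library(limma)

set.seed(3)
n <- 1000
# Hypothetical data in which low-intensity genes are more variable
mu      <- runif(n, 4, 12)
sd.gene <- 0.1 + 1 / mu
y <- matrix(rnorm(n * 6, mean = mu, sd = sd.gene), n, 6)
design <- cbind(Intercept = 1, Group = rep(0:1, each = 3))

fit <- lmFit(y, design)
fit.const <- eBayes(fit)                 # one prior variance shared by all genes
fit.trend <- eBayes(fit, trend = TRUE)   # prior variance modelled as a function of average expression

length(fit.const$s2.prior)   # a single value
length(fit.trend$s2.prior)   # one value per gene
```

With trend=TRUE each gene is squeezed towards a prior variance appropriate to its expression level, rather than towards a single global value.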

You could certainly run nsFilter(eset, require.entrez=TRUE, remove.dupEntrez=FALSE, var.filter=FALSE) if you want to restrict the analysis to well-annotated probe-sets. With those settings there would be some gain from running nsFilter() before eBayes.
