using offsets from EDASeq and cqn
1
0
Entering edit mode
love ▴ 150
@love-5173
Last seen 10.2 years ago
hi, I could have things totally mixed up here. I was thinking that GLM offsets would be positively correlated with log counts. for instance in edgeR, mglmLS.R: mu <- exp(beta %*% t(X) + offset) but it seems offsets from packages EDASeq and cqn are negative correlated with log counts. for example some poisson data: > n <- 1000 > covariate <- rnorm(n,6,.5) > counts <- replicate(2,rpois(n,exp(covariate))) > library(EDASeq) > edaseq.offset <- withinLaneNormalization(x=counts,y=covariate,which="full",offset=TRUE) > cor(edaseq.offset[,1],log(counts[,1]+1)) [1] -0.9765923 > library(cqn) > cqn.offset <- cqn(counts,x=covariate,lengths=rep(1,n),lengthMethod="fixed",sizeFacto rs=c(1,1))$offset > cor(cqn.offset[,1],log(counts[,1]+1)) [1] -0.9950717 So EDASeq and cqn are giving similar results here: > cor(edaseq.offset[,1],cqn.offset[,1]) [1] 0.9847672 If I give these to the standard glm function as offsets I would expect the coefficient for 'covariate' to go from 1 to 0. But it gives me something like a coefficient of 2, and to get zero I have to give it -1 * offset: without offset: > coef(glm(counts[,1] ~ covariate, family="poisson")) (Intercept) covariate -0.02735053 1.00467285 with offset: > coef(glm(counts[,1] ~ covariate, offset=edaseq.offset[,1], family="poisson")) (Intercept) covariate -5.860094 1.954997 > coef(glm(counts[,1] ~ covariate, offset=cqn.offset[,1], family="poisson")) (Intercept) covariate -8.752786 2.457637 with -1 * offset: > coef(glm(counts[,1] ~ covariate, offset=-edaseq.offset[,1], family="poisson")) (Intercept) covariate 5.8253511 0.0498681 > coef(glm(counts[,1] ~ covariate, offset=-cqn.offset[,1], family="poisson")) (Intercept) covariate 8.6975896 -0.4482184 thanks for any guidance, Mike
GO edgeR cqn EDASeq GO edgeR cqn EDASeq • 2.1k views
ADD COMMENT
0
Entering edit mode
love ▴ 150
@love-5173
Last seen 10.2 years ago
One more thing, another wrinkle is log or log2. For this reason, I shouldn't have used correlation in the example. I believe cqn is giving log2 offsets, EDASeq natural log, and edgeR is accepting natural log. thanks, Mike On 2/14/13 5:21 PM, Michael Love wrote: > hi, > > I could have things totally mixed up here. I was thinking that GLM > offsets would be positively correlated with log counts. > > for instance in edgeR, mglmLS.R: > > mu <- exp(beta %*% t(X) + offset) > > but it seems offsets from packages EDASeq and cqn are negative > correlated with log counts. > > for example some poisson data: > > > n <- 1000 > > covariate <- rnorm(n,6,.5) > > counts <- replicate(2,rpois(n,exp(covariate))) > > > library(EDASeq) > > edaseq.offset <- > withinLaneNormalization(x=counts,y=covariate,which="full",offset=TRUE) > > cor(edaseq.offset[,1],log(counts[,1]+1)) > [1] -0.9765923 > > > library(cqn) > > cqn.offset <- > cqn(counts,x=covariate,lengths=rep(1,n),lengthMethod="fixed",sizeFac tors=c(1,1))$offset > > cor(cqn.offset[,1],log(counts[,1]+1)) > [1] -0.9950717 > > > So EDASeq and cqn are giving similar results here: > > > cor(edaseq.offset[,1],cqn.offset[,1]) > [1] 0.9847672 > > If I give these to the standard glm function as offsets I would expect > the coefficient for 'covariate' to go from 1 to 0. But it gives me > something like a coefficient of 2, and to get zero I have to give it > -1 * offset: > > without offset: > > > coef(glm(counts[,1] ~ covariate, family="poisson")) > (Intercept) covariate > -0.02735053 1.00467285 > > with offset: > > > coef(glm(counts[,1] ~ covariate, offset=edaseq.offset[,1], > family="poisson")) > (Intercept) covariate > -5.860094 1.954997 > > coef(glm(counts[,1] ~ covariate, offset=cqn.offset[,1], > family="poisson")) > (Intercept) covariate > -8.752786 2.457637 > > with -1 * offset: > > > coef(glm(counts[,1] ~ covariate, offset=-edaseq.offset[,1], > family="poisson")) > (Intercept) covariate > 5.8253511 0.0498681 > > coef(glm(counts[,1] ~ covariate, offset=-cqn.offset[,1], > family="poisson")) > (Intercept) covariate > 8.6975896 -0.4482184 > > > thanks for any guidance, > > Mike >
ADD COMMENT
0
Entering edit mode
hi, re: EDASeq, this was my mistake. The vignette does in fact have the multiplication by -1, but my eyes scanned over this: > library(edgeR) > design <- model.matrix(~conditions, data=pData(dataOffset)) > disp <- estimateGLMCommonDisp(exprs(dataOffset), + design, offset=-offst(dataOffset)) > fit <- glmFit(exprs(dataOffset), design, disp, offset=-offst(dataOffset)) Then, I still believe that the cqn offsets should be multiplied by -log(2) to be handed to edgeR, which is not currently in the cqn vignette: > design <- model.matrix(~ d.mont$sample$group) > d.mont.cqn <- estimateGLMCommonDisp(d.mont, design = design, + offset = cqn.subset$offset) But I think it would be desirable for packages on both sides of the hand-off to provide an explicit formula for what is meant by "offset", which could be one of: log(mu) = X * beta + offset log2(mu) = X * beta + offset log(mu) + offset = X * beta log2(mu) + offset = X * beta best, Mike On 02/14/2013 06:53 PM, Mike Love wrote: > One more thing, another wrinkle is log or log2. For this reason, I > shouldn't have used correlation in the example. I believe cqn is > giving log2 offsets, EDASeq natural log, and edgeR is accepting > natural log. > > thanks, > > Mike > > On 2/14/13 5:21 PM, Michael Love wrote: >> hi, >> >> I could have things totally mixed up here. I was thinking that GLM >> offsets would be positively correlated with log counts. >> >> for instance in edgeR, mglmLS.R: >> >> mu <- exp(beta %*% t(X) + offset) >> >> but it seems offsets from packages EDASeq and cqn are negative >> correlated with log counts. >> >> for example some poisson data: >> >> > n <- 1000 >> > covariate <- rnorm(n,6,.5) >> > counts <- replicate(2,rpois(n,exp(covariate))) >> >> > library(EDASeq) >> > edaseq.offset <- >> withinLaneNormalization(x=counts,y=covariate,which="full",offset=TRUE) >> > cor(edaseq.offset[,1],log(counts[,1]+1)) >> [1] -0.9765923 >> >> > library(cqn) >> > cqn.offset <- >> cqn(counts,x=covariate,lengths=rep(1,n),lengthMethod="fixed",sizeFa ctors=c(1,1))$offset >> > cor(cqn.offset[,1],log(counts[,1]+1)) >> [1] -0.9950717 >> >> >> So EDASeq and cqn are giving similar results here: >> >> > cor(edaseq.offset[,1],cqn.offset[,1]) >> [1] 0.9847672 >> >> If I give these to the standard glm function as offsets I would >> expect the coefficient for 'covariate' to go from 1 to 0. But it >> gives me something like a coefficient of 2, and to get zero I have to >> give it -1 * offset: >> >> without offset: >> >> > coef(glm(counts[,1] ~ covariate, family="poisson")) >> (Intercept) covariate >> -0.02735053 1.00467285 >> >> with offset: >> >> > coef(glm(counts[,1] ~ covariate, offset=edaseq.offset[,1], >> family="poisson")) >> (Intercept) covariate >> -5.860094 1.954997 >> > coef(glm(counts[,1] ~ covariate, offset=cqn.offset[,1], >> family="poisson")) >> (Intercept) covariate >> -8.752786 2.457637 >> >> with -1 * offset: >> >> > coef(glm(counts[,1] ~ covariate, offset=-edaseq.offset[,1], >> family="poisson")) >> (Intercept) covariate >> 5.8253511 0.0498681 >> > coef(glm(counts[,1] ~ covariate, offset=-cqn.offset[,1], >> family="poisson")) >> (Intercept) covariate >> 8.6975896 -0.4482184 >> >> >> thanks for any guidance, >> >> Mike >> > [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 1016 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6