using offsets from EDASeq and cqn
1
0
Entering edit mode
love ▴ 150
@love-5173
Last seen 7.1 years ago
hi, I could have things totally mixed up here. I was thinking that GLM offsets would be positively correlated with log counts. for instance in edgeR, mglmLS.R: mu <- exp(beta %*% t(X) + offset) but it seems offsets from packages EDASeq and cqn are negative correlated with log counts. for example some poisson data: > n <- 1000 > covariate <- rnorm(n,6,.5) > counts <- replicate(2,rpois(n,exp(covariate))) > library(EDASeq) > edaseq.offset <- withinLaneNormalization(x=counts,y=covariate,which="full",offset=TRUE) > cor(edaseq.offset[,1],log(counts[,1]+1))  -0.9765923 > library(cqn) > cqn.offset <- cqn(counts,x=covariate,lengths=rep(1,n),lengthMethod="fixed",sizeFacto rs=c(1,1))$offset > cor(cqn.offset[,1],log(counts[,1]+1))  -0.9950717 So EDASeq and cqn are giving similar results here: > cor(edaseq.offset[,1],cqn.offset[,1])  0.9847672 If I give these to the standard glm function as offsets I would expect the coefficient for 'covariate' to go from 1 to 0. But it gives me something like a coefficient of 2, and to get zero I have to give it -1 * offset: without offset: > coef(glm(counts[,1] ~ covariate, family="poisson")) (Intercept) covariate -0.02735053 1.00467285 with offset: > coef(glm(counts[,1] ~ covariate, offset=edaseq.offset[,1], family="poisson")) (Intercept) covariate -5.860094 1.954997 > coef(glm(counts[,1] ~ covariate, offset=cqn.offset[,1], family="poisson")) (Intercept) covariate -8.752786 2.457637 with -1 * offset: > coef(glm(counts[,1] ~ covariate, offset=-edaseq.offset[,1], family="poisson")) (Intercept) covariate 5.8253511 0.0498681 > coef(glm(counts[,1] ~ covariate, offset=-cqn.offset[,1], family="poisson")) (Intercept) covariate 8.6975896 -0.4482184 thanks for any guidance, Mike GO edgeR cqn EDASeq GO edgeR cqn EDASeq • 1.2k views ADD COMMENT 0 Entering edit mode love ▴ 150 @love-5173 Last seen 7.1 years ago One more thing, another wrinkle is log or log2. For this reason, I shouldn't have used correlation in the example. I believe cqn is giving log2 offsets, EDASeq natural log, and edgeR is accepting natural log. thanks, Mike On 2/14/13 5:21 PM, Michael Love wrote: > hi, > > I could have things totally mixed up here. I was thinking that GLM > offsets would be positively correlated with log counts. > > for instance in edgeR, mglmLS.R: > > mu <- exp(beta %*% t(X) + offset) > > but it seems offsets from packages EDASeq and cqn are negative > correlated with log counts. > > for example some poisson data: > > > n <- 1000 > > covariate <- rnorm(n,6,.5) > > counts <- replicate(2,rpois(n,exp(covariate))) > > > library(EDASeq) > > edaseq.offset <- > withinLaneNormalization(x=counts,y=covariate,which="full",offset=TRUE) > > cor(edaseq.offset[,1],log(counts[,1]+1)) >  -0.9765923 > > > library(cqn) > > cqn.offset <- > cqn(counts,x=covariate,lengths=rep(1,n),lengthMethod="fixed",sizeFac tors=c(1,1))$offset > > cor(cqn.offset[,1],log(counts[,1]+1)) >  -0.9950717 > > > So EDASeq and cqn are giving similar results here: > > > cor(edaseq.offset[,1],cqn.offset[,1]) >  0.9847672 > > If I give these to the standard glm function as offsets I would expect > the coefficient for 'covariate' to go from 1 to 0. But it gives me > something like a coefficient of 2, and to get zero I have to give it > -1 * offset: > > without offset: > > > coef(glm(counts[,1] ~ covariate, family="poisson")) > (Intercept) covariate > -0.02735053 1.00467285 > > with offset: > > > coef(glm(counts[,1] ~ covariate, offset=edaseq.offset[,1], > family="poisson")) > (Intercept) covariate > -5.860094 1.954997 > > coef(glm(counts[,1] ~ covariate, offset=cqn.offset[,1], > family="poisson")) > (Intercept) covariate > -8.752786 2.457637 > > with -1 * offset: > > > coef(glm(counts[,1] ~ covariate, offset=-edaseq.offset[,1], > family="poisson")) > (Intercept) covariate > 5.8253511 0.0498681 > > coef(glm(counts[,1] ~ covariate, offset=-cqn.offset[,1], > family="poisson")) > (Intercept) covariate > 8.6975896 -0.4482184 > > > thanks for any guidance, > > Mike >
0
Entering edit mode
hi, re: EDASeq, this was my mistake. The vignette does in fact have the multiplication by -1, but my eyes scanned over this: > library(edgeR) > design <- model.matrix(~conditions, data=pData(dataOffset)) > disp <- estimateGLMCommonDisp(exprs(dataOffset), + design, offset=-offst(dataOffset)) > fit <- glmFit(exprs(dataOffset), design, disp, offset=-offst(dataOffset)) Then, I still believe that the cqn offsets should be multiplied by -log(2) to be handed to edgeR, which is not currently in the cqn vignette: > design <- model.matrix(~ d.mont$sample$group) > d.mont.cqn <- estimateGLMCommonDisp(d.mont, design = design, + offset = cqn.subset$offset) But I think it would be desirable for packages on both sides of the hand-off to provide an explicit formula for what is meant by "offset", which could be one of: log(mu) = X * beta + offset log2(mu) = X * beta + offset log(mu) + offset = X * beta log2(mu) + offset = X * beta best, Mike On 02/14/2013 06:53 PM, Mike Love wrote: > One more thing, another wrinkle is log or log2. For this reason, I > shouldn't have used correlation in the example. I believe cqn is > giving log2 offsets, EDASeq natural log, and edgeR is accepting > natural log. > > thanks, > > Mike > > On 2/14/13 5:21 PM, Michael Love wrote: >> hi, >> >> I could have things totally mixed up here. I was thinking that GLM >> offsets would be positively correlated with log counts. >> >> for instance in edgeR, mglmLS.R: >> >> mu <- exp(beta %*% t(X) + offset) >> >> but it seems offsets from packages EDASeq and cqn are negative >> correlated with log counts. >> >> for example some poisson data: >> >> > n <- 1000 >> > covariate <- rnorm(n,6,.5) >> > counts <- replicate(2,rpois(n,exp(covariate))) >> >> > library(EDASeq) >> > edaseq.offset <- >> withinLaneNormalization(x=counts,y=covariate,which="full",offset=TRUE) >> > cor(edaseq.offset[,1],log(counts[,1]+1)) >>  -0.9765923 >> >> > library(cqn) >> > cqn.offset <- >> cqn(counts,x=covariate,lengths=rep(1,n),lengthMethod="fixed",sizeFa ctors=c(1,1))$offset >> > cor(cqn.offset[,1],log(counts[,1]+1)) >>  -0.9950717 >> >> >> So EDASeq and cqn are giving similar results here: >> >> > cor(edaseq.offset[,1],cqn.offset[,1]) >>  0.9847672 >> >> If I give these to the standard glm function as offsets I would >> expect the coefficient for 'covariate' to go from 1 to 0. But it >> gives me something like a coefficient of 2, and to get zero I have to >> give it -1 * offset: >> >> without offset: >> >> > coef(glm(counts[,1] ~ covariate, family="poisson")) >> (Intercept) covariate >> -0.02735053 1.00467285 >> >> with offset: >> >> > coef(glm(counts[,1] ~ covariate, offset=edaseq.offset[,1], >> family="poisson")) >> (Intercept) covariate >> -5.860094 1.954997 >> > coef(glm(counts[,1] ~ covariate, offset=cqn.offset[,1], >> family="poisson")) >> (Intercept) covariate >> -8.752786 2.457637 >> >> with -1 * offset: >> >> > coef(glm(counts[,1] ~ covariate, offset=-edaseq.offset[,1], >> family="poisson")) >> (Intercept) covariate >> 5.8253511 0.0498681 >> > coef(glm(counts[,1] ~ covariate, offset=-cqn.offset[,1], >> family="poisson")) >> (Intercept) covariate >> 8.6975896 -0.4482184 >> >> >> thanks for any guidance, >> >> Mike >> > [[alternative HTML version deleted]]