From the DESeq2 vignette, the following snippet shows how cqn offsets can be used in the analysis:
cqnOffset <- cqnObject$glm.offset
cqnNormFactors <- exp(cqnOffset)
Meanwhile, the EDASeq offsets should be used after flipping their sign (note the -1):
EDASeqNormFactors <- exp(-1 * EDASeqOffset)
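For context, either matrix is then supplied to DESeq2 as gene- and sample-specific normalization factors. A minimal sketch of that downstream step, assuming a DESeqDataSet dds of matching dimensions (the geometric-mean centering follows my reading of the vignette):

library(DESeq2)

normFactors <- cqnNormFactors                                 # or EDASeqNormFactors
# centre the factors around 1 so normalized counts stay on the raw-count scale
normFactors <- normFactors / exp(rowMeans(log(normFactors)))
normalizationFactors(dds) <- normFactors                      # used in place of size factors
dds <- DESeq(dds)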
The same usage is echoed in the documentation of these tools.
However, from my understanding, both cqn and EDASeq return offsets of the following form:
offset = log(normed_counts) - log(actual_counts)
See page 4 of the cqn vignette (the passage starting "the normalized expression values are"; rearranging to move y, which corresponds to log2(actual_counts), to the left-hand side gives the expression above), and see section 7.3 of the EDASeq vignette, which uses this expression as is.
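To make the sign convention concrete, a toy illustration of this form (hypothetical numbers; natural log here, though cqn works on the log2 scale):

raw    <- 100                          # observed count (hypothetical)
normed <- 80                           # bias-corrected count (hypothetical)
offset <- log(normed) - log(raw)       # the form above
all.equal(normed, raw * exp(offset))   # TRUE: exp(offset) maps raw to normalized
# DESeq2 divides counts by normalization factors, so under this convention
# exp(-offset) looks like the natural normalization factor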
I may be missing something very obvious here, but, given this, why multiply the EDASeq offset by -1? Or, equivalently, why not do that for cqn offsets?

But wouldn't the same be true for cqn offsets then? Why don't we flip those? From the cqn vignette:

RPKM.cqn = y + offset

which is equivalent to (as y should be log(y_raw) and RPKM.cqn is equivalent to log(y_norm)):

log(y_norm) = log(y_raw) + offset

which gives back

offset = log(y_norm) - log(y_raw)
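(This can be checked directly on a cqn fit; a sketch, with cqnObject as above:)

# normalized log2 expression values, as computed in the cqn vignette
RPKM.cqn <- cqnObject$y + cqnObject$offset
# rearranged, the offset is the normalized values minus y
all.equal(cqnObject$offset, RPKM.cqn - cqnObject$y)   # TRUE by construction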
I don't use offset in the DESeq2 vignette. I use glm.offset. See the last section of the cqn vignette.
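For reference, a cqn fit exposes both quantities, and they are different objects (sketch):

names(cqnObject)             # contains both "offset" and "glm.offset"
head(cqnObject$offset)       # the log2-scale offset discussed in the vignette body
head(cqnObject$glm.offset)   # the quantity the DESeq2 vignette snippet uses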
Yes - this is the key I now realise.
The last section of the vignette does not really explain how exactly glm.offset is different, nor that it should be used instead of offset, so I had a look at the code. The relevant line is (roughly):

glm.offset <- sweep(-offset, 2, -1 * log2(sizeFactors / 10^6)) * log(2)

Note that the sweep function subtracts by default, so the minus in the last term cancels out, giving

glm.offset = log(2) * (-offset + log2(size_factors / 10^6))

So there are three distinct steps happening in the glm.offset calculation (sketched in code below):

1. flipping the sign of the offset (- offset),
2. adding library-size scaling (+ log2(size_factors / 10^6)) (this statement was edited, see comments below),
3. converting from log2 to the natural log scale (log(2) *).
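To make the three steps concrete, here is a small sketch that rebuilds glm.offset from offset (names are illustrative; sizeFactors stands for the library sizes given to cqn):

step1 <- -cqnObject$offset                                     # 1. flip the sign
step2 <- sweep(step1, 2, log2(sizeFactors / 10^6), FUN = "+")  # 2. add library-size scaling
step3 <- log(2) * step2                                        # 3. convert log2 to natural log
all.equal(step3, cqnObject$glm.offset)                         # TRUE, if this reading is right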
So this answers the question of why EDASeq and cqn are treated differently in DESeq2: cqn does some extra work inside the package to make its offset compatible, while EDASeq doesn't.

Re: (2), I read the code and docs as adding in library size scaling, not removing it.
This is because, after (1), the offset is in the direction of the bias, e.g. higher raw count, higher offset. So adding log2 of the size factor also goes in the direction of the sequencing depth bias.
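A toy illustration of that direction (hypothetical numbers; log-link GLM, where the offset sits in the linear predictor):

depth <- c(1e6, 2e6)          # two libraries, the second twice as deep
off   <- log(depth / 1e6)     # library-size part of the offset: 0 and log(2)
mu    <- exp(log(10) + off)   # expected counts for a gene at 10 per million
mu                            # 10 20: deeper library, larger offset, larger mean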
Also, the man page implies this as well.
Yes - I think you are right once more. Thanks for taking so much time to look into this.
Size factors come in twice in the code of cqn: at the very beginning, and in the snippet I pasted before. At the beginning, they are subtracted away from the counts on the log2 scale (the $y variable is actually log2 counts per million). This makes the effective offset (from raw counts) equal to offset - log2(size_factors / 10^6). Taking the negative of this effective offset gives -offset + log2(size_factors / 10^6), which is what we see in the equation above.

I'll edit my previous answer to have the correct information, in case somebody from the future is also wondering about this.
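A one-sample numeric check of that algebra (hypothetical numbers):

sizeFactor <- 2e7    # library size for one sample (hypothetical)
offset     <- 0.5    # cqn log2-scale offset for one gene (hypothetical)
effective  <- offset - log2(sizeFactor / 10^6)       # effective offset from raw counts
glm.off    <- log(2) * (-offset + log2(sizeFactor / 10^6))
all.equal(glm.off, -log(2) * effective)              # TRUE: glm.offset is minus the effective offset, on the natural log scale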