From the `DESeq2` vignette, the following snippet shows how `cqn` offsets can be used in the analysis:

```
cqnOffset <- cqnObject$glm.offset
cqnNormFactors <- exp(cqnOffset)
```

Meanwhile, the `EDASeq` normalization factors should be computed after flipping the sign of the offsets (note the `-1`):

```
EDASeqNormFactors <- exp(-1 * EDASeqOffset)
```
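
In either case, the resulting matrix is then supplied to `DESeq2` as gene- and sample-specific normalization factors. A minimal sketch, assuming `dds` is an existing `DESeqDataSet` and `normFactors` is one of the matrices computed above (the geometric-mean centering follows the same vignette section):

```
library(DESeq2)

## normFactors stands in for either cqnNormFactors or EDASeqNormFactors from above;
## center each row so its geometric mean is 1, as the DESeq2 vignette recommends
normFactors <- normFactors / exp(rowMeans(log(normFactors)))

## supply as gene- and sample-specific normalization factors, then run the analysis
normalizationFactors(dds) <- normFactors
dds <- DESeq(dds)
```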

The same usage is echoed in the documentation of these tools.
However, from my understanding both `cqn` and `EDASeq` return offsets of the following form:

```
offset = log(normed_counts) - log(actual_counts)
```

See page 4 of the `cqn` vignette (the paragraph starting "the normalized expression values are"; rearrange the equation by moving `y`, which corresponds to `log2(actual_counts)`, to the left-hand side to get the expression above), and section 7.3 of the `EDASeq` vignette, which uses this expression as is.
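
Spelled out with the components of the `cqn` output object (this is just the rearrangement described above, using the `cqnObject` from the first snippet):

```
## The cqn vignette defines the normalized log2 expression values as y + offset,
## where cqnObject$y holds the log2 expression of the raw counts
normed <- cqnObject$y + cqnObject$offset

## Rearranging gives the offset in the form quoted above:
## cqnObject$offset = normed - cqnObject$y, i.e. log2(normalized) - log2(raw)
```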

I may be missing something very obvious here, but, given this, why multiply the `EDASeq` offset by -1? Or, equivalently, why not do that for the `cqn` offsets?

But wouldn't the same be true for the `cqn` offsets then? Why don't we flip those?

From the `cqn` vignette, the normalized expression values are `RPKM.cqn = y + offset`, which is equivalent to (as `$y` should be `log(y_raw)` and `RPKM.cqn` is equivalent to `log(y_norm)`) `log(y_norm) = log(y_raw) + offset`, giving `offset = log(y_norm) - log(y_raw)`, which gives back the expression from my question.

I don't use `offset` in the DESeq2 vignette. I use `glm.offset`. See the last section of the `cqn` vignette.

Yes - this is the key I now realise.

The last section of the vignette does not really explain how exactly `glm.offset` is different, nor that it should be used instead of `offset`, so I had a look at the code. The `glm.offset` calculation uses `sweep`, which subtracts by default, so the minus in the last term cancels out, giving `glm.offset = log(2) * (-offset + log2(size_factors / 10^6))`; a sketch of this is shown below.

So there are three distinct steps happening in the `glm.offset` calculation:

1. flip the sign of the offset (`- offset`),
2. add the library size scaling (`+ log2(size_factors / 10^6)`) (this statement was edited, see comments below),
3. convert from log2 to the natural log scale (`log(2) *`).
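
To make this concrete, here is a sketch of the effective computation (not the verbatim `cqn` source; `offset` is the package's log2-scale offset matrix and `sizeFactors` the library sizes passed to `cqn`):

```
## sweep() subtracts by default, so subtracting -log2(sizeFactors / 10^6)
## adds the library size term back in; log(2) rescales log2 to natural log
glm.offset <- log(2) * sweep(-offset, 2, -log2(sizeFactors / 10^6))

## i.e. glm.offset = log(2) * (-offset + log2(sizeFactors / 10^6)),
## which is what exp() is applied to in the DESeq2 vignette snippet
```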

So this answers the question of why `EDASeq` and `cqn` are treated differently in `DESeq2`: `cqn` does some extra work inside the package to make its offsets compatible, while `EDASeq` doesn't.

Re: (2), I read the code and docs as adding in library size scaling, not removing it.

This is because, after (1), the offset is in the direction of the bias, e.g. higher raw count, higher offset. So by adding the log2 of the size factor, this is also going in the direction of the sequencing depth bias. The man page implies this as well.
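
A toy numeric illustration of that direction argument (hypothetical numbers, only to show the sign convention):

```
## A feature whose raw count is inflated by, say, GC content
raw_log2    <- 10                        # log2 of the observed (biased-high) value
normed_log2 <- 9                         # log2 of the bias-corrected value
offset      <- normed_log2 - raw_log2    # -1: the cqn offset is negative for inflated counts
-offset                                  # +1: after the sign flip, more bias gives a larger
                                         #     offset, the same direction as a larger library
```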

Yes - I think you are right once more. Thanks for taking so much time to look into this.

Size factors come in twice in the code of `cqn`: at the very beginning, and in the snippet I pasted before. At the beginning, they are subtracted away from the counts (the `$y` variable is actually on the log2 counts-per-million scale). This makes the effective offset (from raw counts) equal to `offset - log2(size_factors/10^6)`. Taking the negative of this effective offset results in `-offset + log2(size_factors/10^6)`, which is what we see in the equation above.

I'll edit my previous answer to have the correct information in case somebody from the future is also wondering about this.
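
For completeness, the whole chain can be checked numerically (hypothetical numbers; `sizeFactors` is the library size passed to `cqn`, and cqn's small-count adjustments are ignored):

```
counts      <- 100      # raw count for one gene in one sample
sizeFactors <- 2e6      # library size passed to cqn
offset      <- -0.5     # cqn's log2-scale offset for this observation

y      <- log2(counts) - log2(sizeFactors / 10^6)   # $y: log2 counts-per-million
normed <- y + offset                                 # normalized log2 expression

## effective offset relative to the raw log2 counts:
normed - log2(counts)                    # -1.5, equal to offset - log2(sizeFactors / 10^6)

## glm.offset is the negative of this, on the natural log scale:
log(2) * (-offset + log2(sizeFactors / 10^6))
```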