Hi all,
I have trouble understanding the results of my DESeq()
command, especially the calculated size factors (sizeFactors()).
This is the list of my library sizes (I have shortened it for better overview):
> colSums(counts(dds))
A    24593612
B    24477676
C    25137143
D    23676295
E    23581553
...
Q2   18092067
R    19495619
R2    3808119
...
W    23762686
X    25669615
My question regards sample R2. This is a very small library, so I expected it to have a very large size factor when compared to libraries that are almost 10-fold larger. But the size factor I get here is very small.
> sizeFactors(dds)
A    1.0167371
B    0.9574096
C    1.0823689
D    0.9329557
E    0.9519349
F    1.0187297
...
Q2   0.8638798
R    0.9388412
R2   0.1831432
...
W    1.2248133
X    1.2921096
Can someone please explain to me why this is happening? I always thought that a size factor of 1 would mean that the library size is equal to the calculated "reference genome", and that if a library is smaller, its size factor would be higher than 1.
Thanks a lot in advance,
Assa
Thanks, Steve, for the answer. I meant the reference sample that DESeq creates to calculate the size factors; sorry for the misspelling.
I expected that R2 would be an outlier in my data set. I just didn't expect its size factor to be so much smaller than those of the other samples; I would have thought it would be higher.
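
To check the direction of the size factors, here is a minimal base-R sketch of the median-of-ratios idea as I understand it. The count matrix and the sample names S1, S2, S3 are made up for illustration (they are not my data), and this is only a toy approximation of what estimateSizeFactors() does, assuming no gene has a zero count. In this sketch the sample with the shallowest depth gets the smallest size factor, because the factor is the sample's counts relative to the reference, and counts are divided by it for normalization.

```r
# Toy count matrix: 4 genes x 3 samples (hypothetical numbers).
# S2 has twice the depth of S1; S3 has one tenth the depth of S1.
counts <- matrix(
  c(100,  200, 10,
    400,  800, 40,
     50,  100,  5,
    900, 1800, 90),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("S1", "S2", "S3"))
)

# Per-gene log geometric mean across samples: the pseudo-reference sample.
log_geo_means <- rowMeans(log(counts))

# Size factor per sample: median over genes of (count / geometric mean),
# computed on the log scale and transformed back.
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col) - log_geo_means))
})

print(size_factors)

# Dividing each column by its size factor puts the samples on a common scale.
normalized <- sweep(counts, 2, size_factors, "/")
print(normalized)
```

On a real DESeqDataSet the corresponding values would come from sizeFactors(dds) after estimateSizeFactors(dds), and counts(dds, normalized=TRUE) returns the counts divided by those factors.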