Hi all,
I specifically have Davide Risso in mind for this question, but anyone who can help is welcome. I've been thinking about this particular line from the RUVSeq paper:
"Properly behaved spike-ins could be a valuable resource for normalization: by design, their read counts are expected to be constant (or to have known fold-changes) between samples and hence any deviations from nominal fold-changes should reflect nuisance technical effects."
Now, consider a scenario where the exact same amount of spike-in is added to two samples, A and B, but sample B has a global downward shift in gene expression. Since the ratio of spike-in to sample mRNA would be higher in sample B, I'd expect a higher percentage of reads to map to spike-ins in sample B than in sample A. My question is: if I tried to normalize with RUVg in that case, would the method treat the difference in spike-in read counts as unwanted variation and "normalize it away", thereby hiding the global shift in expression? It seems to me that changes in the spike-in/sample mRNA ratio, which affect the read counts, would violate the assumption quoted above. So if one suspects a global shift, one should always add the spike-in in proportion to the amount of sample mRNA.
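
To make the arithmetic behind my concern concrete, here is a small Python sketch with made-up numbers. The molecule counts and the function name are hypothetical, and it assumes reads are sampled in proportion to molecule abundance, which ignores length, capture, and other library-prep biases.

```python
def spike_in_read_fraction(spike_in_molecules, endogenous_molecules):
    """Expected fraction of reads from spike-ins, assuming reads are
    sampled in proportion to molecule abundance (a simplification)."""
    return spike_in_molecules / (spike_in_molecules + endogenous_molecules)


# Hypothetical amounts: identical spike-in added to both samples.
spike_in = 1_000
endogenous_a = 99_000   # sample A
endogenous_b = 49_500   # sample B: global 2-fold drop in mRNA

frac_a = spike_in_read_fraction(spike_in, endogenous_a)
frac_b = spike_in_read_fraction(spike_in, endogenous_b)

# Apparent spike-in fold change (B vs A) at equal sequencing depth,
# even though the amount of spike-in added was identical.
apparent_spike_in_fc = frac_b / frac_a

# Apparent fold change of an endogenous gene whose absolute abundance
# truly halved in B, measured as a proportion of all reads.
gene_a = 100
gene_b = gene_a / 2
apparent_gene_fc = (gene_b / (spike_in + endogenous_b)) / (
    gene_a / (spike_in + endogenous_a)
)

print(f"spike-in read fraction A: {frac_a:.4f}, B: {frac_b:.4f}")
print(f"apparent spike-in fold change: {apparent_spike_in_fc:.3f}")
print(f"apparent endogenous gene fold change: {apparent_gene_fc:.3f}")
```

With these numbers, the spike-ins look roughly 2-fold up in B while the halved gene looks almost unchanged, so forcing the spike-in counts to be equal across samples would be what recovers the true drop, provided the spike-in was added per unit of starting material rather than per unit of mRNA.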