SVA for repeated measures design
@sarablocquiaux-21717

Hi all,

I have RNA-seq data (5 subjects measured at 4 time points) and would like to run SVA first so that I can include potential confounders in the statistical model (DESeq2 pipeline).

I am having trouble defining the null and full models for SVA:

Full model: ~ TIME + SUBJECT.ID

Null model: ~ SUBJECT.ID or ~ 1

Should the subject IDs be treated as a factor of interest or as a confounding factor?

Thanks in advance!

Best,

Sara

@james-w-macdonald-5106

You should probably use just an intercept for your null model. In general, if you have repeated measures (which I assume you do, given the subject ID), and if those repeated measures are complete (measurements from each subject at each time point), then the subject-specific changes are orthogonal to the measure of interest, and blocking on subject is the way to go. It also makes your coefficients easier to interpret.

Put another way, sva is intended to generate surrogate variables for unobserved variability. The subjects are by definition observed, so if you wanted to use the sva package to do something with them, you could treat them as batch effects and use ComBat (note that I am not advocating this, just pointing out that sva is for things you don't observe and ComBat is for things you do know about).
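
For concreteness, a minimal sketch of that setup with svaseq() (the count-based front end of sva); the simulated counts, the 5 x 4 layout, and the fixed n.sv = 2 are placeholders for illustration only:

```r
library(sva)

## toy layout: 5 subjects x 4 time points (20 samples), simulated counts
meta <- data.frame(SUBJECT.ID = factor(rep(paste0("S", 1:5), each = 4)),
                   TIME       = factor(rep(paste0("T", 1:4), times = 5)))
counts <- matrix(rpois(1000 * 20, lambda = 50), nrow = 1000)

mod  <- model.matrix(~ TIME + SUBJECT.ID, data = meta)  # full model: time + subject blocking
mod0 <- model.matrix(~ 1, data = meta)                  # null model: intercept only
svobj <- svaseq(counts, mod, mod0, n.sv = 2)            # n.sv fixed only for the toy data
head(svobj$sv)                                          # one column per surrogate variable
```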

Thanks James.

Yes, I have repeated measures. Subject.ID is not my factor of interest, but of course I want to include it in the model. For one subject, though, I have two missing time points.

I do not want ComBat to correct for Subject.ID; rather, I want svaseq to find confounding factors other than Time and Subject.ID. The design I intend to use in DESeq2 is: ~ Subject + SV1 + ... + Time.

I will use the null model ~ 1 in svaseq, as suggested.
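
A hedged sketch of that plan (simulated data and names for illustration; n.sv is fixed only to keep the toy example small, and svaseq() is run on DESeq2-normalized counts, a common pattern):

```r
library(DESeq2)
library(sva)

## toy data: 5 subjects x 4 time points
meta <- data.frame(SUBJECT.ID = factor(rep(paste0("S", 1:5), each = 4)),
                   TIME       = factor(rep(paste0("T", 1:4), times = 5)))
counts <- matrix(rpois(1000 * 20, lambda = 50), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:20)))

dds <- DESeqDataSetFromMatrix(counts, colData = meta,
                              design = ~ SUBJECT.ID + TIME)
dds <- estimateSizeFactors(dds)
norm <- counts(dds, normalized = TRUE)
norm <- norm[rowMeans(norm) > 1, ]          # drop near-empty rows before SVA

svobj <- svaseq(norm,
                model.matrix(~ TIME + SUBJECT.ID, data = meta),  # full model
                model.matrix(~ 1, data = meta),                  # null model: intercept only
                n.sv = 2)                                        # fixed only for the toy data

## append the SVs to colData and rebuild the design: ~ Subject + SVs + Time
for (i in seq_len(svobj$n.sv))
  colData(dds)[[paste0("SV", i)]] <- svobj$sv[, i]
design(dds) <- as.formula(paste("~ SUBJECT.ID +",
                                paste(paste0("SV", seq_len(svobj$n.sv)),
                                      collapse = " + "),
                                "+ TIME"))
dds <- DESeq(dds)
res <- results(dds)   # by default, the last TIME level vs the reference
```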

Robert Castelo (@rcastelo)

Hi,

I would say the answer is to include SUBJECT.ID in the null model because, as argued by Jeff Leek, the author of SVA, in this thread about a similar design, SUBJECT.ID will also be used in the final linear model you intend to fit to test for the effect of your variable of interest.
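
In model-matrix terms, that suggestion is (a sketch; meta stands for your sample table):

```r
## toy sample table: 5 subjects x 4 time points
meta <- data.frame(SUBJECT.ID = factor(rep(paste0("S", 1:5), each = 4)),
                   TIME       = factor(rep(paste0("T", 1:4), times = 5)))

mod  <- model.matrix(~ TIME + SUBJECT.ID, data = meta)  # full model
mod0 <- model.matrix(~ SUBJECT.ID, data = meta)         # null model keeps SUBJECT.ID
```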

cheers,

robert.


I agree; that was what I was thinking at first.

But Subject.ID is not just a covariate, it is a random factor, so it is still not clear to me whether it belongs in the null model. Leaving it out of the null model effectively makes it a variable of interest itself.


If SUBJECT.ID is a random factor, then you should not put it into the design matrix; instead, use duplicateCorrelation() together with the correlation and block arguments in the call to lmFit(). See the section on multi-level experiments in the limma User's Guide. If you don't need surrogate variables, you can simply follow that documentation.
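
A minimal sketch of that multi-level setup, assuming a voom/limma workflow on simulated data (in practice you would filter, and may run voom and duplicateCorrelation() a second time):

```r
library(limma)
library(edgeR)

## toy data: 5 subjects x 4 time points
meta <- data.frame(SUBJECT.ID = factor(rep(paste0("S", 1:5), each = 4)),
                   TIME       = factor(rep(paste0("T", 1:4), times = 5)))
counts <- matrix(rpois(1000 * 20, lambda = 50), nrow = 1000)

design <- model.matrix(~ TIME, data = meta)   # SUBJECT.ID stays out of the design
v <- voom(calcNormFactors(DGEList(counts)), design)
corfit <- duplicateCorrelation(v, design, block = meta$SUBJECT.ID)
fit <- lmFit(v, design, block = meta$SUBJECT.ID,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
topTable(fit, coef = "TIMET2")                # e.g., time point 2 vs baseline
```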

The complication comes when you want to combine this with surrogate variables estimated by SVA. You could try a full model with TIME only and an intercept-only null model, estimate the surrogate variables, paste them into the design matrix, and proceed with duplicateCorrelation() blocking on SUBJECT.ID. However, SVA may by then have already picked up part of the SUBJECT.ID variability, which can cause problems for duplicateCorrelation(); see this thread about that possibility. So I'd suggest including SUBJECT.ID in both the full and null models that you give to SVA (next to TIME), to ensure that the SUBJECT.ID variability is not absorbed by the surrogate variables. Then place TIME and the surrogate variables in a new design matrix, i.e. without SUBJECT.ID, and proceed with duplicateCorrelation() blocking on SUBJECT.ID.
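
And the combined workflow as a sketch (same toy data as above; n.sv fixed only for illustration):

```r
library(sva)
library(limma)
library(edgeR)

## toy data: 5 subjects x 4 time points
meta <- data.frame(SUBJECT.ID = factor(rep(paste0("S", 1:5), each = 4)),
                   TIME       = factor(rep(paste0("T", 1:4), times = 5)))
counts <- matrix(rpois(1000 * 20, lambda = 50), nrow = 1000)

## 1) run SVA with SUBJECT.ID in *both* models so the SVs cannot absorb it
mod  <- model.matrix(~ TIME + SUBJECT.ID, data = meta)
mod0 <- model.matrix(~ SUBJECT.ID, data = meta)
svobj <- svaseq(counts, mod, mod0, n.sv = 2)

## 2) new design: TIME plus the SVs, *without* SUBJECT.ID
sv <- as.matrix(svobj$sv)
colnames(sv) <- paste0("SV", seq_len(ncol(sv)))
design <- cbind(model.matrix(~ TIME, data = meta), sv)

## 3) treat SUBJECT.ID as a random effect via duplicateCorrelation()
v <- voom(calcNormFactors(DGEList(counts)), design)
corfit <- duplicateCorrelation(v, design, block = meta$SUBJECT.ID)
fit <- eBayes(lmFit(v, design, block = meta$SUBJECT.ID,
                    correlation = corfit$consensus.correlation))
```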
