Question: paired sample - SVA model matrix
10 months ago, flippy23 wrote:



I am running SVA and had a question regarding the null and full model matrices. I am comparing a before/after effect of a treatment within each patient. How do I account for this in building the model matrices? How do I specify to SVA that the before/after variable is within the same patient? 



Answer: paired sample - SVA model matrix
10 months ago, James W. MacDonald (United States) wrote:

You can find an example of a paired analysis in the limma User's Guide, section 9.4.1.

For sva, though, you will want to define both the full and reduced models by hand, because the default reduced model has only an intercept, which isn't the right reduced model for your situation.
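As a minimal sketch of what defining both models by hand could look like: the sample table `pheno`, its column names, and the expression matrix `expr` below are made up for illustration, and the `sva()` call is left commented out because it needs the Bioconductor sva package and real data.

```r
# Hypothetical sample table: 4 patients, each measured before and after treatment.
pheno <- data.frame(
  Study_ID  = factor(rep(paste0("P", 1:4), each = 2)),
  Treatment = factor(rep(c("before", "after"), times = 4),
                     levels = c("before", "after"))
)

# Full model: patient blocking factor plus the variable of interest.
mod <- model.matrix(~ Study_ID + Treatment, data = pheno)

# Reduced (null) model: the full model minus the variable of interest,
# so the patient blocking factor stays in.
mod0 <- model.matrix(~ Study_ID, data = pheno)

dim(mod)   # 8 samples x 5 columns (intercept, 3 patient terms, treatment)
dim(mod0)  # 8 samples x 4 columns

# With the sva package installed and an expression matrix `expr`
# (features in rows, samples in columns, in the same order as `pheno`):
# library(sva)
# svobj <- sva(expr, mod, mod0)
```

Any additional known covariates would go into both formulas, so that the two models differ only in the Treatment term.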


So, just to make sure I understand:

null model (only adjustment covariates) would consist of: covariate 1 + covariate 2 + covariate 3

full model: ~ Study_ID + Treatment + covariate 1 + covariate 2 + covariate 3


How will it differentiate the treatment effect from the other covariates?


written 10 months ago by flippy23

It doesn't. A statistician would call the three covariates 'nuisance variables', which are things that you think (know) will have an effect on at least some of the genes, and so you need to account for them in your model.

In your case you said you have paired samples. So if one patient has a much higher level of expression for a given gene than the other patients, you don't necessarily care about that fact. Instead, you want to know how much the treatment affects the gene expression after accounting for any patient-specific differences. And you account for any between-patient differences by adding a patient-level blocking factor to your model, which estimates the mean expression for that patient, and removes that patient-level expression, to give you a cleaner signal.

Put another way, this is just simple algebra. You are saying that, for Gene X, the expression for patient 1 is

Gene_X_expression = patient_effect + treatment_effect + covariate1_effect + covariate2_effect + covariate3_effect

Which is the same as saying that

treatment_effect = Gene_X_expression - patient_effect - covariate1_effect - covariate2_effect - covariate3_effect

So the treatment effect is estimated from the expression data, after subtracting out all these other things that might affect the gene expression, but are not of interest to you. It's a little more complicated than that, but that's the basic idea.
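To make the subtraction idea concrete, here is a small simulated sketch in base R. All numbers and variable names are invented, and the three extra covariates are left out to keep it short. For a balanced paired design, the treatment coefficient from a model with a patient blocking factor should coincide with the mean of the within-patient (after minus before) differences.

```r
set.seed(1)
n_pat <- 5
patient   <- factor(rep(paste0("P", 1:n_pat), each = 2))
treatment <- factor(rep(c("before", "after"), times = n_pat),
                    levels = c("before", "after"))

# Simulated expression for one gene: a large patient-specific baseline,
# a true treatment effect of 1, and a little noise.
patient_effect <- rep(rnorm(n_pat, mean = 8, sd = 3), each = 2)
y <- patient_effect + 1 * (treatment == "after") + rnorm(2 * n_pat, sd = 0.2)

# Model with the patient blocking factor.
fit <- lm(y ~ patient + treatment)
est_blocked <- unname(coef(fit)["treatmentafter"])

# Mean of the within-patient (after - before) differences.
est_paired <- mean(y[treatment == "after"] - y[treatment == "before"])

# The two estimates agree for a balanced paired design.
all.equal(est_blocked, est_paired)
```

The large differences between patients never enter the treatment estimate, which is the "cleaner signal" described above.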

written 10 months ago by James W. MacDonald

Or maybe you are asking a different question? You might be asking how does sva differentiate the treatment covariate from the others? Again, it doesn't.

The idea behind sva is that you might have other nuisance variables lurking in your data that you don't know about. So you want the data after adjusting out all the known nuisance variables (your reduced model), as well as the data after adjusting out all the known nuisance variables plus the variable you care about (your full model). You then look for any extra patterns in the data (after adjusting for all the variables you know about) that you can include as extra nuisance variables, with the caveat that you don't want to erase anything that might pertain to the treatment.
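The "extra patterns" idea can be illustrated with a rough base R sketch. This is not the algorithm that `sva()` actually runs, which is more elaborate; it only shows that a hidden variable can leave a recoverable pattern in the residuals of the full model. The simulated data, the `hidden` variable, and all names are assumptions made for the example.

```r
set.seed(2)
n_pat <- 6
patient   <- factor(rep(paste0("P", 1:n_pat), each = 2))
treatment <- factor(rep(c("before", "after"), times = n_pat),
                    levels = c("before", "after"))
mod <- model.matrix(~ patient + treatment)

# A hidden nuisance variable (say, an unrecorded batch effect).
hidden <- rnorm(2 * n_pat)

# Simulated expression: 200 genes x 12 samples; the first 50 genes respond to `hidden`.
n_genes <- 200
expr <- matrix(rnorm(n_genes * 2 * n_pat, sd = 0.3), nrow = n_genes)
expr[1:50, ] <- expr[1:50, ] + outer(rep(2, 50), hidden)

# Residuals after removing everything the full model explains.
hat <- mod %*% solve(t(mod) %*% mod) %*% t(mod)
res <- expr %*% (diag(nrow(mod)) - hat)

# Leading right singular vector: the dominant leftover pattern across samples.
sv1 <- svd(res)$v[, 1]

# Compare it with the part of `hidden` that the model does not already explain.
hidden_res <- as.vector((diag(nrow(mod)) - hat) %*% hidden)
abs(cor(sv1, hidden_res))
```

A correlation close to 1 here means the leftover pattern tracks the hidden variable, which is the kind of signal that would be added to the model as an extra nuisance covariate.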


written 10 months ago by James W. MacDonald

I have a follow-up question based on this. In downstream analysis, subject ID (two samples - pre/post) is the random effect group. Because I'm controlling for patient specific effects in downstream analysis, I guess the within-individual technical variability is now of interest - as that may be correlated with treatment effect in the pre/post within a patient. In this case, using a patient blocking factor for SVA wouldn't consider the pre/post differences within a patient, and may contribute to "noisier" data downstream?

written 5 months ago by flippy23 (modified 5 months ago)


How did you finally design the full and null models? Did you incorporate Study_ID in both the full and null model, or only in the full model? That is, did you use

mod = ~ Subject_ID + Treatment

mod0 = ~ Subject_ID

or

mod = ~ Subject_ID + Treatment

mod0 = ~1

written 18 days ago by sara.blocquiaux (modified 18 days ago)