Limma with paired samples and SVA
Entering edit mode
wliao • 0
Last seen 4.1 years ago

I have an experimental design as follows:


Patient Disease Treatment
1 Unaffected Treated
1 Unaffected Untreated
2 Unaffected Treated
2 Unaffected Untreated
3 Unaffected Treated
3 Unaffected Untreated
4 Affected Treated
4 Affected Untreated
5 Affected Treated
5 Affected Untreated
6 Affected Treated
6 Affected Untreated

We'd like to test:

1) Untreated, Affected vs. Unaffected

2) Unaffected-only, Treated vs. Untreated

3) Affected-only, Treated vs. Untreated

We are trying to use SVA to also control for cell composition and the include the SV's in the model.

mod <- model.matrix(~0+Treatment:Disease, data=pd)
mod0 <- model.matrix(~1, data=pd)
svobj <- sva(M, mod, mod0)

design <- model.matrix(data=pd, ~0+Treatment:Disease+svobj$sv)

colnames(design) <- c(paste0('SV',seq(ncol(svobj$sv))),'Untreat_Unaff','Treat_Unaff','Untreat_Aff','Treat_Aff')
contrast.matrix <- makeContrasts('Treat_Aff-Untreat_Aff',

Then we'd like to account for pairing of patient samples without and with treatment using duplicateCorrelation().

dupCor <- duplicateCorrelation(M, design, block=factor(pd$Patient))

But we get this error:

Warning message in atanh(pmax(-1, rho)):
"NaNs produced"

It works when I don't add the SV's to the model so just confused as to what's going on.

design <- model.matrix(data=pd, ~0+Treatment:Disease)


Is there something I'm not understanding about the design matrix, problem there causing this problem? Not at all confident we've set this up correctly.

limma sva duplicatecorrelation estimatecellcounts • 875 views
Entering edit mode
Aaron Lun ★ 27k
Last seen 11 hours ago
The city by the bay

I daresay that at least some of the surrogate variables represent the patient effect, given that you didn't tell sva about the patient factor. This may be responsible for funny behaviour in duplicateCorrelation, as the component of variance explained by the blocking factor is not well-defined when the blocking factor is effectively in the design matrix already.

Probably the simplest solution is to tell sva about the patient effect so that it doesn't try to detect it as an unknown factor of variation. This can be done by respecifying mod:

pd$Patient <- factor(pd$Patient)
mod <- model.matrix(~0+Patient+Treatment:Disease, data=pd)
mod <- mod[,!grepl("Untreated", colnames(mod))] # get to full rank

... to ensure that the patient effect is present. Hopefully this will encourage sva to look for other surrogate variables that are not specified in the design matrix. Note that the treatment effect is implicitly modeled - there is no need for an explicit Treatment column, as that column is directly made up of the corresponding patient effects.

Of course, there is no guarantee that the SVs will correspond to cell composition. If you have a priori information about differences in cell composition between samples, you should consider using that information directly rather than hoping that sva will discover it for you.


Login before adding your answer.

Traffic: 432 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6