sva: how to incorporate adjusting variables
Guest User ★ 13k
@guest-user-4897
Dear Jeffrey,

I am using sva to estimate potential surrogate variables in a microarray-derived expression dataset, as a preliminary step before differential gene expression analysis. The aim of my work is to study how one categorical variable (inversion genotype, three levels: STD, HET, INV) affects the gene expression profile of a set of human individuals. However, there are other variables (population, gender) with a partial effect; that is, they account for variation in the expression of only a subset of genes. I don't know how to deal with these variables. Which of the following options is the most appropriate (if any)?

A) "Protect" them by including them in both the null and the full model:

    mod0 = model.matrix(~as.factor(Gender)+as.factor(Population), data=pheno)
    mod = model.matrix(~as.factor(inversion_genotype)+as.factor(Gender)+as.factor(Population), data=pheno)
    svobj = sva(edata,mod,mod0)

B) Include them only in the full model:

    mod0 = model.matrix(~1, data=pheno)
    mod = model.matrix(~as.factor(inversion_genotype)+as.factor(Gender)+as.factor(Population), data=pheno)
    svobj = sva(edata,mod,mod0)

C) Do not include them at all (and expect to get some surrogate variables strongly correlated with these variables, in case they really affect gene expression):

    mod0 = model.matrix(~1, data=pheno)
    mod = model.matrix(~as.factor(inversion_genotype), data=pheno)
    svobj = sva(edata,mod,mod0)

To summarize:

- How should adjustment variables with a global effect be treated?
- How should adjustment variables with a partial effect (only in a subset of genes) be treated?

I would really appreciate any piece of advice. Thanks a lot!

Meri

-- output of sessionInfo():

    R version 2.15.2 (2012-10-26)
    Platform: x86_64-redhat-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

-- Sent via the guest posting facility at bioconductor.org.
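One way to see how the three options differ, before running sva at all, is to compare the design matrices themselves. The sketch below uses only base R and a made-up `pheno` table (24 samples, four hypothetical populations `P1` to `P4`; none of these values come from the real data). The helper `is_nested` and the table `summary_tab` are names invented here for illustration. The sketch assumes the usual sva convention that the columns present in `mod` but absent from `mod0` are the ones treated as variables of interest.

```r
# Toy phenotype table: every value here is invented for illustration only.
n <- 24
pheno <- data.frame(
  inversion_genotype = factor(rep(c("STD", "HET", "INV"), length.out = n)),
  Gender             = factor(rep(c("F", "M"), each = n / 2)),
  Population         = factor(rep(c("P1", "P2", "P3", "P4"), each = 3, length.out = n))
)

# The three (mod0, mod) pairs from options A, B and C.
full_formula <- ~ inversion_genotype + Gender + Population
designs <- list(
  A = list(mod0 = model.matrix(~ Gender + Population, data = pheno),
           mod  = model.matrix(full_formula, data = pheno)),
  B = list(mod0 = model.matrix(~ 1, data = pheno),
           mod  = model.matrix(full_formula, data = pheno)),
  C = list(mod0 = model.matrix(~ 1, data = pheno),
           mod  = model.matrix(~ inversion_genotype, data = pheno))
)

# mod0 is nested in mod if appending its columns to mod does not raise the rank.
is_nested <- function(mod0, mod) {
  qr(cbind(mod, mod0))$rank == qr(mod)$rank
}

# For each option: is the null model nested, and how many columns of mod
# are absent from mod0 (that is, how many columns are left unprotected)?
summary_tab <- do.call(rbind, lapply(names(designs), function(k) {
  d <- designs[[k]]
  data.frame(option         = k,
             nested         = is_nested(d$mod0, d$mod),
             tested_columns = ncol(d$mod) - ncol(d$mod0))
}))
print(summary_tab)
```

With this toy table, options A and C each leave only the two genotype contrasts outside the null model, whereas option B leaves six columns outside it, so Gender and Population would be handled in the same way as the inversion genotype. Any of the pairs can then be passed on as `sva(edata, d$mod, d$mod0)`, as in the calls above.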
inveRsion sva