Question

sva: how to incorporate adjusting variables


Guest User ★ 13k

@guest-user-4897


Dear Jeffrey,

I am using sva to estimate potential surrogate variables in a microarray-derived expression dataset, as a step before differential gene expression analysis. The aim of my work is to study how one three-level categorical variable (inversion genotype: STD, HET, INV) affects the gene expression profile of a set of human individuals. However, there are some other variables (population, gender) with a partial effect; that is, they account for variation in the expression of only a subset of genes.

I don't know how to deal with these variables. Which of the following options is the most appropriate one (if any)?

A) "Protect" them by including them in both the null and the full model:

    mod0 = model.matrix(~as.factor(Gender)+as.factor(Population), data=pheno)
    mod = model.matrix(~as.factor(inversion_genotype)+as.factor(Gender)+as.factor(Population), data=pheno)
    svobj = sva(edata,mod,mod0)

B) Include them only in the full model:

    mod0 = model.matrix(~1, data=pheno)
    mod = model.matrix(~as.factor(inversion_genotype)+as.factor(Gender)+as.factor(Population), data=pheno)
    svobj = sva(edata,mod,mod0)

C) Do not include them at all (and expect to get some surrogate variables strongly correlated with these variables, in case they really affect gene expression):

    mod0 = model.matrix(~1, data=pheno)
    mod = model.matrix(~as.factor(inversion_genotype), data=pheno)
    svobj = sva(edata,mod,mod0)

To summarize: how should adjustment variables with a global effect be treated, and how should adjustment variables with a partial effect (only in a subset of genes) be treated?

I would really appreciate any piece of advice. Thanks a lot!
Meri

-- output of sessionInfo():

R version 2.15.2 (2012-10-26)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

-- Sent via the guest posting facility at bioconductor.org.
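
Option C in the question expects some surrogate variables to correlate strongly with gender or population. One way to check that expectation, sketched here in base R under stated assumptions, is to regress each surrogate variable on each known covariate and compare the R^2 values. The `pheno` data frame and the `sv` matrix below are simulated stand-ins (in a real analysis `sv` would be `svobj$sv`), and `sv_covariate_r2` is a hypothetical helper, not part of the sva package.

```r
# Simulated stand-ins: a real analysis would use the actual sample
# annotation for `pheno` and svobj$sv (from sva()) for `sv`.
set.seed(1)
n <- 60
pheno <- data.frame(
  inversion_genotype = factor(sample(c("STD", "HET", "INV"), n, replace = TRUE)),
  Gender = factor(rep(c("F", "M"), length.out = n)),
  Population = factor(rep(c("popA", "popB", "popC"), each = n / 3))
)

# Hypothetical surrogate variables: sv1 tracks Gender, sv2 is pure noise.
sv <- cbind(
  sv1 = as.numeric(pheno$Gender == "M") + rnorm(n, sd = 0.2),
  sv2 = rnorm(n)
)

# Hypothetical helper: R^2 of each surrogate variable regressed on each
# known covariate. Rows are surrogate variables, columns are covariates.
sv_covariate_r2 <- function(sv, pheno, covariates) {
  sapply(covariates, function(v) {
    apply(sv, 2, function(s) summary(lm(s ~ pheno[[v]]))$r.squared)
  })
}

r2 <- sv_covariate_r2(sv, pheno, c("Gender", "Population"))
print(round(r2, 2))
```

A high R^2 for a given pair suggests that the surrogate variable is largely capturing that covariate. This only describes the association; it does not by itself say which of the three options is the appropriate modelling choice.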

inveRsion sva • 2.3k views

12.7 years ago • Guest User ★ 13k