Question

edgeR glm

0

Entering edit mode

upendra kumar devisetty ▴ 30

@upendra-kumar-devisetty-5614

Last seen 11.4 years ago

Hi Gordon, For my experiment, I have three levels of factors for genotype ("B2, "Ein40","Ein9") and two levels of factors as treatment ("Sun", "Shade"). I did a nested interaction method to detect the differential gene expression. Basically i am interested in the following: 1. Those genes that are differentially expressed among all three genotypes. 2. Those genes that are differentially expressed between wild-type (B2) and two mutants(Ein40 & Ein9) for Shade treatment. Here is my code for the above: library(edgeR) library(reshape) counts <- read.delim("sam2countsResults.tsv",row.names=NULL) head(counts) counts<-counts[counts$gene!="*",] row.names(counts) <- counts$gene counts <- counts[,-1] head(counts) counts[is.na(counts)] <- 0 names(counts) summary(counts) sort(colSums(counts)) counts <- counts[,colSums(counts) > 100000] names(counts) dim(counts) samples <- data.frame( file=names(counts), gt=unlist(strsplit(names(counts),"[_\\.]"))[c(1,7,13,19,25,31,37,43,49 ,55,61,67,73,79,85,91,97,103)], trt=unlist(strsplit(names(counts),"[_\\.]"))[c(3,9,15,21,27,33,39,45,5 1,57,63,69,75,81,87,93,99,105)], rep=unlist(strsplit(names(counts),"[_\\.]"))[c(4,10,16,22,28,34,40,46, 52,58,64,70,76,82,88,94,100,106)] ) data.int <- DGEList(counts,group=group) data.int$samples data.int <- data.int[rowSumscpmdata.int) > 1) >= 3,] dimdata.int) data.int <- calcNormFactorsdata.int) design.int <- model.matrix(~gt*trt) rownamesdesign.int) <- colnamesdata.int) design.int data.int <- estimateGLMCommonDispdata.int,design.int) data.int <- estimateGLMTrendedDispdata.int,design.int) fit.int <- glmFitdata.int,design.int) colnamesfit.int$design) > colnamesfit.int$design) [1] "(Intercept)" "gtein40" "gtein9" "trtSun" "gtein40:trtSun" [6] "gtein9:trtSun" For my objective 1 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=2:3) topTags(int.lrt) and for my objective 2 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=5:6) topTags(int.lrt) Am i doing correct? Thanks in advance for your help. Upendra ------------------------------------------------- Upendra Kumar Devisetty, Ph.D Postdoctoral Scholar Dept. of Plant Biology, UC Davis Davis, CA 95616 [[alternative HTML version deleted]]

• 1.2k views

ADD COMMENT • link updated 13.2 years ago by Gordon Smyth 53k • written 13.2 years ago by upendra kumar devisetty ▴ 30

score 0 · Answer 1 · 2012-11-20

Dear Upendra, Statistical theory warns strongly against testing for main effects in a linear model when interactions are present, because it is too difficult to interpret exactly what you are testing. This what you are doing with: glmLRTdata.int,fit.int,coef=2:3) Unless you are very familiar and confident with interaction model formula, you may be better off following the advice in Chapter 3 of the edgeR User's Guide: http://bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc /edgeRUsersGuide.pdf Best wishes Gordon PS. Are you working with the current release of edgeR and Bioconductor? ------------ original message ----------- [BioC] edgeR glm upendra kumar devisetty udevisetty at ucdavis.edu Sun Nov 18 16:36:02 CET 2012 Hi Gordon, For my experiment, I have three levels of factors for genotype ("B2, "Ein40","Ein9") and two levels of factors as treatment ("Sun", "Shade"). I did a nested interaction method to detect the differential gene expression. Basically i am interested in the following: 1. Those genes that are differentially expressed among all three genotypes. 2. Those genes that are differentially expressed between wild-type (B2) and two mutants(Ein40 & Ein9) for Shade treatment. Here is my code for the above: library(edgeR) library(reshape) counts <- read.delim("sam2countsResults.tsv",row.names=NULL) head(counts) counts<-counts[counts$gene!="*",] row.names(counts) <- counts$gene counts <- counts[,-1] head(counts) counts[is.na(counts)] <- 0 names(counts) summary(counts) sort(colSums(counts)) counts <- counts[,colSums(counts) > 100000] names(counts) dim(counts) samples <- data.frame( file=names(counts), gt=unlist(strsplit(names(counts),"[_\\.]"))[c(1,7,13,19,25,31,37,43,49 ,55,61,67,73,79,85,91,97,103)], trt=unlist(strsplit(names(counts),"[_\\.]"))[c(3,9,15,21,27,33,39,45,5 1,57,63,69,75,81,87,93,99,105)], rep=unlist(strsplit(names(counts),"[_\\.]"))[c(4,10,16,22,28,34,40,46, 52,58,64,70,76,82,88,94,100,106)] ) data.int <- DGEList(counts,group=group) data.int$samples data.int <- data.int[rowSumscpmdata.int) > 1) >= 3,] dimdata.int) data.int <- calcNormFactorsdata.int) design.int <- model.matrix(~gt*trt) rownamesdesign.int) <- colnamesdata.int) design.int data.int <- estimateGLMCommonDispdata.int,design.int) data.int <- estimateGLMTrendedDispdata.int,design.int) fit.int <- glmFitdata.int,design.int) colnamesfit.int$design) > colnamesfit.int$design) [1] "(Intercept)" "gtein40" "gtein9" "trtSun" "gtein40:trtSun" [6] "gtein9:trtSun" For my objective 1 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=2:3) topTags(int.lrt) and for my objective 2 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=5:6) topTags(int.lrt) Am i doing correct? Thanks in advance for your help. Upendra ------------------------------------------------- Upendra Kumar Devisetty, Ph.D Postdoctoral Scholar Dept. of Plant Biology, UC Davis Davis, CA 95616 ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}