edgeR glm
1
0
Entering edit mode
@upendra-kumar-devisetty-5614
Last seen 9.6 years ago
Hi Gordon, For my experiment, I have three levels of factors for genotype ("B2, "Ein40","Ein9") and two levels of factors as treatment ("Sun", "Shade"). I did a nested interaction method to detect the differential gene expression. Basically i am interested in the following: 1. Those genes that are differentially expressed among all three genotypes. 2. Those genes that are differentially expressed between wild-type (B2) and two mutants(Ein40 & Ein9) for Shade treatment. Here is my code for the above: library(edgeR) library(reshape) counts <- read.delim("sam2countsResults.tsv",row.names=NULL) head(counts) counts<-counts[counts$gene!="*",] row.names(counts) <- counts$gene counts <- counts[,-1] head(counts) counts[is.na(counts)] <- 0 names(counts) summary(counts) sort(colSums(counts)) counts <- counts[,colSums(counts) > 100000] names(counts) dim(counts) samples <- data.frame( file=names(counts), gt=unlist(strsplit(names(counts),"[_\\.]"))[c(1,7,13,19,25,31,37,43,49 ,55,61,67,73,79,85,91,97,103)], trt=unlist(strsplit(names(counts),"[_\\.]"))[c(3,9,15,21,27,33,39,45,5 1,57,63,69,75,81,87,93,99,105)], rep=unlist(strsplit(names(counts),"[_\\.]"))[c(4,10,16,22,28,34,40,46, 52,58,64,70,76,82,88,94,100,106)] ) data.int <- DGEList(counts,group=group) data.int$samples data.int <- data.int[rowSumscpmdata.int) > 1) >= 3,] dimdata.int) data.int <- calcNormFactorsdata.int) design.int <- model.matrix(~gt*trt) rownamesdesign.int) <- colnamesdata.int) design.int data.int <- estimateGLMCommonDispdata.int,design.int) data.int <- estimateGLMTrendedDispdata.int,design.int) fit.int <- glmFitdata.int,design.int) colnamesfit.int$design) > colnamesfit.int$design) [1] "(Intercept)" "gtein40" "gtein9" "trtSun" "gtein40:trtSun" [6] "gtein9:trtSun" For my objective 1 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=2:3) topTags(int.lrt) and for my objective 2 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=5:6) topTags(int.lrt) Am i doing correct? Thanks in advance for your help. Upendra ------------------------------------------------- Upendra Kumar Devisetty, Ph.D Postdoctoral Scholar Dept. of Plant Biology, UC Davis Davis, CA 95616 [[alternative HTML version deleted]]
• 1.0k views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 13 minutes ago
WEHI, Melbourne, Australia
Dear Upendra, Statistical theory warns strongly against testing for main effects in a linear model when interactions are present, because it is too difficult to interpret exactly what you are testing. This what you are doing with: glmLRTdata.int,fit.int,coef=2:3) Unless you are very familiar and confident with interaction model formula, you may be better off following the advice in Chapter 3 of the edgeR User's Guide: http://bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc /edgeRUsersGuide.pdf Best wishes Gordon PS. Are you working with the current release of edgeR and Bioconductor? ------------ original message ----------- [BioC] edgeR glm upendra kumar devisetty udevisetty at ucdavis.edu Sun Nov 18 16:36:02 CET 2012 Hi Gordon, For my experiment, I have three levels of factors for genotype ("B2, "Ein40","Ein9") and two levels of factors as treatment ("Sun", "Shade"). I did a nested interaction method to detect the differential gene expression. Basically i am interested in the following: 1. Those genes that are differentially expressed among all three genotypes. 2. Those genes that are differentially expressed between wild-type (B2) and two mutants(Ein40 & Ein9) for Shade treatment. Here is my code for the above: library(edgeR) library(reshape) counts <- read.delim("sam2countsResults.tsv",row.names=NULL) head(counts) counts<-counts[counts$gene!="*",] row.names(counts) <- counts$gene counts <- counts[,-1] head(counts) counts[is.na(counts)] <- 0 names(counts) summary(counts) sort(colSums(counts)) counts <- counts[,colSums(counts) > 100000] names(counts) dim(counts) samples <- data.frame( file=names(counts), gt=unlist(strsplit(names(counts),"[_\\.]"))[c(1,7,13,19,25,31,37,43,49 ,55,61,67,73,79,85,91,97,103)], trt=unlist(strsplit(names(counts),"[_\\.]"))[c(3,9,15,21,27,33,39,45,5 1,57,63,69,75,81,87,93,99,105)], rep=unlist(strsplit(names(counts),"[_\\.]"))[c(4,10,16,22,28,34,40,46, 52,58,64,70,76,82,88,94,100,106)] ) data.int <- DGEList(counts,group=group) data.int$samples data.int <- data.int[rowSumscpmdata.int) > 1) >= 3,] dimdata.int) data.int <- calcNormFactorsdata.int) design.int <- model.matrix(~gt*trt) rownamesdesign.int) <- colnamesdata.int) design.int data.int <- estimateGLMCommonDispdata.int,design.int) data.int <- estimateGLMTrendedDispdata.int,design.int) fit.int <- glmFitdata.int,design.int) colnamesfit.int$design) > colnamesfit.int$design) [1] "(Intercept)" "gtein40" "gtein9" "trtSun" "gtein40:trtSun" [6] "gtein9:trtSun" For my objective 1 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=2:3) topTags(int.lrt) and for my objective 2 i am doing the following: int.lrt <- glmLRTdata.int,fit.int,coef=5:6) topTags(int.lrt) Am i doing correct? Thanks in advance for your help. Upendra ------------------------------------------------- Upendra Kumar Devisetty, Ph.D Postdoctoral Scholar Dept. of Plant Biology, UC Davis Davis, CA 95616 ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD COMMENT

Login before adding your answer.

Traffic: 774 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6