Hi,
I am using the following code to compare two samples and find the differentially expressed genes between them. There are 3 replicates for each sample. After completing the analysis, all the logFC values are negative, and I don't know why. I went back and checked the raw counts for these genes, and the logFC values do not seem to match them. Can you please tell me if I am doing anything wrong? Also, why is the p-value 0 (zero)? Thank you.
Here is the code that I am using:
#load edgeR and import data
library(edgeR)
raw.data<-read.table("bcytvsgut.count",header=TRUE)
group<-c("Bcyt","Bcyt","Bcyt","mgut","mgut","mgut")
#make DGE list
cds<-DGEList(raw.data,group=group)
names(cds)
head(cds$counts)
cds$samples
#normalize data using TMM
cds <- calcNormFactors( cds )
cds$samples
#data filtering
cps <- cpm(cds)
k <- rowSums(cps>=10)>3
d <- cds[k,]
#reset the library size
d$samples$lib.size <- colSums(d$counts)
cols <- as.numeric(d$samples$group)
par(mfrow=c(2,2))
plotMDS(d,col=cols)
#set up design
design <- model.matrix(~0+group, data=d$samples)
design
d <- estimateGLMCommonDisp(d,design,verbose=T)
d <- estimateGLMTrendedDisp(d,design)
d <- estimateGLMTagwiseDisp(d,design)
plotBCV(d)
plotMeanVar(d,show.raw=TRUE, show.tagwise=TRUE, show.binned=TRUE)
#calculate differentially expressed genes
fit <- glmFit(d,design)
lrt <- glmLRT(fit)
topTags(lrt)
names(lrt)
lrt$comparison
head(lrt$table)
#export data
saveme<-topTags( lrt, n = Inf , sort.by = "none" )
write.table(saveme,file="EdgeR_bcytvsgut_hiseq.txt",sep="\t",quote=F)
plotSmear(d)
abline(h = c(-2, 2), col = "blue")
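A possible explanation for the all-negative logFC values, offered as a sketch rather than a confirmed diagnosis: with a design of ~0+group, calling glmLRT(fit) without coef or contrast tests the last column of the design on its own. The reported logFC is then that group's log2 abundance coefficient rather than a difference between the two groups, which would give negative values and extremely small p-values for every gene. The example below uses simulated counts (an assumption, not the data from this post) and an explicit mgut versus Bcyt contrast; the object names y, con and lrt are illustrative only.

```r
library(edgeR)

# Simulated counts for illustration only: 1000 genes, 6 libraries
set.seed(1)
counts <- matrix(rnbinom(6000, mu = 100, size = 10), ncol = 6)
group <- factor(c("Bcyt", "Bcyt", "Bcyt", "mgut", "mgut", "mgut"))

y <- DGEList(counts, group = group)
y <- calcNormFactors(y)

# No-intercept design: one column per group
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

y <- estimateDisp(y, design)
fit <- glmFit(y, design)

# Compare the two groups explicitly: logFC is mgut relative to Bcyt
con <- makeContrasts(mgut - Bcyt, levels = design)
lrt <- glmLRT(fit, contrast = con)
topTags(lrt)
```

With a contrast like this, a gene with higher counts in mgut than in Bcyt should get a positive logFC, and one with lower counts a negative logFC.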
Discrepancies that I am seeing between the raw data and the results:
Raw data (columns in the order of the group vector above):
Gene | Bcyt | Bcyt | Bcyt | mgut | mgut | mgut |
G000241 | 1580 | 1726 | 1805 | 94 | 82 | 244 |
G000270 | 34849 | 34345 | 36117 | 50908 | 44831 | 46246 |
Differential expression result:
Gene | logFC | logCPM | LR | PValue | FDR |
G000241 | -16.2926 | 6.382334 | 1060.65 | 1.18E-232 | 1.20E-232 |
G000270 | -7.90819 | 11.8414 | 5135.262 | 0 | 0 |
Sorry, the data above doesn't make sense as displayed; here is an updated version of the sample data: