Entering edit mode
Justin Jeyakani GIS
▴
60
@justin-jeyakani-gis-6461
Last seen 10.5 years ago
Hello Sir,
I'm doing differential gene exp. by the edgeR package. My data seems
to suitable to analyse by glm method (As per edgeR userguide section
3.5 Comparisons Both Between and Within Subjects).
I have two different matrix to design.
Matrix One:
I have 12 dataset 3 samples from different healthy individuals
(Healthy) of two different stages (npc,ne) and 3 samples from
different individuals (Disease1) of two different stages (npc,ne), so
6 data from each healthy(6) and Disese1 (6) below is my code and works
well. But for my another design matrix it's not!
Matrix Two:
But I want to comapre the All the Healthy (6data from 3 indiauals)
with each of the Disease1 individual (npc,ne). B'ze I'm getting 125
diff.exp genes btn healthyVsDisease1 and I'm expecting more, If I do
individual test of 3 disease1 with Healthy but the I'm getting error
in the "estimateGLMCommonDisp" steps. I couldn't solve the problem. I
have the same problem with my another matrix also. Can guide me to
solve this issue highly appreciated ... below is my code for the above
mentiond two matrix...do I need to define anthing in the generate
factor Level "gl" further? Since Healthy has 3 sets and Disease 1, 2,
3 has one set causing dispersion error.
Matrix1.txt: my input (works fine)
Disease Patient Treatment
1 Healthy 1 npc
2 Healthy 1 ne
3 Healthy 2 npc
4 Healthy 2 ne
5 Healthy 3 npc
6 Healthy 3 ne
7 Disease1 4 npc
8 Disease1 4 ne
9 Disease1 5 npc
10 Disease1 5 ne
11 Disease1 6 npc
12 Disease1 6 ne
SUPT1<-read.table("matrix1.txt",header=TRUE)
summary(SUPT1)
Patient<-gl(3,2,length=12)
Disease <- factor(SUPT1$Disease, levels=c("Healthy","Disease1"))
Treatment <- factor(SUPT1$Treatment, levels=c("npc","ne"))
data.frame(Disease,Patient,Treatment)
Disease Patient Treatment
1 Healthy 1 npc
2 Healthy 1 ne
3 Healthy 2 npc
4 Healthy 2 ne
5 Healthy 3 npc
6 Healthy 3 ne
7 Disease1 1 npc
8 Disease1 1 ne
9 Disease1 2 npc
10 Disease1 2 ne
11 Disease1 3 npc
12 Disease1 3 ne
design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment)
colnames(design)
[1] "(Intercept)" "DiseaseDisease1"
[3] "DiseaseHealthy:Patient2" "DiseaseDisease1:Patient2"
[5] "DiseaseHealthy:Patient3" "DiseaseDisease1:Patient3"
[7] "DiseaseHealthy:Treatmentne" "DiseaseDisease1:Treatmentne"
library(edgeR)
count<-read.table("RHN035-46_s1.txt",header=TRUE)
head (count)
colnames(count)
samplename=colnames(count)
cds01<-DGEList(count,group=samplename)
head(cds01)
cds01
summary(cds01)
dim(cds01)
keep<-rowSums(cpm(cds01)>2)>=4
cds01<-cds01[keep,]
dim(cds01)
cds01$sample$lib.size<-colSums(cds01$counts)
y <- estimateGLMCommonDisp(cds01,design)
y <- estimateGLMTrendedDisp(y,design)
y <- estimateGLMTagwiseDisp(y,design)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef="DiseaseHealthy:Treatmentne")
topTags(lrt)
detags <- rownames(topTags(lrt)$table)
cpm(y)[detags, order(y$samples$group)]
summary(de <- decideTestsDGE(lrt))
lrt <- glmLRT(fit, coef="DiseaseDisease1:Treatmentne")
topTags(lrt)
detags <- rownames(topTags(lrt)$table)
cpm(y)[detags, order(y$samples$group)]
summary(de <- decideTestsDGE(lrt))
lrt <- glmLRT(fit, contrast=c(0,0,0,0,0,0,-1,1))
topTags(lrt)
detags <- rownames(topTags(lrt)$table)
cpm(y)[detags, order(y$samples$group)]
summary(de <- decideTestsDGE(lrt))
Matrix2.txt:my input (gives Error) ( I have split the Disease1 into 3
Disease sets and need to compare Healthy Vs Disease1/Disease2/Disese3
and genes respond to "ne" in Disease1/2/3)
Disease Patient Treatment
1 Healthy 1 npc
2 Healthy 1 ne
3 Healthy 2 npc
4 Healthy 2 ne
5 Healthy 3 npc
6 Healthy 3 ne
7 Disease1 4 npc
8 Disease1 4 ne
9 Disease2 5 npc
10 Disease2 5 ne
11 Disease3 6 npc
12 Disease3 6 ne
SUPT1<-read.table("matrix2.txt",header=TRUE)
summary(SUPT1)
Patient<-gl(3,2,length=12)
Disease <- factor(SUPT1$Disease,
levels=c("Healthy","Disease1","Disease2","Disease3"))
Treatment <- factor(SUPT1$Treatment, levels=c("npc","ne"))
data.frame(Disease,Patient,Treatment)
Disease Patient Treatment
1 Healthy 1 npc
2 Healthy 1 ne
3 Healthy 2 npc
4 Healthy 2 ne
5 Healthy 3 npc
6 Healthy 3 ne
7 Disease1 1 npc
8 Disease1 1 ne
9 Disease2 2 npc
10 Disease2 2 ne
11 Disease3 3 npc
12 Disease3 3 ne
design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment)
colnames(design)
[1] "(Intercept)" "DiseaseDisease1"
[3] "DiseaseDisease2" "DiseaseDisease3"
[5] "DiseaseHealthy:Patient2" "DiseaseDisease1:Patient2"
[7] "DiseaseDisease2:Patient2" "DiseaseDisease3:Patient2"
[9] "DiseaseHealthy:Patient3" "DiseaseDisease1:Patient3"
[11] "DiseaseDisease2:Patient3" "DiseaseDisease3:Patient3"
[13] "DiseaseHealthy:Treatmentne" "DiseaseDisease1:Treatmentne"
[15] "DiseaseDisease2:Treatmentne" "DiseaseDisease3:Treatmentne"
library(edgeR)
count<-read.table("RHN035-46_s1.txt",header=TRUE)
head (count)
colnames(count)
samplename=colnames(count)
cds01<-DGEList(count,group=samplename)
head(cds01)
cds01
summary(cds01)
dim(cds01)
keep<-rowSums(cpm(cds01)>2)>=4
cds01<-cds01[keep,]
dim(cds01)
cds01$sample$lib.size<-colSums(cds01$counts)
y <- estimateGLMCommonDisp(cds01,design)
Warning message:
In estimateGLMCommonDisp.default(y = y$counts, design = design, :
No residual df: setting dispersion to N
Matri3.txt- my input (gives Error)
Disease Patient Treatment
1 Healthy 1 npc
2 Healthy 1 ne
3 Healthy 2 npc
4 Healthy 2 ne
5 Healthy 3 npc
6 Healthy 3 ne
7 Healthy 4 npc
8 Healthy 4 ne
9 Disease1 5 npc
10 Disease1 5 ne
11 Disease2 6 npc
12 Disease2 6 ne
13 Disease2 7 npc
14 Disease2 7 ne
15 Disease3 8 npc
16 Disease3 8 ne
SUPT1<-read.table("matrix2.txt",header=TRUE)
summary(SUPT1)
Patient<-gl(4,2,length=16)
Disease <- factor(SUPT1$Disease,
levels=c("Healthy","Disease1","Disease2","Disease3"))
Treatment <- factor(SUPT1$Treatment, levels=c("npc","ne"))
data.frame(Disease,Patient,Treatment)
Disease Patient Treatment
1 Healthy 1 npc
2 Healthy 1 ne
3 Healthy 2 npc
4 Healthy 2 ne
5 Healthy 3 npc
6 Healthy 3 ne
7 Healthy 4 npc
8 Healthy 4 ne
9 Disease1 1 npc
10 Disease1 1 ne
11 Disease2 2 npc
12 Disease2 2 ne
13 Disease2 3 npc
14 Disease2 3 ne
15 Disease3 4 npc
16 Disease3 4 ne
design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment)
colnames(design)
[1] "(Intercept)" "DiseaseDisease1"
[3] "DiseaseDisease2" "DiseaseDisease3"
[5] "DiseaseHealthy:Patient2" "DiseaseDisease1:Patient2"
[7] "DiseaseDisease2:Patient2" "DiseaseDisease3:Patient2"
[9] "DiseaseHealthy:Patient3" "DiseaseDisease1:Patient3"
[11] "DiseaseDisease2:Patient3" "DiseaseDisease3:Patient3"
[13] "DiseaseHealthy:Patient4" "DiseaseDisease1:Patient4"
[15] "DiseaseDisease2:Patient4" "DiseaseDisease3:Patient4"
[17] "DiseaseHealthy:Treatmentne" "DiseaseDisease1:Treatmentne"
[19] "DiseaseDisease2:Treatmentne" "DiseaseDisease3:Treatmentne"
library(edgeR)
count<-read.table("RHN035-46-47-53_51-57_s1-s2gtf_tophatuniq.count.txt
",header=TRUE)
head (count)
colnames(count)
samplename=colnames(count)
cds01<-DGEList(count,group=samplename)
head(cds01)
cds01
summary(cds01)
dim(cds01)
keep<-rowSums(cpm(cds01)>1)>=4
cds01<-cds01[keep,]
dim(cds01)
cds01$sample$lib.size<-colSums(cds01$counts)
y <- estimateGLMCommonDisp(cds01,design)
Error in return(NA, ntags) : multi-argument returns are not permitted
In addition: Warning message:
In estimateGLMTrendedDisp.default(y = y$counts, design = design, :
No residual df: cannot estimate dispersion
Thanks and regards,
Justin
-------------------------------
This e-mail and any attachments are only for the use of the intended
recipient and may be confidential and/or privileged. If you are not
the recipient, please delete it or notify the sender immediately.
Please do not copy or use it for any purpose or disclose the contents
to any other person as it may be an offence under the Official Secrets
Act.
-------------------------------
[[alternative HTML version deleted]]