single color using Limma : very low p-value "1.40456052121752e-60"
1
0
Entering edit mode
bigoun ▴ 40
@bigoun-4795
Last seen 9.7 years ago
Dear all, I'm analysing data single color data (only intensities at F532) using limma. I have 14 arrays in one group and 5 in the other group. I want to compare them so my targets file is as follows FileName Mal Sain M2.gpr 1 0 M3.gpr 1 0 M4.gpr 1 0 M5.gpr 1 0 M6.gpr 1 0 M7.gpr 1 0 M8.gpr 1 0 M9.gpr 1 0 M10.gpr 1 0 M11.gpr 1 0 M12.gpr 1 0 M13.gpr 1 0 M14.gpr 1 0 M15.gpr 1 0 S1.gpr 0 1 S2.gpr 0 1 S3.gpr 0 1 S4.gpr 0 1 S5.gpr 0 1 I use some correction and normalisation methods as recommanded in limma user guide. Afterthat, I make my design and as I have 2 technical replicate (2 spot for the same miRNA) for each array, I wanted to compute duplicatecorrelation coefficient .GAL file as follows : ATF 1 52 5 Type=GenePix ArrayList V1.0 BlockCount=48 BlockType=0 URL=http://genome-www4.stanford.edu/cgi-bin/SGD/locus.pl?locus=[ID] "Block1=2645, 2295, 100, 12, 300, 8, 300" "Block2=7145, 2295, 100, 12, 300, 8, 300" "Block3=11555, 2235, 100, 12, 300, 8, 300" "Block4=16055, 2235, 100, 12, 300, 8, 300" The corfit$consensus is not good (0,36). Do I still use this parameters for lmfit??? Anyway, the more strange is with or without this parameters I have p-adjust value very very low. first line of my result : Block Row Column ID Name X.Intercept. factor.pData.population.sain AveExpr F P.Value adj.P.Val 39 2 10 42554 hsa-miR-923 13.4125293049973 -0.0517387609947947 13.3989138415776 25264.4487822867 1.20563134868457e-63 1.40456052121752e-60 Is anybody help me and what is wrong. As you can see in UNFILTERED section, I have less than 50% genes for the analysis. Could it be the explanation of these strange result?? Thanks in advance My srcipt: library(limma) myFilter = function(X) { H_threshold=2 okFLAG = X$Flags > -49; okm1 =abs(X[,"F532 Median"]-X[,"F532 Mean"]) okm2 = 0.5*(X[,"F532 Median"]+X[,"F532 Mean"]) okH = ((okm1)/okm2) <h_threshold as.numeric(okflag="" &="" okh)}="" targets="readTargets(file=" targets.txt","="" path="NULL," sep="\t" )="" e="read.maimages(targets,source=" genepix",wt.fun="myFilter,columns=list"" (e="F532 Mean" ,eb="B532 Mean" ,names="Name" ))="" #taux="" de="" filtrage="" unfiltered="apply(E$weights,MARGIN=2,FUN=mean)" round(unfiltered,2)="" #="" m2="" m3="" m4="" m5="" m6="" m7="" m8="" m9="" m10="" m11="" m12="" m13="" m14="" m15="" s1="" s2="" #0.41="" 0.43="" 0.34="" 0.29="" 0.51="" 0.35="" 0.48="" 0.35="" 0.29="" 0.56="" 0.38="" 0.56="" 0.61="" 0.52="" 0.48="" 0.34="" #="" s3="" s4="" s5="" #0.36="" 0.46="" 0.47="" #boite="" a="" moustache="" avant="" correction="" boxplot(as.data.frame(e$e),main="Mean intensities" )="" #correction="" background="" -="" normexp="" enorm="" <-="" backgroundcorrect(e,="" method="normexp" ,offset="1)" boxplot(as.data.frame(enorm$e),="" main="Mean intensities - normexp" )="" #normalisation="" quantile="" norme="" <-="" normalizebetweenarrays(enorm,method="quantile" )="" boxplot(as.data.frame(norme$e),="" main="Normalized intensities" )="" norme$e="" <-="" log2(norme$e)="" pdata="" <-="" data.frame(population="c('mal','mal'," 'mal',="" 'mal','mal','mal','mal','mal','mal','mal','mal','mal','mal','mal','sai="" n','sain','sain','sain','sain'))="" #="" create="" design="" matrix="" design="" <-="" model.matrix(~factor(pdata$population))="" #="" in="" my="" .gpr="" files="" all="" mirnas="" contain="" the="" string="" "mir"="" in="" their="" name="" #="" so="" the="" grep="" function="" can="" be="" used="" to="" extract="" all="" of="" these,="" removing="" #="" all="" control="" signals="" and="" printing="" buffers="" etc.="" #="" you="" need="" to="" check="" your="" .gpr="" files="" to="" find="" which="" signals="" you="" should="" extract.="" mirs="" <-="" c(grep("-mir-",="" norme$genes$name),="" grep("-let-",="" norme$genes$name))="" norme.final="" <-="" norme[mirs,="" ]="" norme.final="" <-norme.final[order(norme.final$genes$name),="" ]="" norme.final$genes$name="" #="" calculate="" duplicate="" correlation="" between="" the="" 2="" replicates="" on="" each="" array="" corfit="" <-="" duplicatecorrelation(norme.final,="" design,="" ndups="2)" corfit$consensus="" #="" show="" a="" boxplot="" of="" the="" correlations="" boxplot(tanh(corfit$atanh.correlations))="" #="" fit="" the="" linear="" model,="" including="" info="" on="" duplicates="" and="" correlation="" fit="" <-="" lmfit(norme.final,="" design,="" ndups="2," correlation="corfit$consensus)" #fit="" <-="" lmfit(norme.final,="" design)="" #="" calculate="" values="" using="" ebayes="" ebayes="" <-="" ebayes(fit)="" #="" output="" a="" list="" of="" top="" differnetially="" expressed="" genes...="" result="topTable(ebayes,number=4608," p="0.05,adjust" =="" "bh")="" genes="result$Status==" genes""="" write.table(result,"gene.txt",quote="FALSE,sep=" \t",row.names="FALSE,"" col.names="TRUE)" mikel="" [[alternative="" html="" version="" deleted]]="" <="" div="">
• 1.2k views
ADD COMMENT
0
Entering edit mode
@sean-davis-490
Last seen 3 months ago
United States
I didn't go through the code in great detail, but since these are 1-color data, make sure that you are not testing that the intercept is different than 0 (and it looks like you probably are). Sean On Wed, Aug 10, 2011 at 5:11 AM, bigoun <bigounn at="" yahoo.fr=""> wrote: > Dear all, > > I'm analysing data single color data (only intensities at F532) using limma. > > I have 14 arrays in one group and 5 in the other group. I want to compare them > so my targets file is as follows > FileName Mal Sain > M2.gpr 1 0 > M3.gpr 1 0 > M4.gpr 1 0 > M5.gpr 1 0 > M6.gpr 1 0 > M7.gpr 1 0 > M8.gpr 1 0 > M9.gpr 1 0 > M10.gpr 1 0 > M11.gpr 1 0 > M12.gpr 1 0 > M13.gpr 1 0 > M14.gpr 1 0 > M15.gpr 1 0 > S1.gpr 0 1 > S2.gpr 0 1 > S3.gpr 0 1 > S4.gpr 0 1 > S5.gpr 0 1 > > I use some correction and normalisation methods as recommanded in limma user > guide. Afterthat, I make my design and as I have 2 technical replicate (2 spot > for the same miRNA) for each array, I wanted to compute duplicatecorrelation > coefficient > > > .GAL file as follows : > ATF 1 > 52 5 > Type=GenePix ArrayList V1.0 > BlockCount=48 > BlockType=0 > URL=http://genome-www4.stanford.edu/cgi-bin/SGD/locus.pl?locus=[ID] > "Block1=2645, 2295, 100, 12, 300, 8, 300" > "Block2=7145, 2295, 100, 12, 300, 8, 300" > "Block3=11555, 2235, 100, 12, 300, 8, 300" > "Block4=16055, 2235, 100, 12, 300, 8, 300" > > > The ?corfit$consensus is not good (0,36). Do I still use this parameters for > lmfit??? > > Anyway, the more strange is with or without this parameters I have p-adjust > value very very low. > > first line of my result : > Block Row Column ID Name X.Intercept. factor.pData.population.sain AveExpr F P.Value adj.P.Val > > 39 2 10 42554 hsa-miR-923 13.4125293049973 -0.0517387609947947 13.3989138415776 25264.4487822867 1.20563134868457e-63 1.40456052121752e-60 > > > Is anybody help me and what is wrong. > > As you can see in UNFILTERED section, I have less than 50% genes for the > analysis. Could it be the explanation of these strange result?? > > Thanks in advance > > My srcipt: > library(limma) > myFilter = function(X) { > H_threshold=2 > okFLAG = X$Flags > -49; > okm1 =abs(X[,"F532 Median"]-X[,"F532 Mean"]) > okm2 = 0.5*(X[,"F532 Median"]+X[,"F532 Mean"]) > okH = ((okm1)/okm2) <h_threshold> as.numeric(okFLAG & okH)} > > targets=readTargets(file="targets.txt", path=NULL, sep="\t") > > E =read.maimages(targets,source="genepix",wt.fun=myFilter,columns=li st(E="F532 > Mean",Eb="B532 Mean",Names="Name")) > > #Taux de filtrage > unFiltered = apply(E$weights,MARGIN=2,FUN=mean) > round(unFiltered,2) > > # ?M2 ? M3 ? M4 ? M5 ? M6 ? M7 ? M8 ? M9 ?M10 ?M11 ?M12 ?M13 ?M14 ?M15 ? S1 ? S2 > > #0.41 0.43 0.34 0.29 0.51 0.35 0.48 0.35 0.29 0.56 0.38 0.56 0.61 0.52 0.48 0.34 > > # ?S3 ? S4 ? S5 > #0.36 0.46 0.47 > > #Boite a moustache avant correction > boxplot(as.data.frame(E$E),main="Mean intensities") > > #Correction background - normexp > Enorm <- backgroundCorrect(E, method="normexp",offset=1) > boxplot(as.data.frame(Enorm$E), main="Mean intensities - normexp") > #Normalisation quantile > NormE <- normalizeBetweenArrays(Enorm,method="quantile") > boxplot(as.data.frame(NormE$E), main="Normalized intensities") > > NormE$E <- log2(NormE$E) > > pData <- data.frame(population = c('mal','mal', 'mal', > 'mal','mal','mal','mal','mal','mal','mal','mal','mal','mal','mal','s ain','sain','sain','sain','sain')) > > > # create design matrix > design <- model.matrix(~factor(pData$population)) > > # In my .gpr files all miRNAs contain the string "miR" in their name > # so the grep function can be used to extract all of these, removing > # all control signals and printing buffers etc. > # You need to check your .gpr files to find which signals you should extract. > miRs <- c(grep("-miR-", NormE$genes$Name), grep("-let-", NormE$genes$Name)) > NormE.final <- NormE[miRs, ] > NormE.final <-NormE.final[order(NormE.final$genes$Name), ] > NormE.final$genes$Name > > # calculate duplicate correlation between the 2 replicates on each array > corfit <- duplicateCorrelation(NormE.final, design, ndups=2) > corfit$consensus > # show a boxplot of the correlations > boxplot(tanh(corfit$atanh.correlations)) > > # fit the linear model, including info on duplicates and correlation > fit <- lmFit(NormE.final, design, ndups=2, ?correlation=corfit$consensus) > #fit <- lmFit(NormE.final, design) > # calculate values using ebayes > ebayes <- eBayes(fit) > > # output a list of top differnetially expressed genes... > result=topTable(ebayes,number=4608, p=0.05,adjust = "BH") > genes=result$Status=="genes" > write.table(result,"gene.txt",quote=FALSE,sep="\t",row.names=FALSE, > col.names=TRUE) > > > mikel > ? ? ? ?[[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD COMMENT
0
Entering edit mode
Agreed. Specifying the second coefficient (which excludes the intercept term) in topTable should do what you want: result=topTable(ebayes,coef=2,number=4608, p=0.05,adjust = "BH") Alex On Wed, 10 Aug 2011 13:21:00 -0400, Sean Davis wrote: > I didn't go through the code in great detail, but since these are > 1-color data, make sure that you are not testing that the intercept > is > different than 0 (and it looks like you probably are). > > Sean > > > On Wed, Aug 10, 2011 at 5:11 AM, bigoun <bigounn at="" yahoo.fr=""> wrote: >> Dear all, >> >> I'm analysing data single color data (only intensities at F532) >> using limma. >> >> I have 14 arrays in one group and 5 in the other group. I want to >> compare them >> so my targets file is as follows >> FileName Mal Sain >> M2.gpr 1 0 >> M3.gpr 1 0 >> M4.gpr 1 0 >> M5.gpr 1 0 >> M6.gpr 1 0 >> M7.gpr 1 0 >> M8.gpr 1 0 >> M9.gpr 1 0 >> M10.gpr 1 0 >> M11.gpr 1 0 >> M12.gpr 1 0 >> M13.gpr 1 0 >> M14.gpr 1 0 >> M15.gpr 1 0 >> S1.gpr 0 1 >> S2.gpr 0 1 >> S3.gpr 0 1 >> S4.gpr 0 1 >> S5.gpr 0 1 >> >> I use some correction and normalisation methods as recommanded in >> limma user >> guide. Afterthat, I make my design and as I have 2 technical >> replicate (2 spot >> for the same miRNA) for each array, I wanted to compute >> duplicatecorrelation >> coefficient >> >> >> .GAL file as follows : >> ATF 1 >> 52 5 >> Type=GenePix ArrayList V1.0 >> BlockCount=48 >> BlockType=0 >> URL=http://genome-www4.stanford.edu/cgi-bin/SGD/locus.pl?locus=[ID] >> "Block1=2645, 2295, 100, 12, 300, 8, 300" >> "Block2=7145, 2295, 100, 12, 300, 8, 300" >> "Block3=11555, 2235, 100, 12, 300, 8, 300" >> "Block4=16055, 2235, 100, 12, 300, 8, 300" >> >> >> The ?corfit$consensus is not good (0,36). Do I still use this >> parameters for >> lmfit??? >> >> Anyway, the more strange is with or without this parameters I have >> p-adjust >> value very very low. >> >> first line of my result : >> Block Row Column ID Name X.Intercept. factor.pData.population.sain >> AveExpr F P.Value adj.P.Val >> >> 39 2 10 42554 hsa-miR-923 13.4125293049973 -0.0517387609947947 >> 13.3989138415776 25264.4487822867 1.20563134868457e-63 >> 1.40456052121752e-60 >> >> >> Is anybody help me and what is wrong. >> >> As you can see in UNFILTERED section, I have less than 50% genes for >> the >> analysis. Could it be the explanation of these strange result?? >> >> Thanks in advance >> >> My srcipt: >> library(limma) >> myFilter = function(X) { >> H_threshold=2 >> okFLAG = X$Flags > -49; >> okm1 =abs(X[,"F532 Median"]-X[,"F532 Mean"]) >> okm2 = 0.5*(X[,"F532 Median"]+X[,"F532 Mean"]) >> okH = ((okm1)/okm2) <h_threshold>> as.numeric(okFLAG & okH)} >> >> targets=readTargets(file="targets.txt", path=NULL, sep="\t") >> >> E >> =read.maimages(targets,source="genepix",wt.fun=myFilter,columns=lis t(E="F532 >> Mean",Eb="B532 Mean",Names="Name")) >> >> #Taux de filtrage >> unFiltered = apply(E$weights,MARGIN=2,FUN=mean) >> round(unFiltered,2) >> >> # ?M2 ? M3 ? M4 ? M5 ? M6 ? M7 ? M8 ? M9 ?M10 ?M11 ?M12 ?M13 ?M14 >> ?M15 ? S1 ? S2 >> >> #0.41 0.43 0.34 0.29 0.51 0.35 0.48 0.35 0.29 0.56 0.38 0.56 0.61 >> 0.52 0.48 0.34 >> >> # ?S3 ? S4 ? S5 >> #0.36 0.46 0.47 >> >> #Boite a moustache avant correction >> boxplot(as.data.frame(E$E),main="Mean intensities") >> >> #Correction background - normexp >> Enorm <- backgroundCorrect(E, method="normexp",offset=1) >> boxplot(as.data.frame(Enorm$E), main="Mean intensities - normexp") >> #Normalisation quantile >> NormE <- normalizeBetweenArrays(Enorm,method="quantile") >> boxplot(as.data.frame(NormE$E), main="Normalized intensities") >> >> NormE$E <- log2(NormE$E) >> >> pData <- data.frame(population = c('mal','mal', 'mal', >> >> 'mal','mal','mal','mal','mal','mal','mal','mal','mal','mal','mal',' sain','sain','sain','sain','sain')) >> >> >> # create design matrix >> design <- model.matrix(~factor(pData$population)) >> >> # In my .gpr files all miRNAs contain the string "miR" in their name >> # so the grep function can be used to extract all of these, removing >> # all control signals and printing buffers etc. >> # You need to check your .gpr files to find which signals you should >> extract. >> miRs <- c(grep("-miR-", NormE$genes$Name), grep("-let-", >> NormE$genes$Name)) >> NormE.final <- NormE[miRs, ] >> NormE.final <-NormE.final[order(NormE.final$genes$Name), ] >> NormE.final$genes$Name >> >> # calculate duplicate correlation between the 2 replicates on each >> array >> corfit <- duplicateCorrelation(NormE.final, design, ndups=2) >> corfit$consensus >> # show a boxplot of the correlations >> boxplot(tanh(corfit$atanh.correlations)) >> >> # fit the linear model, including info on duplicates and correlation >> fit <- lmFit(NormE.final, design, ndups=2, >> ?correlation=corfit$consensus) >> #fit <- lmFit(NormE.final, design) >> # calculate values using ebayes >> ebayes <- eBayes(fit) >> >> # output a list of top differnetially expressed genes... >> result=topTable(ebayes,number=4608, p=0.05,adjust = "BH") >> genes=result$Status=="genes" >> write.table(result,"gene.txt",quote=FALSE,sep="\t",row.names=FALSE, >> col.names=TRUE) >> >> >> mikel >> ? ? ? ?[[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor -- Alex Gutteridge
ADD REPLY

Login before adding your answer.

Traffic: 501 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6