Limma for GoldenGate Methylation Cancer Panel I


Jinyan Huang ▴ 50

@jinyan-huang-4153


Does anyone have experience using limma for the two-color array GoldenGate Methylation Cancer Panel I (Illumina GoldenGate Methylation Cancer Panel)?

I used it to analyze methylation data to find differentially methylated genes, but the result is not good: there are too many small p-values in the result. There are biological replicates in my data. My R code is like this:

library(limma)
exp<-read.table("exp.txt",F)
sample_id<-read.table("sample_id",F)
row.names(exp)<-exp[,1]
exp<-exp[,-1]
design<-read.table("design.txt",F)
explow<-exp[,design[1,]==-1]
exphigh<-exp[,design[1,]==1]
expsort<-cbind(explow,exphigh)
idlow<-sample_id[design[1,]==-1]
idhigh<-sample_id[design[1,]==1]
idsort<-c(idlow,idhigh)
colred<-rep("red",length(exphigh[1,]))
collow<-rep("blue",length(explow[1,]))
col<-c(collow,colred)
MA<-as.matrix(expsort)
exp_norm<-normalizeBetweenArrays(MA,method="scale")
design_sort<-c(rep(-1,length(collow)),rep(1,length(colred)))
fit <- lmFit(MA,design_sort)
fit <- eBayes(fit)
mylist<-topTable(fit,number=Inf,adjust="BH")

Cancer limma • 1.1k views

ADD COMMENT • link 13.8 years ago Jinyan Huang ▴ 50


James W. MacDonald 65k

@james-w-macdonald-5106


United States

Hi Jinyan Huang,

On 7/6/2010 4:08 AM, Jinyan Huang wrote:

> design_sort<-c(rep(-1,length(collow)),rep(1,length(colred)))

Wow. That's a lot of code to end up with a two-column matrix consisting of a column of -1s and a column of 1s. Is there some reason that modelMatrix() doesn't do what you want?

I also suspect that the design matrix you came up with isn't correct for your experiment. I can't envision how the design matrix you have makes any sense. But without knowing the experimental design, I can't say for sure.

I would recommend finding a local statistician who might be able to help you with this analysis.

Best,

Jim

James W. MacDonald, M.S., Biostatistician, Douglas Lab, University of Michigan Department of Human Genetics
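For a plain two-group comparison of one value per probe per sample, a design with an intercept is a common starting point. The sketch below is only an illustration under stated assumptions: the data are simulated, the group labels `low` and `high` and all object names are hypothetical, and it uses base R's `model.matrix()` rather than limma's `modelMatrix()`, which builds designs from a two-color targets frame. The limma calls run only if the package is installed.

```r
# Simulated stand-in for a probes x samples matrix (not real data).
set.seed(1)
group <- factor(c("low", "low", "low", "high", "high", "high"),
                levels = c("low", "high"))
vals <- matrix(rnorm(100 * 6), nrow = 100,
               dimnames = list(paste0("probe", 1:100), paste0("s", 1:6)))

# Two-group design with an intercept:
# column 1 is the mean of the "low" group,
# column 2 is the high-minus-low difference.
design <- model.matrix(~ group)

# Fit the model only when limma is available.
# coef = 2 selects the group difference for testing.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(vals, design))
  top <- limma::topTable(fit, coef = 2, number = Inf, adjust.method = "BH")
  head(top)
}
```

With this coding the tested coefficient is the difference between group means, so a probe with equal means in both groups has an expected coefficient of zero.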

ADD COMMENT • link 13.8 years ago James W. MacDonald 65k


On Tue, Jul 6, 2010 at 9:31 AM, James W. MacDonald wrote:

> On 7/6/2010 4:08 AM, Jinyan Huang wrote:
>> exp_norm<-normalizeBetweenArrays(MA,method="scale")

And this normalization is almost certainly not going to get you what you want, assuming they are the "beta" values from Illumina.

Sean
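The reply does not say what to do instead. One widely used option for beta values, which are bounded between 0 and 1, is a base-2 logit transform to so-called M-values before linear modelling. The sketch below assumes that approach; the function name `beta_to_m` and the `offset` argument are hypothetical and not part of limma or methylumi.

```r
# Convert beta values in [0, 1] to M-values (base-2 logit).
# `offset` is an assumed small constant that keeps exact 0 and 1 finite.
beta_to_m <- function(beta, offset = 1e-6) {
  beta <- pmin(pmax(beta, offset), 1 - offset)
  log2(beta / (1 - beta))
}

# Toy values, not real data: column 1 is 0.5 and 0.8, column 2 is 0 and 1.
beta <- matrix(c(0.5, 0.8, 0, 1), nrow = 2)
m <- beta_to_m(beta)
```

A beta of 0.5 maps to an M-value of 0, and the clamping step keeps fully unmethylated or fully methylated probes from becoming infinite.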

ADD REPLY • link 13.8 years ago Sean Davis 21k


Hi Jinyan,

I believe limma is suitable for GoldenGate Cancer Panel data, assuming you are using an appropriate normalization method and design matrix. Perhaps you should consider using the 'methylumi' package to preprocess (QA and normalization) your data before proceeding to the differential methylation analysis.

Chao-Jen

Chao-Jen Wong, Program in Computational Biology, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center

ADD REPLY • link 13.8 years ago Chao-Jen Wong ▴ 580


Jinyan Huang ▴ 50

@jinyan-huang-4153


Thanks for your comments. This is the first time I have used the limma package, so I am very worried that I made a mistake. The result of my analysis is not very reasonable: there are too many small p-values. I just want to use limma to find the differentially methylated genes between two phenotypes. I am not sure that limma is suitable for this kind of data.

Thanks.

On Tue, Jul 6, 2010 at 3:59 PM, James MacDonald wrote:

>> Jinyan Huang wrote:
>> But in the usersguide.pdf, for the "Swirl Zebrafish" example, design is defined as: design <- c(-1,1,-1,1). I just followed this example.
>
> My bad. I saw a cbind() where you actually had a c().
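In the swirl example the values are two-color log-ratios from dye-swap pairs, so a single column of -1 and 1 with no intercept tests whether the log-ratio differs from zero. Applied to one value per sample, the same coding behaves differently. The toy numbers below are invented for illustration, and base R's `lm()` stands in for the per-probe fit; both groups have the same mean, yet the no-intercept coefficient is not zero because the group sizes differ.

```r
# Two "low" samples (-1) and four "high" samples (+1); both groups average 0.5.
x <- c(-1, -1, 1, 1, 1, 1)
y <- c(0.4, 0.6, 0.4, 0.6, 0.5, 0.5)

# Single +/-1 column with no intercept, as in a one-column design vector.
coef_no_intercept <- unname(coef(lm(y ~ 0 + x)))

# Same coding plus an intercept: the slope now reflects the group difference.
coef_with_intercept <- unname(coef(lm(y ~ x))["x"])

coef_no_intercept    # about 0.167 although the group means are equal
coef_with_intercept  # about 0
```

This is only a sketch of one way the coding can mislead; whether it explains the small p-values in the original analysis cannot be determined from the thread.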

ADD COMMENT • link 13.8 years ago Jinyan Huang ▴ 50
