limma: duplicates handling

Andres Pinzon ▴ 40

@andres-pinzon-3405

Last seen 9.7 years ago

Hi everyone,

If I have duplicate spots on each slide of my experiment, how do I tell limma to handle them? I am using the duplicateCorrelation() function, but after the whole process topTable() reports all of the spots rather than half of them. For instance, if there are 15000 spots overall in my experiment and half of them are duplicates, shouldn't I end up with just 7500 genes?

Thank you for your time.

Best,

--
Andrés Pinzón
http://bioinf.ibun.unal.edu.co/~apinzon/
Bioinformatics Center, Colombia EMBnet node
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961 Fax +571 3165415
Micology and Phytopathology Laboratory - Los Andes University.
http://bioinf.uniandes.edu.co
Tel +571 3394949 ext. 2768

updated 15.0 years ago by Jenny Drnevich • written 15.0 years ago by Andres Pinzon

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 1 hour ago

United States

Hi Andres,

Andres Pinzon wrote:
> I am using duplicateCorrelation() function, but after the whole process
> the topTable() reports not half of the spots but all of them,
> For instance, if there are overall 15000 spots in my experiment, and
> half of them are duplicates, Shouldn't I end up just with 7500 genes?

Nope. When you use duplicateCorrelation() you are telling limma to fit a mixed model that accounts for correlation between duplicate spots. But you are not telling limma to take averages and just report one value.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

written 15.0 years ago by James W. MacDonald

Hi James!

On Wed, Apr 22, 2009 at 8:40 AM, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> Nope. When you use duplicateCorrelation() you are telling limma to fit a
> mixed model that accounts for correlation between duplicate spots. But you
> are not telling limma to take averages and just report one value.

OK, thanks for answering. Any idea how to tell limma to take the average and report just one value? Sorry, I really cannot figure it out.

Best,

Andrés Pinzón

written 15.0 years ago by Andres Pinzon

Hi Andres,

Andres Pinzon wrote:
> Ok, thanks for answering. Any idea on how to tell limma to take
> average and report just one value?
> Sorry I really can not figure it out.

I don't think there are facilities within limma to do this. If this is what you want to do, then you should just compute the averages and then fit the model using the averages.

Best,

Jim
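The suggestion above (compute the averages yourself, then fit the model on them) can be sketched in base R. This is only an illustration under stated assumptions: `average_duplicates` is a hypothetical helper, not a limma function, and it assumes the rows follow the layout that limma's `ndups` and `spacing` arguments describe, with copies of the same probe sitting `spacing` rows apart.

```r
# Hypothetical helper (not part of limma): average within-array duplicate spots.
# Assumes copies of the same probe sit `spacing` rows apart, in consecutive
# groups of ndups * spacing rows.
average_duplicates <- function(M, ndups = 2, spacing = 1) {
  M <- as.matrix(M)
  narrays <- ncol(M)
  stopifnot(nrow(M) %% (ndups * spacing) == 0)
  ngroups <- nrow(M) / (ndups * spacing)
  # View the rows of each array as spacing x ndups x ngroups
  A <- array(M, dim = c(spacing, ndups, ngroups, narrays))
  # Average over the duplicate dimension (dimension 2)
  avg <- apply(A, c(1, 3, 4), mean, na.rm = TRUE)
  matrix(avg, nrow = spacing * ngroups, ncol = narrays,
         dimnames = list(NULL, colnames(M)))
}

# Toy data: 4 spots (2 probes, each printed twice side by side) on 2 arrays
M <- matrix(c(1, 3, 10, 20,
              2, 4, 30, 50),
            nrow = 4, dimnames = list(NULL, c("array1", "array2")))
M_avg <- average_duplicates(M, ndups = 2, spacing = 1)
print(M_avg)  # one row per probe instead of one row per spot
```

The averaged matrix has half as many rows as the original, so a model fitted to it reports one row per probe.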

written 15.0 years ago by James W. MacDonald

Hi Andres and Jim,

Actually, there is a way for limma to account for the duplicates and report only one value per clone in topTable(). Did you follow the Within-Array Replicate Spot example 11.6 in the limmaUsersGuide()? After you use duplicateCorrelation() to calculate the correlation between spots, you have to modify the call to lmFit():

    fit <- lmFit(MA2, design, ndups=2, correlation=corfit$consensus)

Andres, in the future it would help if you posted the code that wasn't working along with the output of sessionInfo(), so we can see exactly what you did and what you need to do.

Cheers,

Jenny

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801 USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
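To make the two-step recipe concrete, here is a minimal sketch on simulated data, assuming the limma package is installed. The data, the two-group design, and the names `nprobes`, `signal`, and `tab` are invented for illustration; only `duplicateCorrelation()`, `lmFit()`, `eBayes()`, and `topTable()` come from limma. The sketch assumes duplicates are printed side by side (`spacing = 1`); a real array layout may need a different `spacing`.

```r
library(limma)

set.seed(1)
nprobes <- 200   # distinct probes
ndups   <- 2     # copies of each probe per array, printed side by side
narrays <- 6

# Simulated log-ratios: a probe-by-array signal shared by both copies of a
# probe (this is what makes duplicates correlated), plus spot-level noise.
signal <- matrix(rnorm(nprobes * narrays), nprobes, narrays)
M <- signal[rep(seq_len(nprobes), each = ndups), ] +
  matrix(rnorm(nprobes * ndups * narrays, sd = 0.5), ncol = narrays)

# Illustrative two-group design (an assumption, not from the thread)
design <- cbind(Intercept = 1, Group = rep(c(0, 1), each = 3))

# Step 1: estimate the consensus correlation between duplicate spots
corfit <- duplicateCorrelation(M, design, ndups = ndups, spacing = 1)

# Step 2: pass ndups, spacing and the correlation on to lmFit()
fit <- lmFit(M, design, ndups = ndups, spacing = 1,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

tab <- topTable(fit, coef = "Group", number = Inf)
nrow(M)    # number of spots on each array
nrow(tab)  # number of rows reported, one per probe
```

Omitting `ndups` and `correlation` in the `lmFit()` call is what leaves every spot as its own row in the results.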

written 15.0 years ago by Jenny Drnevich

Hi Jenny,

Thanks for the advice. This is the part of the code I think is not working:

    biolrep <- c(1,1,2,2,3,3)
    corfit <- duplicateCorrelation(MA, ndups=2, block=biolrep, spacing=8)
    fit <- lmFit(MA, block=biolrep, cor=corfit$consensus)
    fit <- eBayes(fit)

Should I use something like this instead?

    fit <- lmFit(MA, block=biolrep, ndups=2, spacing=8, cor=corfit$consensus)

Best,

Andrés Pinzón

written 15.0 years ago by Andres Pinzon

Jenny Drnevich ★ 2.0k

@jenny-drnevich-2812

Last seen 28 days ago

United States

Hi Andres,

Did you read through the help file for duplicateCorrelation()? You can't use both block and ndups:

"At this time it is not possible to estimate correlations between duplicate spots and between technical replicates simultaneously. If block is not null, then the function will set ndups=1."

If you have a blocking variable and technical replicates, you cannot use limma to adjust for both. What to do in this situation has been discussed on this list many times; see:
https://stat.ethz.ch/pipermail/bioconductor/2008-November/025116.html

HTH,

Jenny
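One way to work within this restriction is to average the duplicate spots first and then estimate only the between-array (block) correlation. The sketch below is a possible reading, not necessarily what the linked list post recommends, since that post is not reproduced here. It assumes the limma package is installed; the simulated data, the intercept-only design, and the side-by-side duplicate layout (`spacing = 1`) are invented for illustration, while `avedups()`, `duplicateCorrelation()`, `lmFit()`, `eBayes()`, and `topTable()` are limma functions.

```r
library(limma)

set.seed(2)
nprobes <- 200
narrays <- 6
biolrep <- c(1, 1, 2, 2, 3, 3)   # arrays 1-2, 3-4, 5-6 are technical replicate pairs

# Simulated log-ratios with adjacent duplicate spots (ndups = 2, spacing = 1);
# arrays from the same biological replicate share a probe-level effect.
bio <- matrix(rnorm(nprobes * 3), nprobes, 3)[, biolrep]
M <- bio[rep(seq_len(nprobes), each = 2), ] +
  matrix(rnorm(nprobes * 2 * narrays, sd = 0.5), ncol = narrays)

# Step 1: collapse duplicate spots to one averaged row per probe
M_avg <- avedups(M, ndups = 2, spacing = 1)

# Step 2: estimate the correlation between technical replicate arrays only
design <- matrix(1, nrow = narrays, ncol = 1,
                 dimnames = list(NULL, "Intercept"))
corfit <- duplicateCorrelation(M_avg, design, ndups = 1, block = biolrep)

# Step 3: fit with the block structure; every row is now a distinct probe
fit <- lmFit(M_avg, design, block = biolrep,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
nrow(topTable(fit, coef = 1, number = Inf))  # one row per probe
```

Averaging discards the information about spot-to-spot variability within an array, so this trades some precision for a model limma can fit with a single correlation.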

written 15.0 years ago by Jenny Drnevich

Hi Jenny,

You are completely right. I'm going through the ?avedups documentation right now. Thank you.

Best,

Andrés Pinzón

written 15.0 years ago by Andres Pinzon