Limma topTable; fold changes look completely different to the normalized data and Limma fold change


john herbert ▴ 560

@john-herbert-4612


Dear all,

I have a problem with the log fold changes calculated in Limma. I am using protein abundance index of proteomic data. The log2 of this data is normally distributed and, after log2, I use quantile normalization.

This is then the data matrix I use as input to Limma:

> class(norm_ctw)
[1] "matrix"
> dim(norm_ctw)
[1] 683 9

design <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3)))
colnames(design) <- c("cam", "tumour", "wound")
fit <- lmFit(norm_ctw, design)

contrast.matrix <- makeContrasts(tumour-wound, tumour-cam, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

topTable(fit2, coef=1, adjust="BH")

Taking one gene as an example: NAMPT in tumour versus wound, calculating the fold change by hand from the normalized data.

> norm_ctw["NAMPT",]
    cam1     cam2     cam3  tumour1  tumour2  tumour3   wound1   wound2   wound3
19.80164 19.46355 19.26075 22.75347 22.62651 22.39521 16.17398 16.60262 16.72368

In Excel, calculating log2 fold change using Average of Tumour/Average of wound:

T1 22.75347   T2 22.62651   T3 22.39521
W1 16.17398   W2 16.60262   W3 16.72368
Tumour average = 22.59173
Wound average = 16.50009333
Log2 Fold change = 0.453320567

However, from topTable:

> topTable(fit2,coef=1)
       ID    logFC  AveExpr        t      P.Value    adj.P.Val         B
431 NAMPT 6.091632 19.53349 20.16810 2.688444e-09 1.750946e-06 11.409857

From topTable, NAMPT has an apparent log2 FC of 6, or a 64-fold change, but that is impossible, right?

Please can someone explain if I am using Limma wrong or how the fold change can be massively different between "by hand" and with Limma.

Thank you very much for any advice.

John.

limma limma • 8.3k views

ADD COMMENT • link 11.8 years ago john herbert ▴ 560


Hi John,

On 7/6/2012 3:12 PM, john herbert wrote:
> In Excel, calculating log2 fold change using Average of Tumour/Average
> of wound =
> T1 22.75347 T2 22.62651 T3 22.39521 W1 16.17398 W2 16.60262 W3 16.72368
> Tumour average = 22.59173
> Wound average = 16.50009333
> Log2 Fold change = 0.453320567

Wait a minute... Are these data logged or not? You say above that you take logs and then normalize, and then you present some data that would be really big if they were log2 variates (but then I have no idea of the scale for protein abundance data).

Anyway, you are acting like these data are not logged, whereas limma assumes they are. So you either have to take logs before feeding into limma, or you need to compute the fold change by subtraction (if the data above are already logged).

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
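To make the two calculations concrete, here is a minimal R sketch using the NAMPT values quoted from the original post. It assumes the values are already on the log2 scale, as the poster describes; the variable names are made up for illustration and are not part of limma.

```r
# NAMPT values copied from the post (assumed to be log2, quantile-normalized)
tumour <- c(22.75347, 22.62651, 22.39521)
wound  <- c(16.17398, 16.60262, 16.72368)

# On the log2 scale, a log fold change is a difference of group means
logfc_by_subtraction <- mean(tumour) - mean(wound)

# The spreadsheet calculation instead took log2 of the ratio of the
# (already logged) means, which is not a log fold change
log2_of_ratio_of_logs <- log2(mean(tumour) / mean(wound))

print(round(logfc_by_subtraction, 4))   # about 6.09, in line with topTable's logFC
print(round(log2_of_ratio_of_logs, 4))  # about 0.45, the "by hand" number in the post
```

The small difference in the last decimals from the topTable value is expected, since the printed data are rounded.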

ADD REPLY • link 11.8 years ago James W. MacDonald 65k


Thanks a lot James,

Yes, the raw PAI values are very big and I am feeding Limma log2 and normalized values.

So if I have a log2 value of 22.59173 for tumour and a log2 value of 16.50009333 for wound, subtracting tumour - wound = 6.09 (the same number topTable comes up with).

What is this 6.09 value: is that fold change or log2 fold change?

I would guess fold change, as 2 to the power of 6 = 64 fold change, but topTable labels it as logFC; please explain why?

Thank you,

John.
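As a quick check on the question above, the sketch below converts the logFC value from the post's topTable output back to a linear fold change. It assumes, as the rest of the thread indicates, that the input matrix was on the log2 scale, so logFC is a log2 fold change; the variable names are illustrative only.

```r
# logFC for NAMPT as printed by topTable in the post
log_fc <- 6.091632

# Back-transform a log2 fold change to a linear fold change
fold_change <- 2^log_fc

print(round(fold_change, 1))  # roughly 68-fold; "64" comes from rounding logFC down to 6
```

So 6.09 is the value on the log2 scale and the fold change itself is 2 raised to that value.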

ADD REPLY • link 11.8 years ago john herbert ▴ 560


OK, I solved it using raw values and see 6.09 is log2 FC.

Thanks.

John.
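For a single protein, the coefficients that limma's `lmFit` estimates are ordinary least-squares fits, so the contrast can be reproduced with base R's `lm`. The sketch below does this for NAMPT with the same no-intercept, one-coefficient-per-group parameterisation as the design in the post. It only reproduces the logFC; the moderated t-statistic and p-values from `eBayes` use variance information shared across all proteins and would not match a per-protein `lm`. The object names are illustrative.

```r
# NAMPT values copied from the post (assumed log2, quantile-normalized)
y <- c(19.80164, 19.46355, 19.26075,   # cam
       22.75347, 22.62651, 22.39521,   # tumour
       16.17398, 16.60262, 16.72368)   # wound
group <- factor(rep(c("cam", "tumour", "wound"), each = 3))

# One coefficient per group and no intercept, as in model.matrix(~ 0 + factor(...))
fit <- lm(y ~ 0 + group)
b <- coef(fit)  # named groupcam, grouptumour, groupwound

# The contrasts from makeContrasts(tumour-wound, tumour-cam, ...) are
# differences between these group coefficients
logfc_tumour_vs_wound <- unname(b["grouptumour"] - b["groupwound"])
logfc_tumour_vs_cam   <- unname(b["grouptumour"] - b["groupcam"])

print(round(logfc_tumour_vs_wound, 3))
print(round(logfc_tumour_vs_cam, 3))
```

With a design like this, each coefficient is simply the group mean, which is why subtracting the averages by hand gives the same number as topTable.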

ADD REPLY • link 11.8 years ago john herbert ▴ 560


Yep. You have to remember that log2(this/that) = log2(this) - log2(that), so if you are in the log space you have to subtract to compute what would be division on the natural scale.

Best,

Jim
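The identity can be checked on the raw scale as well. The sketch below back-transforms the NAMPT values from the post (assumed to be log2) and shows that the difference of mean log2 values equals the log2 ratio of geometric means of the raw values, while the ratio of arithmetic means of the raw values gives a slightly different number. The helper `geo_mean` is a hypothetical name defined here for illustration.

```r
# NAMPT values copied from the post (assumed log2 scale)
tumour_log2 <- c(22.75347, 22.62651, 22.39521)
wound_log2  <- c(16.17398, 16.60262, 16.72368)

# Back-transform to the raw (unlogged) scale
tumour_raw <- 2^tumour_log2
wound_raw  <- 2^wound_log2

# Hypothetical helper: geometric mean of positive values
geo_mean <- function(x) exp(mean(log(x)))

diff_of_log_means  <- mean(tumour_log2) - mean(wound_log2)
log2_geo_ratio     <- log2(geo_mean(tumour_raw) / geo_mean(wound_raw))
log2_arith_ratio   <- log2(mean(tumour_raw) / mean(wound_raw))

print(all.equal(diff_of_log_means, log2_geo_ratio))  # identical up to floating point
print(round(log2_arith_ratio, 3))                    # close to, but not the same as, the logFC
```

This is one reason a fold change computed from raw averages in a spreadsheet may differ a little from limma's logFC even when the logs are handled correctly.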

ADD REPLY • link 11.8 years ago James W. MacDonald 65k
