Question

Limma question


Niccolò Bassani ▴ 30

@niccolo-bassani-3933

Dear users,

I'm having some trouble figuring out what's going on in limma. I have expression data from the Agilent microRNA platform; I've pre-processed them and wanted to do a simple differential expression analysis. Out of 1368 miRNAs (no filtering performed), 758 show EXACTLY the same value on all 24 arrays involved. The arrays are divided into 3 groups, with 8 arrays in each group.

The data look like this (in matrix form, first rows and columns):

            LN9      LN10      LN11      LN12      LN13      LN14
    1 12.431022 12.186179 13.136163 12.121403 12.643895 12.756163
    2  1.137504  1.137504  1.137504  1.137504  1.137504  1.137504
    3  1.137504  1.137504  1.137504  1.137504  1.137504  1.137504
    4  1.137504  1.137504  1.137504  1.137504  1.137504  1.137504
    5  1.137504  1.137504  1.137504  1.137504  1.137504  1.137504
    6  1.137504  1.137504  1.137504  1.137504  1.137504  1.137504

I specify the design matrix and run simple differential expression code:

    contrasts = cbind(AvsB = c(-1,1,0), AvsC = c(1,0,-1),
                      AvsB_C = c(1,-1/2,-1/2), A_BvsC = c(1/2,1/2,-1))
    contrasts
         AvsB AvsC AvsB_C A_BvsC
    [1,]   -1    1    1.0    0.5
    [2,]    1    0   -0.5    0.5
    [3,]    0   -1   -0.5   -1.0

    fit = lmFit(agilent,design)
    fit.contrasts = contrasts.fit(fit,contrasts)
    test = eBayes(fit.contrasts)

The strange (or absurd) thing is that the invariant microRNAs appear to be differentially expressed in all of the contrasts except the last one!

    test
    $p.value
               AvsB       AvsC     AvsB_C    A_BvsC
    [1,] 0.53958575 0.42970445 0.41866547 0.5748925
    [2,] 0.03471306 0.03471306 0.01644463 1.0000000
    [3,] 0.03471306 0.03471306 0.01644463 1.0000000
    [4,] 0.03471306 0.03471306 0.01644463 1.0000000
    [5,] 1.00000000 0.23359101 0.48667557 0.1713666
    1363 more rows ...

I've drilled into the code of the various limma functions, but it seems that there's some problem with my data, maybe some kind of approximation... My point is that the last contrast correctly identifies no microRNA as differentially expressed, whereas the remaining 3 return t-statistics that are non-zero for invariant miRNAs!

    $t
                 AvsB       AvsC     AvsB_C     A_BvsC
    [1,] 6.236028e-01 -0.8051982 -0.8249186 -0.5697255
    [2,] 2.257614e+00 -2.2576137 -2.6068677  0.0000000
    [3,] 2.257614e+00 -2.2576137 -2.6068677  0.0000000
    [4,] 2.257614e+00 -2.2576137 -2.6068677  0.0000000
    [5,] 1.588357e-14 -1.2263878 -0.7080553 -1.4161107
    1363 more rows ...

Any suggestions? I've tried rounding the dataset to 4 digits, but the problem is still there; the only thing that changes is which contrast shows the consistently non-differentially-expressed genes...

Thanks, and merry Xmas everybody (I know it's early, but who knows what will be next...)

Niccolò

limma microRNA • 2.1k views

updated 14.1 years ago by boczniak767 ▴ 740 • written 14.1 years ago by Niccolò Bassani ▴ 30
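R prints numbers to about 7 significant digits by default, so rows that look identical on screen can still differ further along. A minimal base-R sketch of checking that directly is shown here. The 3 x 24 matrix is a made-up stand-in for the poster's `agilent` object, and the 1e-9 perturbation is an assumption for illustration only, not something known about the real data.

```r
# Hypothetical stand-in for the poster's 1368 x 24 matrix `agilent`.
set.seed(1)
agilent <- rbind(
  rnorm(24, mean = 12.5, sd = 0.4),                       # a miRNA that varies
  rep(1.137504, 24),                                      # exactly constant
  rep(1.137504, 24) + rep(c(0, 1e-9), times = c(8, 16))   # differs beyond the printed digits
)
colnames(agilent) <- paste0("LN", 1:24)

# Spread (max - min) of every row, computed at full double precision.
row_spread <- apply(agilent, 1, function(x) diff(range(x)))
exactly_constant <- row_spread == 0

# Compare the default display with a full-precision display
# for one array from the first group and one from the second.
print(agilent[2:3, c(1, 9)])
print(agilent[2:3, c(1, 9)], digits = 17)

# How many rows are truly invariant?
print(table(exactly_constant))
```

On real data, `row_spread` would be computed on the exact matrix passed to `lmFit`, across all 24 columns rather than only the first few that were displayed.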

score 0 · Answer 1 · 2011-12-19


Gordon Smyth 53k

@gordon-smyth

WEHI, Melbourne, Australia

Dear Niccolo,

I have to tell you that what you claim to have observed is not possible. If the normalized intensities were all equal, then limma would produce t-stat=0 and p-value=0 for any contrast between conditions. So it would seem that you've made a mistake somewhere in collating results.

Your email does not contain complete code, so there isn't any way for me to help you find the error.

Best wishes
Gordon

> Date: Fri, 16 Dec 2011 16:58:55 +0100
> From: Niccolò Bassani <biostatistica at gmail.com>
> To: <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] Limma question

written 14.1 years ago by Gordon Smyth 53k


Oops, correcting a typo: if the normalized intensities were all equal for any given miRNA, limma would produce t-stat=0 and p-value=1.

Gordon

written 14.1 years ago by Gordon Smyth 53k
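The corrected statement (t-stat=0, p-value=1 for an invariant miRNA) can be illustrated with a small base-R sketch. It assumes a cell-means design for 3 groups of 8 arrays, since the original post does not show `design`, and it uses simulated data. Ordinary least squares via `lm.fit` stands in for the per-row linear model fit; limma's moderated standard errors are not reproduced here.

```r
# Assumed layout: 3 groups (A, B, C) of 8 arrays each, cell-means design.
group  <- factor(rep(c("A", "B", "C"), each = 8))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Contrast matrix as given in the question; rows follow the columns of `design`.
cm <- cbind(AvsB   = c(-1, 1, 0),
            AvsC   = c(1, 0, -1),
            AvsB_C = c(1, -1/2, -1/2),
            A_BvsC = c(1/2, 1/2, -1))

# Simulated data: one varying row and one invariant row (hypothetical values).
set.seed(42)
y <- rbind(varying  = rnorm(24, mean = 12.5, sd = 0.4),
           constant = rep(1.137504, 24))

# Least-squares group means for each row, then the contrast estimates.
coefs <- t(apply(y, 1, function(v) lm.fit(design, v)$coefficients))
est   <- coefs %*% cm
print(signif(est, 3))

# A zero estimate divided by any positive standard error gives t = 0,
# and the two-sided p-value for t = 0 is 1 (df = 24 - 3 = 21 in this layout).
p_for_zero_t <- 2 * pt(-abs(0), df = 21)
print(p_for_zero_t)
```

Every column of `cm` sums to zero, so for the invariant row the three equal group means cancel in all four contrasts, up to floating-point rounding.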


Actually, this happens only for one of the investigated contrasts. I know that's strange, and I'm aware that some mistake must have happened when collecting and/or importing the data, but it seemed quite strange because even when "viewing" the data I could see no difference at all between the intensities. I'll try to figure out what's happened with this crazy dataset and I'll let you know. By the way, thanks for the answer!

written 14.1 years ago by Niccolò Bassani ▴ 30
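Because the thread never shows how `design` was built, one generic thing worth ruling out when hunting for such a mistake is a mismatch between the contrast weights and the parameterization of the design matrix. The base-R sketch below is only an illustration under assumed group labels; it is not a diagnosis of this dataset. It applies the same `AvsB` weights to an invariant row under a cell-means design and under an intercept (treatment) design.

```r
# Assumed layout: 3 groups of 8 arrays; an invariant row as in the question.
group   <- factor(rep(c("A", "B", "C"), each = 8))
y_const <- rep(1.137504, 24)

# Two common parameterizations of the same one-way layout.
design_means <- model.matrix(~ 0 + group)   # columns: mean of A, mean of B, mean of C
design_trt   <- model.matrix(~ group)       # columns: intercept (A), B - A, C - A

# Weights written with the cell-means columns in mind.
AvsB <- c(-1, 1, 0)

b_means <- lm.fit(design_means, y_const)$coefficients
b_trt   <- lm.fit(design_trt,   y_const)$coefficients

est_means <- sum(AvsB * b_means)   # B mean minus A mean: zero up to rounding
est_trt   <- sum(AvsB * b_trt)     # (B - A) minus the intercept: not a group difference

print(c(cell_means = est_means, treatment = est_trt))
```

With the treatment design the same weights subtract the intercept, so an invariant row gets an estimate of about minus its own value rather than zero. Naming the design columns and building contrasts by name makes this kind of mismatch easier to spot.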

score 0 · Answer 2 · 2011-12-21

Hello limma people,

I use limma's separate-channel analysis for two-color data, but there are some small problems:

1) While my differential expression analysis is fine after background correction ( backgroundCorrect(...) ), the following error occurs when this step is omitted:

    Error in intraspotCorrelation(MA2, design) :
      Missing or infinite values found in M or A

So MA$A, $G include NA values, but I don't understand why they appear or how to deal with them to avoid this error. My raw data don't include NA values.

2) I need the normalized intensities of my microarray results, so that they can be exported to other programs. Is the following formula a correct way to extract log intensities from MA data?

    logR <- MA2$A + (0.5*MA2$M) # red = Cy5
    logG <- MA2$A - (0.5*MA2$M) # green = Cy3

Thanks for your previous help,
Assaf
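The formula in point 2 can be checked numerically. limma defines M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2, so A + M/2 and A - M/2 should return the log2 intensities of the two channels. The sketch below uses made-up intensities (the values of `R` and `G` are assumptions) and base R only.

```r
# Hypothetical red (Cy5) and green (Cy3) intensities for four spots.
R <- c(1200, 350, 80, 15000)
G <- c(600, 700, 80, 9000)

# M and A as defined for two-colour data.
M <- log2(R) - log2(G)
A <- (log2(R) + log2(G)) / 2

# The formulas from the question.
logR <- A + 0.5 * M   # red = Cy5
logG <- A - 0.5 * M   # green = Cy3

# Both channels are recovered up to floating-point rounding.
print(all.equal(logR, log2(R)))
print(all.equal(logG, log2(G)))
```

Applied to a normalized MAList, the same arithmetic yields normalized log2 intensities, since M and A there are already the normalized values.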

score 0 · Answer 3 · 2011-12-28

Hi Assaf,

> I use limma's separate-channel analysis for two-color data, but there
> are some small problems:
>
> 1) While my differential expression analysis is fine after background
> correction ( backgroundCorrect(...) ), the following error occurs when
> this step is omitted:
>
> Error in intraspotCorrelation(MA2, design) :
>   Missing or infinite values found in M or A
>
> So MA$A, $G include NA values, but I don't understand why they appear
> or how to deal with them to avoid this error. My raw data don't
> include NA values.

If you use an MAList object, the missing values are those generated from below-zero fluorescence values (when the background is higher than the spot fluorescence). You wrote about "MA$A, $G"; actually, an RGList (the raw data created by read.maimages, with R and G fluorescence) has G but no A, and the opposite is true for an MAList.

One option is to use one of the missing-value estimation approaches (I've used JBPCAfill, which is not in R, but there are several methods). Alternatively, you can background-correct your data with the "normexp" method and an "offset" in the backgroundCorrect command; consult the limma User's Guide.

> 2) I need the normalized intensities of my microarray results, so that
> they can be exported to other programs. Is the following formula a
> correct way to extract log intensities from MA data?
>
> logR <- MA2$A + (0.5*MA2$M) # red = Cy5
> logG <- MA2$A - (0.5*MA2$M) # green = Cy3

Consult the RG.MA and MA.RG commands in limma's command guide.

HTH,
Maciej Jończyk

--
Maciej Jonczyk,
Department of Plant Molecular Ecophysiology
Faculty of Biology, University of Warsaw
02-096 Warsaw, Miecznikowa 1
Poland
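The origin of the NA values described in the reply can be sketched in base R. The intensities below are made up, and the step that sets non-positive values to NA is an assumption meant to mimic what happens when a plain background subtraction is followed by a log2 transform; it is not limma's actual code.

```r
# Hypothetical foreground and background intensities for four spots.
fg <- c(500, 120, 60, 2000)
bg <- c(100, 130, 60, 150)

# The raw foreground has no missing values and can always be logged.
log_raw <- log2(fg)

# Plain subtraction: spots where background >= foreground become <= 0.
subtracted <- fg - bg
print(subtracted)

# Non-positive intensities have no finite log, so they end up missing
# (assumed here to be stored as NA before the transform).
subtracted[subtracted <= 0] <- NA
log_sub <- log2(subtracted)
print(log_sub)

# These are the spots that would later trigger
# "Missing or infinite values found in M or A".
print(which(is.na(log_sub)))
```

With limma, the alternative suggested in the reply would be a call along the lines of `backgroundCorrect(RG, method = "normexp", offset = 50)` on the RGList before normalization; the offset value of 50 is an assumption to be checked against the limma User's Guide.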