Question: Detection p-value: difference between limma and lumi packages
0
3.2 years ago by
Singapore
Eleni Christodoulou150 wrote:

Hello community,

I am currently trying to fit a linear model to Illumina BeadChp HT12.v4 (genome-wide ~ 48,000 probes) data for detection of differentially expressed genes. I have tried two ways of reading and normalizing the data:

i) limma  and neqc()  and ii) lumi and log2-transformation, quantile normalization

In both cases I selected only the truly expressed probes; the ones which have detection p-value > 2 (I have 48 samples in total). I get totally different numbers of genes using the embedded functions within each package. With detectionCall(lumiBatch_object) from lumi, I get ~ 21,000 probes as significant. With normalized.data$other$Detection from limma, I stay with ~3,800 probes!

This is a big difference for two functions that are supposed to do the same thing! If anyone has an explanation of this output, I would appreciate sharing it with me.

Thank you very much.

Eleni

p-value detection • 2.0k views
modified 3.2 years ago by Gordon Smyth37k • written 3.2 years ago by Eleni Christodoulou150
Answer: Detection p-value: difference between limma and lumi packages
2
3.2 years ago by
Gordon Smyth37k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth37k wrote:

Dear Eleni,

There is no difference between the detection calls in the lumi and limma packages. They both simply read the the detection p-values from the Illumina output. If you code correctly, you will get the same number of expressed probes whichever package you use to normalize.

Your limma filtering in particular seems to be incorrect, but you don't give any relevant code to show how you have filtered, so I have no means to diagnose what the problem may be.

Using limma, what output do you get from the following?

x <- read.ilmn('Sample_Probe_Profile.txt', 'Control_Probe_Profile.txt')
propexpr(x)

Also have you checked whether the 'Detection' column from the Illumina arrays actually holds detection p-values? Sometimes it is 1-p instead of p.

Best wishes
Gordon

Dear Gordon,

Thank you very much for your response and your suggestions. Here is the output of the suggested code

> x <- read.ilmn('Sample_Probe_Profile.txt', 'Control_Probe_Profile.txt')
> propexpr(x)
CL75_G3.1    CL75_G3.2    CL75_G3.3    CL75_G5.1    CL75_G5.2    CL75_G5.3    CL86_G2.1   CL131_G3.1   CL230_G2.1
0.5123857    0.5145105    0.5116729    0.5338369    0.5071625    0.5180329    0.4912954    0.5299103    0.4964385
CL230_G4.1   CL230_G4.2   CL230_G4.3    CL86_G3.1   CL131_G4.1   CL230_G3.1   CL230_G3.2   CL230_G3.3   CL230_G5.1
0.5406947    0.5142014    0.4949713    0.5103329    0.5145715    0.5194221    0.4834590    0.4992756    0.5067751
CL75_G1.2.1    CL86_G4.1 CL131_G1.2.1   CL131_G5.1    CL75_G2.1    CL75_G2.2    CL75_G2.3    CL75_G4.1    CL75_G4.2
0.5059741    0.5044861    0.4773444    0.4759903    0.5048583    0.5083045    0.5141312    0.5433061    0.5011829
CL75_G4.3    CL86_G5.1    CL86_G5.2    CL86_G5.3   CL131_G2.1   CL131_G2.2   CL131_G2.3  CL86_G1.2.1 CL230_G1.2.1
0.5053503    0.4827795    0.5196267    0.5048506    0.4793248    0.4968468    0.4794271    0.4932667    0.5198075
CL75_G1.1.1  CL75_G1.1.2  CL75_G1.1.3  CL86_G1.1.1  CL86_G1.1.2  CL86_G1.1.3 CL131_G1.1.1 CL131_G1.1.2 CL131_G1.1.3
0.5417713    0.5396356    0.5185968    0.5101668    0.5058440    0.5297386    0.5305564    0.5052776    0.5103776
CL230_G1.1.1 CL230_G1.1.2 CL230_G1.1.3
0.5227602    0.5309030    0.5123390 

Moreover, I tested the "Detection" column of the Illumina arrays and the values are from 0 to 1...The column is called 'Detection'. How can I understand if this is p-value or 1-p.value?

I also attach the output of summary of my illumina array Lumi object, including the control data:

> summary(dataset.with.control)
Summary of data information:
Data File Information:
Illumina Inc. GenomeStudio version 1.9.0
Normalization = none
Array Content = HumanHT-12_V4_0_R2_15002873_B.BGX.xml
Error Model = none
DateTime = 1/25/2016 9:25 AM
Local Settings = en-US

Major Operation History:
submitted            finished                                                              command lumiVersion
1 2016-01-27 13:03:21 2016-01-27 13:03:36                           lumiR("GefR_All_Sample_Probe_Profile.txt")      2.22.0
2 2016-01-27 13:03:36 2016-01-27 13:03:38 lumiQ(x.lumi = x.lumi, detectionTh = detectionTh, verbose = verbose)      2.22.0

Object Information:
LumiBatch (storageMode: lockedEnvironment)
assayData: 47304 features, 48 samples
element names: beadNum, detection, exprs, se.exprs
protocolData: none
phenoData
sampleNames: CL75_G3.1 CL75_G3.2 ... CL230_G1.1.3 (48 total)
varLabels: sampleID
featureData
featureNames: 6450255 2570615 ... 4120753 (47304 total)
fvarLabels: ProbeID TargetID
experimentData: use 'experimentData(object)'
Annotation:
Control Data: Available
QC information: Please run summary(x, 'QC') for details!

Finally, why do you say that my limma filtering is incorrect? Do you see any error in the code that I posted?

Thank you very much for your feedback.

Best wishes,

Eleni

1

limma is telling you that 48-54% of the probes are expressed in each sample, so you can't possibly end up with only 4000 out of 48,000 probes.

Dear Gordon,

Yes! That was it! I now end u with exactly the same number of expressed probes between the methods. Thank you so much!

I also found this small number of probes weird...but how did you understand that these are exceedance values and not p-values?

Thank you very much for your help.

Best wishes,

Eleni

Moreover, here is my code for getting the detection p-value with limma:

raw.data <- read.ilmn(files="GefR_All_Sample_Probe_Profile.txt", ctrlfiles="Control_Probe_Profile.txt", other.columns="Detection")                                                               norm.data <- neqc(raw.data)
> dim(norm.data)
[1] 47304    48
expressed <- rowSums(norm.data$other$Detection < 0.05) >= 3
norm.data <- norm.data[expressed,]

> dim(norm.data)
[1] 3806   48

...and for getting the detection p-values with lumi:

dataset <- lumiR('GefR_All_Sample_Probe_Profile.txt')

# Keep only the truly detected probes
presentCountDef <- detectionCall(dataset.with.control)  #default parameters: #type='probe', threshold=0.01
> length(which(presentCountDef>2))
[1] 21515

I even set the largest threshold of p.value=0.05 in limma's way of pre-processing! And the results are so different...I guess I may be doing something really basic in the wrong way...

Thank you very much for your time and feedback.

Best wishes,

Eleni

1

I suspect that you need

expressed <- rowSums(norm.data$other$Detection > 0.95) >= 3

because the Detection scores are actually exceedence values rather than p-values.

Dear Gordon,

Yes! That was it! I now end u with exactly the same number of expressed probes between the methods. Thank you so much! I also found this small number of probes weird...but how did you understand that these are exceedance values and not p-values?

Thank you very much for your help.

Best wishes,

Eleni

Dear Gordon,

please sorry to interfere on this matter, but usually as illustrated in the limma vignette, the detection column holds p-values right ? and something similar with > expressed <- rowSums(y$other$Detection < 0.05) >= cutoff (or < 0.01 for more strigent purposes) would be fine, correct ?

2

read.ilmn() simply reads what columns are provided in the Illumina output. Illumina's Genome Studio software sometimes writes p-values (p) and sometimes writes detection scores (1-p), and one cannot determine which has been done from the name of the column. Functions like neqc() always do the right thing because they check internally whether the detection scores are positively or negatively correlated with expression values. However the y$other$Detection values are just the raw values provided by Illumina, so you have to check yourself if you these values for filtering.

It is very easy to check. Just look at a few expression values like x$E[1:,5,1] and the corresponding detection values x$other$Dection[1:5,1], and it will be obvious whether Detection increases with E or decreases with E. In any case, it will always become obvious very soon from the resulting AveExpr values if you accidently select lowly expressed genes instead of highly expressed. ADD REPLYlink modified 3.2 years ago • written 3.2 years ago by Gordon Smyth37k 1 So in simple words if are the detection p-values, they will "decrease" (more significant) with the increase of expression in each sample---Many many thanks Gordon !! ADD REPLYlink written 3.2 years ago by svlachavas650 Thank you very much, it is clear now. ADD REPLYlink written 3.2 years ago by Eleni Christodoulou150 Answer: Detection p-value: difference between limma and lumi packages 1 3.2 years ago by svlachavas650 Greece/Athens/National Hellenic Research Foundation svlachavas650 wrote: Dear Eleni, because your description is a bit confusing--thus i will try to give a naive first guess---by default, the detectionCall() function of lumi uses a threshold of detection p val < 0.01---after normalizing with limma, you used the same detection threshold ? For istance, call y your normalized expressionSet, then you used something like y$other$Detection < 0.01 ? or something different ? Nevertheless, it does not make sense at this point of analysis, as you could only expect to have a different number of genes using these two methods, while choosing different cutoffs, as described in the following paper-comparison of the methodologies: http://nar.oxfordjournals.org/content/38/22/e204 Best, Efstathios ADD COMMENTlink written 3.2 years ago by svlachavas650 Thank you Efstathie for your prompt response. Yes, I use the same threshold for detection p.value, which I set to 0.01. So you say that this difference in truly expressed probes is normal? I know that some differences are expected (I have read the paper that you suggest and thank you about that) but such a big difference in numbers!?! Well...the truth is that using both methods, the final set of DE genes between my conditions of interest is very small...and this is what matters at the end ;) Thanks a lot! Eleni ADD REPLYlink written 3.2 years ago by Eleni Christodoulou150 1 Well, i misinterpret a bit your initial question---detection p val should not matter as is independent and the same, regardless of any downstream process and methodology. On the other hand, if i have understand well(and please correct me if im wrong), you state that after statistical inference, the number of differentially expressed probesets that are identified as significant, is far more bigger with lumi than limma ?--but using again limma for differential expression analysis, right-- ? This sounds very weard, but i will try naively to focus on two important parts : 1. If my above notion is correct, which cutoffs you used for statistical significance after testing ? only for instance adjusted p value, or both a log fold change cutoff ? 2. Secondly, to continue from my first question---there is a specific point in the above paper: ..."Our experience in these applied studies is that neqc, neq and vst give roughly similar numbers of DE genes on the basis of P-value alone, but neqc yields fold changes 3–4 times as large as those from logq and nearly five times as large as vstq or vstr. Therefore, neqc yields more DE genes when both fold change and P-value thresholds are used. Limited experience with PCR validation suggests that the larger fold changes returned by neqc are closer to the truth, agreeing with the results reported here with spike-in data sets"..... 3. My final point, is why did you not used the default methods of the lumi package-lumiExpresso() function, like the vst transformation and the rsn normalization ? I don't mean that is wrong ( maybe other people with more experience can adress this matter), but it would be more informative to post exactly which arguments you used with the lumi package ? From my personal experience, when i have used lumi with the default arguments from lumiExpresso() function, and neqc() from limma, i resulted with both cutoffs considered(p-value and fold change) with a bigger number of DE genes with the neqc preprossesing step. Best, Efstathios ADD REPLYlink modified 3.2 years ago • written 3.2 years ago by svlachavas650 Dear Efstathie, Thank you for your detailed insight into my problem. You are right, with lumi pre-processing followed by limma differential expression analysis I do get more DE genes. • lumi preprocessing + limma model fit (eBayes and topTable) ~ 60 DE genes • limma preprocessing neqc + limma model fit (eBayes and topTable) ~ 15 DE genes In both cases I filter both with log fold change and p-value criteria, which I set to 1 and 0.05 respectively. Here is the beginning of my script, where I load the data and the control probe profile and then background-correct them. dataset <- lumiR('Sample_Probe_Profile.txt') dataset.with.control <- addControlData2lumi('Control_Probe_Profile.txt', dataset) ## bg correction and log2-transformation datasetB<-lumiB(dataset.with.control,method='forcePositive') bg.matrix <- exprs(datasetB) #I make expression set of the sample data I extract the exprs component of the bg-corrected dataset and I quantile-normalize between arrays that correspond to technical replicates. I had tried lumiExpresso() and lumiN in the past, but I noticed that the final intensities of my data were 'forced' to follow the same line. After discussing with some colleagues here, we thought that this is 'imposing' certain behavior to the data that may miss a true biological difference. Thus, I decided to do a smoother normalization by quantile-normalizing across the same type of cells only and then combining the results to a matrix: quant.matrix I then follow up with log2 transformation of the quant.matrix. Thank you very much for any insight on the above! Best, Eleni ADD REPLYlink modified 3.2 years ago • written 3.2 years ago by Eleni Christodoulou150 1 Actually i don't totally agree with the biological conclusions you have mentioned above(just my personal opinion, as i have not analyzed the data and i don't have also any plots to inspect)--keep in mind that also quantile normalization "forces" the various distributions of the data to be the same, with also some assumptions about the data. Moreover, why did you use in the lumiB function the argument method="forcePositive" and not the default ? did you not have raw data(that is summarized data) but already background corrected, with many zero-negative values ? Personally i would stick with neqc and not combining "different" methodologies(maybe your implementation is correct, but anyway you could use the lumiExpresso with both log2 and quantile normalization). Moreover, maybe i would perform a more strict filtering and also use the argument trend=TRUE in the eBayes() function--btw what filtering based on the detection values you performed-- ? ADD REPLYlink modified 3.2 years ago • written 3.2 years ago by svlachavas650 Answer: Detection p-value: difference between limma and lumi packages 0 3.2 years ago by Singapore Eleni Christodoulou150 wrote: Dear Efstathie, In response to your questions: 1) Regarding lumiB, the default method is 'none', which assumes that the BeadStudio output data is background corrected. So by default, it will do nothing. I did not want that because I did not background correct my data in Genome Studio. I tried using the bgAdjust method but this resulted in some negative values that could not be handled by the following log2 transformation. Thus, I ended up using the 'forcePositive' method, which prepares its output for log2 transformation. 2) Regarding my filtering criteria at the linear model fit step (both after neqc and after lumi preprocessing) I used: ct<-factor(targets$T790M)  #the mutation I am interested in - I have named the columns of my matrix accordingly
design <- model.matrix(~0+ct+targets$Batch) colnames(design) <- c(levels(ct), 'batch') arrayw <- arrayWeights(gefR, design) dupcor <- duplicateCorrelation(gefR, design, block=targets$Sample, weights=arrayw) #correlation between the samples

fit <- lmFit(gefR, design, block=targets$Sample, correlation=dupcor$consensus.correlation, weights=arrayw)
contrasts <- makeContrasts('pos-neg',levels=design)  #I am interested in the contrast: mutation-positive versus mutation-negative
fit2 <- contrasts.fit(fit, contrasts)
fit2 <- eBayes(fit2, robust=TRUE, trend=TRUE) #I added the trend=TRUE after your suggestion
top.tab = topTable(fit2, adjust='fdr', number=Inf, p.value=0.05) #

At the last step: After neqc, if I additionally put lfc=1, I won't get any significant genes. Using just p-value criteria, I get 20 DE genes. After lumi preprocessing, on the contrary, I am getting 60 DE genes when I apply both p-value and lfc criteria.

My concern (and the source of all these differences) is the big difference in the truly expressed genes I remain with, after each p-value detection filtering. As I wrote at the original post, when using  y$other$Detection < 0.01 I get ~4,000 truly expressed probes, whereas when I use detectionCall() function of lumi I get ~21,000 truly expressed probes. So, much larger a dataset to work with and, obviously, more DE genes detected at the end...

Thank you very much!

Dear Eleni,

yes in your first question there was a misunderstanding-- i meant that for lumiExpresso the default background parameter is "bgAdjust".

Secondly, i think in your above code you missed a part-function to convert your array Weights into a matrix of weights to feed it into the duplicateCorrelation function:

arrayw <- arrayWeights(gefR, design)

w <- asMatrixWeights(arrayw, dim(gefR))

Finally, regarding the other part it maybe be beneficial to give us a small output of the Detection column, in order to see if are indeed detection p-values, or 1-p like Gordon suggested

Best,

Efstathios

1

Dear Efstathie,
Thanks for the comment about arrayw. You are right, I have missed this conversion, but the script worked right (no warning). However, I checked your suggestion and indeed, the $consensus.correlation filed was slightly different: Without conversion to matrix: 0.711 With conversion to matrix: 0.6807601 The final differentially expressed genes, however, were increased just by 2 (when I use combined p.value and logFC criteria for filtering). So yes, it makes a difference, but almost negligible. And here is part of my Detection column of my Sample_Probe_Profile (first few samples and first 4 rows) > head(norm.data$other\$Detection)
CL75_G3.1 CL75_G3.2 CL75_G3.3 CL75_G5.1 CL75_G5.2 CL75_G5.3 CL86_G2.1 CL131_G3.1 CL230_G2.1 CL230_G4.1
0.53506   0.63507   0.58571   0.74935   0.68571   0.82338   0.55065    0.70260    0.60000    0.72078
0.99870   0.99610   0.99870   0.97792   0.98961   0.99870   0.98442    0.91429    0.99870    0.98442
0.48831   0.73636   0.40519   0.28052   0.60130   0.19870   0.65584    0.61818    0.33377    0.41429
0.64545   0.58182   0.46623   0.65195   0.43247   0.61818   0.18052    0.44805    0.26234    0.05584

I also did not understand why these are exceedance values...

Best wishes,

Eleni

Dear Eleni,

thank you also for your feedback !! Regarding your last part, i also posted a common above to Gordon--so as the specialist he could provide us some suggestions-i just naively suspect(without being certain) that maybe if you could see your whole matrix of these values, maybe if there are a lot of very high values, near to ~0.9 or something similar, that could be the case of 1-p--but still again i'm not certain of this matter, because i also heard this first time