Differential protein/biomarker expression using limma: is that possible?
@d1feef57
Rijnsburg

Hey there! Hope everyone is doing well :)

I have a question about using the limma package for data that is neither RNA-seq nor microarray. I have a dataset of protein/biomarker quantifications (around 365 proteins) and I would like to get log-fold changes (i.e. a differential protein expression analysis) for my conditions of interest. However, the measurement technique used for my dataset (Proximity Extension Assay technology) does not provide absolute quantification, but normalized protein expression (NPX). NPX is an arbitrary unit on a log2 scale. These normalized expression values are normally distributed and highly correlated with absolute protein quantification (Spearman's correlation can reach up to 0.85 for the same proteins).

My understanding is that, since my data are normalized quantification/expression values, similar to normalized microarray intensity values, they can be analyzed using the same pipelines as microarrays. However, I could not find any mention in the limma user guide of whether limma can also be used for this kind of data.

My questions are:

1) Is it possible to use limma to find differentially expressed proteins in my case, and is it a valid approach for such an analysis?

2) If yes, should I also set trend=TRUE and robust=TRUE, or just use the normal pipeline?

3) If that's not possible, any thoughts or suggestions on how to do a differential protein expression analysis?

NOTE: this question has already been posted on Biostars, and is reposted here at the suggestion of ATpoint.

Proteomics NPX limma
@gordon-smyth
WEHI, Melbourne, Australia

I have no experience with NPX but, from the information you give here, limma should be able to analyse it using the same pipeline as for single-channel microarrays. It sounds analogous to PCR data, for which limma has been used successfully.
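As a rough illustration of what the single-channel pipeline could look like, here is a minimal sketch on simulated data. The matrix `npx`, the two-group factor `condition` and the spiked-in fold change are all made up for the example; real NPX values would be read from the assay export instead.

```r
library(limma)

set.seed(1)

# Simulated stand-in for an NPX matrix (hypothetical data):
# 365 proteins in rows, 12 samples in columns, already on a log2 scale.
n_proteins <- 365
n_samples <- 12
npx <- matrix(rnorm(n_proteins * n_samples, mean = 5, sd = 1),
              nrow = n_proteins,
              dimnames = list(paste0("protein", seq_len(n_proteins)),
                              paste0("sample", seq_len(n_samples))))

# Hypothetical two-group design: 6 controls followed by 6 cases.
condition <- factor(rep(c("control", "case"), each = 6),
                    levels = c("control", "case"))

# Spike a log2 fold change of 2 into the first 10 proteins so that
# there is something to detect.
npx[1:10, condition == "case"] <- npx[1:10, condition == "case"] + 2

# Same steps as for single-channel microarrays:
# linear model per protein, then empirical Bayes moderation.
design <- model.matrix(~ condition)
fit <- lmFit(npx, design)
fit <- eBayes(fit)

# Log-fold changes (case vs control) and moderated statistics for all proteins.
results <- topTable(fit, coef = "conditioncase", number = Inf)
head(results)
```

The `logFC` column is on the same log2 scale as the input, so for NPX it is a difference in NPX units between the two groups.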

I don't know whether trend=TRUE will be necessary. Try it and see. Use plotSA(fit) to examine the trend. Same with robust=TRUE.
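A sketch of how the two settings might be compared side by side, again on simulated data (the matrix `npx` and the factor `group` are hypothetical). Depending on the limma version, robust=TRUE may need the statmod package to be installed.

```r
library(limma)

set.seed(2)

# Simulated log2-scale matrix (hypothetical): 365 proteins x 12 samples.
npx <- matrix(rnorm(365 * 12, mean = 5, sd = 1), nrow = 365)
group <- factor(rep(c("control", "case"), each = 6),
                levels = c("control", "case"))
design <- model.matrix(~ group)

fit <- lmFit(npx, design)

# Default moderation: one common prior variance, no outlier protection.
fit_default <- eBayes(fit)

# Prior variance allowed to depend on average expression (trend),
# with protection against outlier variances (robust).
fit_trend_robust <- eBayes(fit, trend = TRUE, robust = TRUE)

# plotSA() plots sqrt(residual standard deviation) against average
# log-expression and overlays the estimated prior, so a flat line
# suggests that no trend is needed.
par(mfrow = c(1, 2))
plotSA(fit_default, main = "trend=FALSE, robust=FALSE")
plotSA(fit_trend_robust, main = "trend=TRUE, robust=TRUE")

# Compare how many proteins are called significant under each setting.
summary(decideTests(fit_default))
summary(decideTests(fit_trend_robust))
```

If the two plots and the two summaries look much the same, the simpler default settings are probably adequate.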


Dear Professor Gordon Smyth, thank you very much for your answers. I really appreciate your help and time!

I have 3 follow-up questions if I may:

1) I ran the analysis with trend and robust both set to FALSE and both set to TRUE. However, I find it difficult to interpret the mean-variance relationship plotted by the plotSA function, and I don't know what to conclude from the plots. Would you please comment on the plots below? (I also included a density plot produced with the plotDensities function from the limma package to double-check whether anything could be going wrong.)

2) Is a normal distribution a requirement for analyzing proteins with the limma pipeline? I understand that limma fits a linear model and that assumptions are made about the data. I am wondering whether these must be met to be able to use the package, which brings me to question 3.

3) Is it possible to combine proteins that were measured using different techniques and are in different units? I am worried that including them could influence the results. In my case, I have 365 proteins in NPX units measured by Olink. However, I also have other proteins measured using competitive ELISA and would like to include them in the analysis. I log2-transformed the ELISA measurements to put them on the same scale as NPX (log2 scale) and to make them normally distributed. Is this a valid approach for combining proteins measured differently? Or are there other requirements that should be met to be able to combine them?

Thank you very much in advance!


The mean-variance plot shows no trend so you can set trend=FALSE. robust=TRUE does find a few outliers but will probably have only a small effect. Again you could set robust=FALSE.

A normal distribution is not a requirement in the way you think it is. From the evidence you present, limma should work fine.

I don't recommend combining proteins measured by different technologies because the different technologies may have different precisions. Better to analyse them separately. If you feel you must combine them, then run arrayWeights() with var.group=technology to allow for different precisions between the two groups.
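A minimal sketch of the arrayWeights() route, on simulated data. Note that var.group in arrayWeights() is a grouping of samples (one entry per column), so the sketch assumes a layout in which each sample column was measured by one technology; that layout, the matrix `y`, and the factors `technology` and `condition` are all assumptions made for illustration.

```r
library(limma)

set.seed(3)

# Hypothetical layout: 200 proteins x 12 samples on a log2 scale,
# first 6 samples from one technology, last 6 from another.
n_proteins <- 200
technology <- factor(rep(c("PEA", "ELISA"), each = 6))
condition <- factor(rep(c("control", "case"), times = 6),
                    levels = c("control", "case"))

# Simulate the second technology as noisier (sd 2 versus sd 0.5).
y <- cbind(
  matrix(rnorm(n_proteins * 6, mean = 5, sd = 0.5), nrow = n_proteins),
  matrix(rnorm(n_proteins * 6, mean = 5, sd = 2), nrow = n_proteins)
)

# Technology is also included in the mean model to absorb any offset
# between the two platforms.
design <- model.matrix(~ technology + condition)

# One weight per technology group: samples from the less precise
# technology are down-weighted.
aw <- arrayWeights(y, design, var.group = technology)
tapply(aw, technology, mean)

# Use the estimated weights in the linear model fit.
fit <- lmFit(y, design, weights = aw)
fit <- eBayes(fit)
topTable(fit, coef = "conditioncase")
```

With var.group supplied, all samples in the same group share one weight, so the weights only express the relative precision of the two technologies.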