Search
Question: Strong upwards correlation between log fold change and average log expression in MD plot of Limma
0
gravatar for gil.hornung
11 days ago by
gil.hornung0 wrote:

Hi,

I am trying to analyse my proteomics data (intensity based label free quantification) using Limma.

When I look at the MD plot after limma fit, I see a strong correlation between log-fold-change and average-log-expression. What could cause this odd behaviour and how can I fix it?

Thanks,

Gil

plotmd

The code I used is:

> imat <- read.delim("matrix.tsv", sep = "\t", row.names = 1) %>% 
+   as.matrix()
> head(imat)
          X51      X56      X57      X61      X65      X67      X72      X74      X77      X83      X85      X88
7966  1670300  1057800  1261700  1319300  1273000  1371400   950630  1920300   656870   706280  1157600   860140
7580  2727900  1536000  2675700  2168800  2246800  2530700  7548400  3874900  1761800  2203100  2815500  1867300
5068 27261000 19086000 19027000 16980000 24905000 29381000 27716000 24395000 15292000 16111000 13618000 14989000
4178   124790   372080        0   240420        0        0   492620   495800  2050400   408840  5791100  4327800
1038  1076100   962340  1384800        0  1395300        0   910440        0   714210        0        0   465820
1117        0   154830   159230        0        0        0   221510        0        0   133880   243150   881710
> sample_info <- read.delim("sample_desc.tsv",sep = "\t",stringsAsFactors = T)
> head(sample_info)
  Sample Group
1    X51     A
2    X56     A
3    X57     A
4    X61     A
5    X65     A
6    X67     A
> 
> design <- model.matrix(~0+sample_info$Group)
> colnames(design) <- levels(sample_info$Group)
> 
> limma_fit = lmFit(log2(imat+1), design = design)
> plotMD(limma_fit)

 

session info:

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.8 (Santiago)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ecoliLeucine_1.10.0  ecolicdf_2.18.0      affy_1.48.0          Biobase_2.30.0       BiocGenerics_0.16.1  BiocInstaller_1.20.3
 [7] plotly_4.5.6         ggdendro_0.1-20      gridExtra_2.2.1      purrr_0.2.2          ggplot2_2.2.1        dplyr_0.5.0         
[13] tidyr_0.6.0          tibble_1.2           limma_3.26.9         knitr_1.15.1        

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2      amap_0.8-14           htmltools_0.3.5       stats4_3.2.1          viridisLite_0.1.3     yaml_2.1.14          
 [7] base64enc_0.1-3       DBI_0.6               bit64_0.9-5           RColorBrewer_1.1-2    affyio_1.40.0         matrixStats_0.51.0   
[13] plyr_1.8.4            stringr_1.2.0         zlibbioc_1.16.0       munsell_0.4.3         gtable_0.2.0          htmlwidgets_0.8      
[19] memoise_1.0.0         evaluate_0.10         labeling_0.3          IRanges_2.4.8         AnnotationDbi_1.32.3  preprocessCore_1.32.0
[25] highr_0.6             Rcpp_0.12.10          edgeR_3.12.1          scales_0.4.1          S4Vectors_0.8.11      jsonlite_1.0         
[31] bit_1.1-12            digest_0.6.12         stringi_1.1.2         grid_3.2.1            tools_3.2.1           magrittr_1.5         
[37] lazyeval_0.2.0        RSQLite_1.1-2         MASS_7.3-45           data.table_1.10.4     assertthat_0.1        httr_1.2.1           
[43] R6_2.2.0
ADD COMMENTlink modified 10 days ago by Gordon Smyth32k • written 11 days ago by gil.hornung0
3
gravatar for James W. MacDonald
11 days ago by
United States
James W. MacDonald45k wrote:

From ?plotMD, specifically the Details section

     If 'object' is an 'MArrayLM' object, then the plot is an fitted
     model MD-plot in which the estimated coefficient is on the y-axis
     and the average A-value is on the x-axis.

Since you have fit a cell means model (e.g., your coefficients estimate the mean of each group), what you are plotting as the 'logFC' is actually just the mean of the group, rather than the fold change. You need instead to either define a contrasts matrix and use contrasts.fit to compute the contrasts, or use a design with an intercept so your coefficients are already inherently contrasts.

ADD COMMENTlink written 11 days ago by James W. MacDonald45k
2
gravatar for Gordon Smyth
10 days ago by
Gordon Smyth32k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth32k wrote:

Just to reinforce James' answer, the problem is that you haven't actually computed any log-fold-changes yet. So you're just plotting one mean against another.

Why did you include the "0+" in your model? Presumably you got that from reading a suggested analysis somewhere. If so, then go back to to the analysis and read what the next step was. If you don't already have some advice to read, then try Section 9.2 of the limma User's Guide.

ADD COMMENTlink modified 10 days ago • written 10 days ago by Gordon Smyth32k

Thank you Gordon and James for the fast and helpful answers.

I used "1+" in my model and it worked fine.

 

Al the best ,

 

Gil

ADD REPLYlink written 8 days ago by gil.hornung0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 203 users visited in the last hour