Question

Limma squeezeVar


Mengchun • 0

@a1e3725f

Last seen 9 months ago

United Kingdom

Hi,

I'm going to use the limma squeezeVar() function to estimate protein variances. Its inputs are the sample variances and their degrees of freedom, and I would like to know whether I should take missing values into account when computing them. For example, the values in group 1 are 10.5, 11, 11.2, NA, NA and in group 2 are 15, 15.1, 15.5, NA, NA. Is the df 2n-2, that is 8? Or should I ignore the NAs, in which case it would be 4? Thank you!

Best wishes,

Mengchun

limma • 753 views

9 months ago • Mengchun • 0

score 0 · Answer 1 · 2025-05-19


Gordon Smyth 53k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

I am not quite clear why you're not using the limma package directly, because it will compute all the variances and the df automatically for you for any high-throughput proteomics dataset.

If you are using squeezeVar() for a different, more bespoke research project, then you need to explain the purpose of the study. I can't advise on whether to remove NAs or not without knowing anything about the data or the purposes for which it is being analysed. If you don't remove the NAs, then the variance would obviously simply be NA, so there are no df at all.
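To make the df arithmetic concrete, here is a minimal sketch using the example values from the question. It is plain Python rather than R and does not call limma. It assumes an ordinary pooled within-group variance, and `pooled_variance` is a hypothetical helper written for this illustration, not a limma function.

```python
import math

NA = math.nan  # stand-in for R's NA in this sketch

def pooled_variance(groups):
    """Return (pooled variance, residual df) using only the observed values."""
    ss = 0.0
    df = 0
    for values in groups:
        observed = [v for v in values if not math.isnan(v)]
        if len(observed) < 2:
            continue  # fewer than 2 observed values contribute no df
        mean = sum(observed) / len(observed)
        ss += sum((v - mean) ** 2 for v in observed)
        df += len(observed) - 1
    if df == 0:
        return math.nan, 0
    return ss / df, df

group1 = [10.5, 11.0, 11.2, NA, NA]
group2 = [15.0, 15.1, 15.5, NA, NA]

# Keeping the NAs in the sum propagates NaN, so no variance is obtained.
naive_mean = sum(group1) / len(group1)
print(math.isnan(naive_mean))

# Dropping the NAs leaves 3 + 3 observations in 2 groups: df = 6 - 2 = 4.
s2, df = pooled_variance([group1, group2])
print(df, round(s2, 4))
```

Under these assumptions the residual df counts only observed values (6 observations minus 2 group means), which gives the 4 mentioned in the question rather than 8.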

9 months ago • Gordon Smyth 53k


Hi Gordon,

Thank you for your reply. I'm working on a proteomics data imputation project. I would like to describe the intensity distribution of a given protein within an experimental group, so I need to estimate that protein's sample variance. I found that squeezeVar() shrinks the observed sample variances towards a prior value, but I'm not sure whether I should take NAs into account when my data look like this.

group1: 16.56412567 NA NA NA 16.38149395; group2: 16.64612271 NA 16.22489667 NA NA

Best wishes, Mengchun
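For the values in this reply, the observed-only arithmetic can be sketched as follows. This is plain Python using the standard library, not limma, and it assumes that each group's variance is computed from its non-missing values and then pooled across groups, weighted by df. The variable names are illustrative only.

```python
import math
from statistics import variance

NA = math.nan  # stand-in for R's NA in this sketch

groups = {
    "group1": [16.56412567, NA, NA, NA, 16.38149395],
    "group2": [16.64612271, NA, 16.22489667, NA, NA],
}

# Per-group sample variance and df, computed from observed values only.
per_group = {}
for name, values in groups.items():
    observed = [v for v in values if not math.isnan(v)]
    per_group[name] = (variance(observed), len(observed) - 1)

# Pool across groups, weighting each variance by its df.
total_df = sum(d for _, d in per_group.values())
pooled = sum(v * d for v, d in per_group.values()) / total_df

print(per_group)
print(total_df, pooled)
```

With two observed values per group, each group contributes 1 df, so the pooled variance rests on only 2 df.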

9 months ago • Mengchun • 0