moderated p-values limma style from std.error, df, estimate data.
1
0
Entering edit mode
wewolski ▴ 10
@wewolski-8499
Last seen 2.3 years ago
Zurich

Hello,

I do have a data frame my_dataframe with parameter estimates (i.e. linear combinations of the model parameters - contrasts).

"ID" "Contrast" "Estimate" "Std. Error"       "df"    "t-value"     "lower"    "upper"  "Pr(>|t|)"
 1      "A - B"    3.2         ....                                                                            
 2      "A - B"    ....   
 3      "A - B"    ....

for some few hundred subjects_ (ID column) from one experiment, and would like to get the moderated p.values. (But I do not have the option to run the entire limma pipeline with lmFit eBayes computing the contrasts etc...).

Hence I am wondering if it is possible: - to determine the prior parameters $s0$, $d0$ using the function fitFDist from the "Std.Errors" and "df" and than using those update the std.Error by the formula

$\tilde{s^2}= \frac{d0s0^2 + dgsg^2}{d0+dg}$ ?

If this is the case, is it correct that I just need to run the function with(my_dataframe, squeezeVar(Std.Error, df) ) on my_dataframe to obtain the posterior variances and subsequently Estimate/posterior.var to obtain the t-statistics?

Thank you Witek

limma • 688 views
ADD COMMENT
3
Entering edit mode
@gordon-smyth
Last seen 2 hours ago
WEHI, Melbourne, Australia

Based on the discussion below, I understand that you have fitted the same linear model to a few hundred separate datasets (with each subject being a separate dataset). To apply limma empirical Bayes moderation you need to have three vectors storing information from the separate fits:

  1. A vector sigma containing the residual standard errors from the separate fits. This vector is not obtainable from your data.frame. The residual standard error is the square root of the residual mean square and is shown in the output from summary(fit) for each linear model. If you save the summary to out <- summary(fit) then the residual standard error is stored as out$sigma.

  2. A vector df.residual containing the residual degrees of freedom for each regression, which will not all be the same some of the fits have missing values. This will be the same as df in your data.frame.

  3. A vector tstat containing the unmoderated t-statistics from the separate linear models. This will be the same as t-value in your data.frame.

Then you can compute moderated t-statistics and p-values by

sv <- squeezeVar(sigma^2, df=df.residual)
moderated.t <- tstat * sigma / sqrt(sv$var.post)
df.total <- df.residual + sv$df.prior
p.value <- 2*pt( abs(moderated.t), df=df.total, lower.tail=FALSE)

Note that the moderation works on residual variances rather than on the standard errors for a coefficient.

ADD COMMENT
0
Entering edit mode

Dear Gordon,

The table is not the table of a single model on one subject but, the result of running the same model on several hundred of subjects and then computing the same contrast on all models. I updated my post to clarify this a bit more. Sorry that it wasn't more precise, I updated it accordingly.

Note that moderation works on residual variances rather than on standard errors, so it needs extra information to that in your data.frame.

This is exactly what I am wondering about, but did not find the time to go through the math yet to understand it better.

What information do I need in the data frame? Just the residual variance or do I also need the design matrices of the model - which might be slightly different for each subject because of missingness in the data?

ADD REPLY
0
Entering edit mode

You need the residual variance (also known as residual standard error or residual mean square) and the residual df.

If you've run the same linear model on all the subjects, and you have the raw data, then it would seem that you could run limma on it directly. What's stopping you from doing that?

ADD REPLY
0
Entering edit mode

You need the residual variance (also known as residual standard error or residual mean square) and the residual df.

These - residual standard error, residual df - I do have for each model. Obviously, I than could then run the squeezeVar function as intended. But then how do I update the std:Error of the contrasts?

ADD REPLY
0
Entering edit mode

I have updated my answer above.

ADD REPLY

Login before adding your answer.

Traffic: 548 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6