Question

Sample quality weights for RNA-seq

0

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.0 years ago

United States

My RNA-seq data are highly nested and unbalanced: plants within genotype within family, with different numbers of plants within each genotype and different numbers of genotypes within each family. When I did the analysis using contrasts, I simply averaged the genotype means within family, which was equivalent to using a sample weight of 1/(n_P.g * n_g.F), where n_P.g is the number of plants (samples) within a genotype and n_g.F is the number of genotypes in the family.
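To make the equivalence concrete, here is a minimal base-R sketch on a hypothetical toy layout. The family labels, genotype labels and `y` values are invented for illustration and are not from the actual experiment. It checks that a weighted sum with weights 1/(n_P.g * n_g.F) reproduces the average of genotype means within each family.

```r
# Hypothetical toy layout: 2 families, unequal numbers of genotypes per family,
# unequal numbers of plants per genotype. The y values are made up.
samples <- data.frame(
  family   = c(rep("F1", 5), rep("F2", 4)),
  genotype = c("g1", "g1", "g1", "g2", "g2", "g3", "g4", "g4", "g5"),
  y        = c(5, 6, 7, 9, 11, 2, 4, 6, 3)
)

# n_P.g: number of plants within each sample's genotype
n_P.g <- ave(samples$y, samples$genotype, FUN = length)

# n_g.F: number of distinct genotypes within each sample's family
n_g.F <- ave(as.integer(factor(samples$genotype)), samples$family,
             FUN = function(g) length(unique(g)))

# Sample weight as described in the question
samples$w <- 1 / (n_P.g * n_g.F)

# Route 1: weighted sum of the raw samples within family
weighted_mean <- tapply(samples$w * samples$y, samples$family, sum)

# Route 2: average the genotype means within family
geno_means    <- aggregate(y ~ family + genotype, data = samples, FUN = mean)
two_step_mean <- tapply(geno_means$y, geno_means$family, mean)

# The two routes agree on this toy data
all.equal(as.numeric(weighted_mean), as.numeric(two_step_mean))
```

The weights sum to 1 within each family, so the weighted sum is already a weighted mean.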

Now I want to use a continuous predictor (one value per plant) instead of contrasts. However, I still think that I need a weighted analysis to deal with the nesting and imbalance.

1) In LIMMA can I use the sample quality weights to impose this weighting? If so, do I also use the voom weights?

2) Is there a way to do this in DESeq2 with the GLRT?

--Naomi

DESeq2 limma weight • 1.8k views

ADD COMMENT • link 4.0 years ago Naomi Altman ★ 6.0k

score 0 · Answer 1 · 2021-04-15

0

Michael Love 43k

@mikelove

Last seen 4 days ago

United States

We don't have contrasts implemented for the LRT in DESeq2, only for the Wald test (in results(), the contrast argument can be a numeric vector and will generate a new Wald statistic).

ADD COMMENT • link 4.0 years ago Michael Love 43k

0

I am using the LRT in DESeq2 because I now have a continuous predictor. The question is whether there are some types of sample weights that I can use. Thanks.

ADD REPLY • link 4.0 years ago Naomi Altman ★ 6.0k

0

We don't fit sample weights but you can provide them as:

assays(dds)[["weights"]] <- wts # matrix of weights, ngene x nsample

And they will be used in dispersion and GLM coefficient estimation.
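For readers wondering how to get from one weight per sample to the ngene x nsample matrix mentioned above, here is a minimal base-R sketch. The sample names, gene names and weight values are placeholders, and the rescaling step is a choice made for this illustration rather than something stated in the thread.

```r
# Hypothetical per-sample weights (one per plant/library),
# e.g. computed as 1/(n_P.g * n_g.F). Names and values are placeholders.
sample_wts <- c(P1 = 1/6, P2 = 1/6, P3 = 1/6, P4 = 1/4, P5 = 1/4)
gene_ids   <- paste0("gene", 1:4)  # placeholder gene names

# Repeat the sample weights down the rows:
# one row per gene, one column per sample.
wts <- matrix(sample_wts,
              nrow = length(gene_ids), ncol = length(sample_wts),
              byrow = TRUE,
              dimnames = list(gene_ids, names(sample_wts)))

# Rescale so the largest weight is 1 (a choice made in this sketch;
# it leaves the relative weights unchanged).
wts <- wts / max(wts)

# With a DESeqDataSet `dds` whose rows and columns match `wts`,
# the assignment from the reply above would then be:
# assays(dds)[["weights"]] <- wts
```

Every gene gets the same row of weights here because the weights describe the sampling design rather than gene-specific quality.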

ADD REPLY • link 4.0 years ago Michael Love 43k

0

Thanks Michael. I'll try that. (Limma folks, I'd still love to hear from you.)

ADD REPLY • link 4.0 years ago Naomi Altman ★ 6.0k

score 0 · Answer 2 · 2021-04-15

0

Gordon Smyth 52k

@gordon-smyth

Last seen 1 day ago

WEHI, Melbourne, Australia

Naomi, I don't really follow your experiment or why a continuous variable causes a problem, but I think you already know how to use weights, contrasts and continuous covariates in limma. If necessary you can make the sample weights a function of a continuous variable in limma using the var.design argument of arrayWeights or voomLmFit. I'm not sure what you have in mind, but I don't think it is correct to set the weight equal to the number of repeated-measures samples being averaged, if that's what you are thinking of, because the higher-level variance is not being averaged.
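One way to read the last point is through a simple nested random-effects model. This is an assumption made here for illustration, not something specified in the thread: write the expression of plant p in genotype g as a family-level mean plus a genotype effect with variance \sigma_g^2 plus a plant-level error with variance \sigma_p^2, all independent.

```latex
% Assumed model (illustrative): y_{gp} = \mu_F + a_g + e_{gp},
% with Var(a_g) = \sigma_g^2 and Var(e_{gp}) = \sigma_p^2, all independent.
\begin{align*}
\operatorname{Var}\!\left(\bar{y}_{g\cdot}\right)
  &= \sigma_g^2 + \frac{\sigma_p^2}{n_{P.g}}, \\
\operatorname{Var}\!\left(\frac{1}{n_{g.F}} \sum_{g=1}^{n_{g.F}} \bar{y}_{g\cdot}\right)
  &= \frac{1}{n_{g.F}^{2}} \sum_{g=1}^{n_{g.F}}
     \left( \sigma_g^2 + \frac{\sigma_p^2}{n_{P.g}} \right).
\end{align*}
```

Under this assumed model, averaging more plants within a genotype shrinks only the \sigma_p^2 term, while the genotype-level term \sigma_g^2 is unaffected. The precision of a genotype mean is therefore not proportional to n_{P.g}, so a weight based purely on the count of samples averaged would not match the inverse variance unless \sigma_g^2 were negligible.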

ADD COMMENT • link 4.0 years ago Gordon Smyth 52k

0

Thanks Gordon. I know how to use contrasts and continuous variables. I have only used weights at the gene-within-sample level (i.e. what used to be spot quality weights). If sample weights can be used in the usual way, I will proceed. Since plants within genotype and genotypes within family have different variances, I think I need to rethink the weights. Thanks for pointing this out.

ADD REPLY • link 4.0 years ago Naomi Altman ★ 6.0k