Question

Controlling for Covariates With a Continuous Predictor Variable

georgii.vdovin • 0

Hello, I am trying to fit natural splines to my data, and I have a question about controlling for covariates. For splines or polynomials I have to treat age as a continuous variable. Does the algorithm then assign groups to my replicate ages, or does it treat all data as one group? How would the correction work in that case? I could group the young wildtype together with the 4.8 y.o.; is that necessary to increase power, or do groups not matter when age is continuous?

Metadata

I have a pretty unbalanced design as the wildtype animals were unique and irreplaceable, but they group together on PCA so I have to control for genetic background.

When I try this code, DESeq2 doesn't complain, but I still want to be sure.


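library(DESeq2)
library(splines)  # provides ns()

# Natural spline on age (3 df) plus genetic background as a covariate
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ ns(age_scaled, df = 3) + background)

# Keep genes with at least 10 counts in at least 3 samples
keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep,]

# LRT against the reduced model: tests the age spline terms, keeping background in both models
dds <- DESeq(dds, test="LRT", reduced = ~ background)
res <- results(dds)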
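One way to check how a design like the one above treats age is to look at the model matrix directly. The following is only a minimal sketch with a made-up sample sheet: the ages and the background labels "A" and "B" are hypothetical and not the data from this post. It shows that `ns()` expands the continuous age into basis columns rather than assigning samples to groups, and that replicates with the same age and background end up with identical rows.

```r
library(splines)

# Hypothetical sample sheet (NOT the data from this post):
# replicate ages and two genetic backgrounds
coldata <- data.frame(
  age_scaled = c(0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 1.0),
  background = factor(c("A", "A", "B", "A", "B", "A", "B", "B"))
)

# Same formula as in the design above
design_mat <- model.matrix(~ ns(age_scaled, df = 3) + background, data = coldata)

# 8 rows (samples) by 5 columns:
# intercept, 3 spline basis columns, 1 background indicator
dim(design_mat)
colnames(design_mat)

# Samples 1 and 2 share age and background, so their rows are the same
same_row <- isTRUE(all.equal(design_mat[1, ], design_mat[2, ]))
same_row
```

In this sketch there is no grouping step at all: every sample contributes one row, and samples at the same age simply share the same spline basis values.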

If I am indeed doing this correctly, my follow-up question is about plotting the fitted model: the fitted values show "jumps" between neighboring points, so a simple geom_line does not give a smooth curve (code shortened):

library(dplyr)
library(ggplot2)

coef_mat <- coef(dds)  # log2-scale coefficients, one row per gene
design_mat <- model.matrix(design(dds), colData(dds))

# Normalized counts for one gene plus the fitted log2 mean for each sample
dat <- plotCounts(dds, gene, intgroup = c("age", "sex", "genotype"), returnData = TRUE) %>%
    mutate(logmu = as.vector(design_mat %*% coef_mat[gene,]),
           logcount = log2(count + 1))

ggplot(dat, aes(age, logcount)) +
    geom_point(aes(color = age, shape = genotype), size = 2) +
    geom_line(aes(age, logmu), col="#FF7F00", linewidth = 1.2) +
    labs(
      title = gene,  # title prefix was omitted in the shortened code
      x = "Age",
      y = "Log2 expression count",
      color = "Age",
      shape = "Genotype",
      caption = paste("padj:", formatted_padj)  # formatted_padj comes from code not shown
    )

Edgy plot

I could do geom_smooth, but while that would look good, it technically wouldn't directly reflect the fitted model anymore. Thanks a lot in advance.
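A possible reason for the jumps (an assumption, since the plot is not reproduced here) is that logmu is computed once per sample and includes the background term, so neighboring ages from different backgrounds sit at different fitted values and geom_line zigzags between them. One way to draw the fitted model itself is to evaluate it on a fine grid of ages, separately for each background, reusing the knots of the original spline basis. The sketch below uses base R and the splines package only; the sample sheet and the coefficient vector `beta` are hypothetical stand-ins for `colData(dds)` and `coef(dds)[gene, ]`.

```r
library(splines)

# Hypothetical stand-ins for the objects in this post (not real data)
coldata <- data.frame(
  age_scaled = c(0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 1.0),
  background = factor(c("A", "A", "B", "A", "B", "A", "B", "B"))
)
# Stand-in for coef(dds)[gene, ] on the log2 scale:
# intercept, 3 spline terms, effect of background "B"
beta <- c(5, 1.2, -0.8, 0.4, 1.5)

# Build the spline basis once so that its knots can be reused for new ages
age_basis <- ns(coldata$age_scaled, df = 3)

# Fine grid of ages, repeated for each background level
grid <- expand.grid(
  age_scaled = seq(min(coldata$age_scaled), max(coldata$age_scaled),
                   length.out = 100),
  background = levels(coldata$background)
)

# Design matrix on the grid, in the same column order as the model formula:
# intercept, spline basis (same knots as age_basis), background indicator
grid_mat <- cbind(
  1,
  predict(age_basis, newx = grid$age_scaled),
  as.numeric(grid$background == "B")
)

# Fitted log2 mean along the grid, one smooth curve per background
grid$logmu <- as.vector(grid_mat %*% beta)
```

With ggplot2, a layer such as `geom_line(data = grid, aes(age_scaled, logmu, group = background))` would then draw one curve per background that follows the fitted model rather than a separate smoother. Note that the grid here is on the scaled age, so it would need to be mapped back to the original age scale to match the x axis of the plot above.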

Deseq DESeq2 • 373 views

updated 5 months ago by Michael Love 41k • written 5 months ago by georgii.vdovin • 0

score 0 · Answer 1 · 2023-11-07

Michael Love 41k

does the algorithm then assigns groups to my replicate ages; or does it treat all data as one group? How would correction work in that case?

You may want to work this statistical design question out with a local statistician or someone familiar with linear models in R.

5 months ago by Michael Love 41k