How do I fix an odd voom plot in a combined dataset?
1sunmic2 • 0
@4b974839
Canada

Hi everyone, I'm having a bit of trouble with my voom normalization: the mean-variance plot looks extremely off. For reference, here is an image of my voom plot, which, as you can see, looks like a fish.

For context, my dataset is a merged dataset. Here is the code that builds it:

    # Copy each dataset's gene identifier into a common "gene" column
    pan_gene_reads$gene <- pan_gene_reads$Name
    STARcounts$gene <- STARcounts$Ensembl_ID
    Target_gene_exp_count$gene <- Target_gene_exp_count$sample

    # Remove Ensembl version suffixes (e.g. ".12") so the IDs match across datasets
    pan_gene_reads$gene <- gsub("\\.\\d+$", "", pan_gene_reads$gene)
    STARcounts$gene <- gsub("\\.\\d+$", "", STARcounts$gene)
    Target_gene_exp_count$gene <- gsub("\\.\\d+$", "", Target_gene_exp_count$gene)

    # Inner joins: keep only genes present in all three datasets
    combined_data_counts <- merge(pan_gene_reads, STARcounts, by = "gene", all = FALSE)
    combined_data_counts <- merge(combined_data_counts, Target_gene_exp_count, by = "gene", all = FALSE)

    # Set the annotation columns aside, then drop them and two further non-count columns
    gene_names_counts <- combined_data_counts[, 1:3]
    combined_data_counts$gene
    combined_data_counts <- combined_data_counts[, -c(1:3)]
    combined_data_counts <- combined_data_counts[, -363]
    combined_data_counts <- combined_data_counts[, -546]

And here is the code for my voom normalization:

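    library(edgeR)  # also attaches limma

    dge1 <- DGEList(counts = combined_data_counts)

    # Keep genes with CPM > 1 in at least 2 samples
    keep <- rowSums(cpm(dge1) > 1) >= 2
    d1 <- dge1[keep, , keep.lib.sizes = FALSE]
    dim(d1)

    # Compute normalization factors on the filtered object that voom receives
    d1 <- calcNormFactors(d1)
    d1$samples$norm.factors

    # Intercept-only design
    design1 <- model.matrix(~1, data = d1$samples)

    voom_data1 <- voom(d1, design = design1, plot = TRUE)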

Is this plot or code bad? If so, how can I fix it? I've tried removing batch effects, which doesn't work since voom doesn't accept negative values, and further filtering, which doesn't change the plot.

limma Normalization voom • 125 views
@gordon-smyth
WEHI, Melbourne, Australia

voom is a DE analysis method rather than a normalization method. voom expects the design matrix to be the same complete design matrix that you will use for the DE analysis, including all your experimental factors as well as any important covariates and batch effects. It is not correct to simply replace the design matrix with an intercept column.

Batch effects are handled by including the batch variables in the design matrix, not by changing the counts.
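As a rough illustration of the two points above, here is a minimal sketch on simulated counts. The `batch` and `group` factors are hypothetical stand-ins for the source dataset and the condition of interest; neither appears in the original post, and the simulated matrix only takes the place of the merged counts. It assumes batch and group are not completely confounded, otherwise the design matrix would not be of full rank.

```r
library(edgeR)  # also attaches limma

set.seed(1)
# Simulated stand-in for the merged count matrix: 1000 genes x 12 samples
counts <- matrix(rnbinom(1000 * 12, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))

# Hypothetical sample annotation: source dataset (batch) and condition (group)
batch <- factor(rep(c("A", "B", "C"), each = 4))
group <- factor(rep(c("control", "treated"), times = 6))

dge <- DGEList(counts = counts)

# Full design: batch enters as a covariate, the counts themselves are untouched
design <- model.matrix(~ batch + group)

# Filter and normalize the same object that is later passed to voom
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# voom and the linear model both use the complete design matrix
v <- voom(dge, design, plot = FALSE)
fit <- eBayes(lmFit(v, design))
topTable(fit, coef = "grouptreated")
```

Because batch is part of the model, no batch-corrected (and possibly negative) values are ever passed to voom.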

Having said all that, it is very difficult to handle merged datasets. Have you tried doing a limma-voom analysis on the individual datasets before merging?
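One way to try the per-dataset suggestion is sketched below. The helper `run_voom`, the dataset names, and the two-level `group` labels are all hypothetical and not taken from the original post; simulated counts stand in for the three real datasets.

```r
library(edgeR)  # also attaches limma

# Hypothetical helper: one limma-voom analysis for a single dataset
run_voom <- function(counts, group, plot = FALSE) {
  design <- model.matrix(~ group)
  dge <- DGEList(counts = counts)
  keep <- filterByExpr(dge, design)
  dge <- dge[keep, , keep.lib.sizes = FALSE]
  dge <- calcNormFactors(dge)
  # Set plot = TRUE to inspect the mean-variance trend of this dataset alone
  v <- voom(dge, design, plot = plot)
  fit <- eBayes(lmFit(v, design))
  topTable(fit, coef = 2, number = Inf, sort.by = "none")
}

set.seed(2)
# Simulated stand-ins for the individual datasets (500 genes each)
sim <- function(n) {
  matrix(rnbinom(500 * n, mu = 100, size = 5), nrow = 500,
         dimnames = list(paste0("gene", 1:500), paste0("s", seq_len(n))))
}
datasets <- list(pan = sim(6), star = sim(8), target = sim(6))

# Hypothetical condition labels, alternating within each dataset
groups <- lapply(datasets, function(m) {
  factor(rep(c("control", "treated"), length.out = ncol(m)))
})

# One result table per dataset
results <- Map(run_voom, datasets, groups)
```

If each dataset gives a sensible mean-variance plot on its own, the odd shape in the merged plot is more likely to come from differences between the datasets than from the filtering step.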
