I am running a differential gene expression (DGE) analysis on RNA-seq data from two types of cancer.
Here is the code:
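library(edgeR)            # DGEList, filterByExpr, calcNormFactors (also attaches limma)
library(clusterProfiler)  # GSEA
# RNA_data is the count matrix (genes x patients); design is the model matrix defined earlier
# Drop genes with too few counts to be informative
keep <- filterByExpr(RNA_data, design = design)
RNA_data <- RNA_data[keep, ]
RNA_data <- DGEList(counts = RNA_data, genes = rownames(RNA_data))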
# Normalize the counts using the TMM method
RNA_data <- calcNormFactors(RNA_data, method = "TMM")
# Create the contrast matrix
cont.matrix <- makeContrasts(disease_leimyo - disease_lipo,
                             disease_nos - disease_leimyo,
                             disease_nos - disease_lipo,
                             levels = design)
Voom <- voom(RNA_data, design, plot = FALSE, normalize.method = "quantile")
vfit <- lmFit(Voom, design)
vfit <- contrasts.fit(vfit, cont.matrix)
efit <- eBayes(vfit)
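# coef = 1 selects the first contrast (disease_leimyo - disease_lipo)
deg <- topTable(efit, coef = 1, adjust.method = "fdr", number = Inf)
# Build a named vector of logFC values, sorted in decreasing order, as input for GSEA
gene_list <- deg$logFC
names(gene_list) <- deg$genes
gene_list <- sort(gene_list, decreasing = TRUE)
head(gene_list)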
GO_file = "FILES/c5.go.bp.v2023.1.Hs.symbols.gmt"
res = GSEA(gene_list, GO_file, pval = 0.05)
As you can see, I chose coef = 1 to compare only the leiomyo and lipo types, and I want to run GSEA on that comparison. I get this error:
> res = GSEA(gene_list, GO_file, pval = 0.05)
Error in build_Anno(TERM2GENE, TERM2NAME) :
argument "TERM2GENE" is missing, with no default
My RNA_data matrix has gene names as row names (TP53, MDM2, ...) and patient IDs as column names. What can I do to solve this error? Additionally, is this the best way to run GSEA after limma-voom? And is GO_file the right gene set file? I want to analyze the largest possible number of pathways.
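
For reference, the call to build_Anno in the error message suggests that GSEA() here is the clusterProfiler function, which expects the gene sets as a two-column data frame in its TERM2GENE argument, not as a file path. Below is a minimal sketch of one way to supply that, assuming clusterProfiler is installed and the GMT file exists at the path from the post. The toy gene_list and its values are hypothetical and are only used when no gene_list from the code above is present.

```r
# Sketch only, not a definitive fix. Assumes clusterProfiler is installed
# and that the GMT file from the post exists at this path.
GO_file <- "FILES/c5.go.bp.v2023.1.Hs.symbols.gmt"

# Hypothetical stand-in so the sketch runs on its own; in the real analysis
# gene_list is the sorted, named logFC vector built from topTable().
if (!exists("gene_list")) {
  gene_list <- sort(c(TP53 = 1.8, MDM2 = 1.1, GENE_A = -0.4, GENE_B = -1.3),
                    decreasing = TRUE)
}

res <- NULL
if (requireNamespace("clusterProfiler", quietly = TRUE) && file.exists(GO_file)) {
  # read.gmt() parses the GMT file into a data frame with the columns
  # `term` and `gene`, which is the shape TERM2GENE expects.
  term2gene <- clusterProfiler::read.gmt(GO_file)

  res <- clusterProfiler::GSEA(
    geneList     = gene_list,  # named numeric vector, sorted in decreasing order
    TERM2GENE    = term2gene,  # gene set definitions, passed by name
    pvalueCutoff = 0.05        # full name of the cutoff argument
  )
}
```

The arguments are passed by name here because in the original call the file path was matched positionally to a different parameter, which left TERM2GENE unset. Note that the c5.go.bp collection contains GO Biological Process terms only; other MSigDB collections in GMT format could be read the same way if broader coverage is wanted.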