Hi everyone, I am doing differential expression analysis using DESeq2. With my current cutoffs, 8126 out of 60663 genes are called differentially expressed. I would like to reduce this number by filtering more stringently, down to roughly the 2000 genes with the strongest statistical evidence of differential expression. Please help me with this. Below is the code that I used:
> hbv <- read.csv("hbv.csv")
> dim(hbv)
[1] 60664 43
> library(DESeq2)
> set.seed(1)
> colData <- DataFrame(condition=factor(c("ctrl", "treat", "ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat", "ctrl", "treat", "ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat","ctrl", "treat")))
> countDataMatrix <- as.matrix(hbv[ , -1])
> rownames(countDataMatrix) <- hbv[ , 1]
> dds <- DESeqDataSetFromMatrix(countDataMatrix, colData,formula(~condition))
> dds <- DESeq(dds)
> res <- results(dds)
> resOrdered <- res[order(res$padj),]
> sig <- resOrdered[!is.na(resOrdered$padj) & resOrdered$padj<0.10 & abs(resOrdered$log2FoldChange)>=1,]
> dim(sig)
[1] 8126 6
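Two common ways to shrink a list like this are to tighten the `padj` and `log2FoldChange` cutoffs, or to rank the passing genes by `padj` and keep the top N. The sketch below shows both on a made-up stand-in table (`res_toy` and its values are hypothetical; in the real analysis the table would be `res` from `results(dds)`). The cutoffs 0.01 and 2 are only examples, and a target of exactly 2000 genes is an arbitrary choice rather than a statistical criterion.

```r
# Toy stand-in for a DESeq2 results table. All values are simulated;
# in the real analysis use `res` (e.g. as.data.frame(res)) instead.
set.seed(1)
n <- 10000
res_toy <- data.frame(
  log2FoldChange = rnorm(n, sd = 2),
  padj           = runif(n)^3,
  row.names      = paste0("gene", seq_len(n))
)
# DESeq2 reports padj = NA for genes removed by independent filtering
res_toy$padj[sample(n, 500)] <- NA

# Option 1: stricter cutoffs (example values, not recommendations)
keep   <- !is.na(res_toy$padj) &
          res_toy$padj < 0.01 &
          abs(res_toy$log2FoldChange) >= 2
strict <- res_toy[keep, ]

# Option 2: keep the original fold-change filter, rank by padj,
# and take the top N genes
top_n   <- 2000
passing <- res_toy[!is.na(res_toy$padj) &
                   abs(res_toy$log2FoldChange) >= 1, ]
top     <- head(passing[order(passing$padj), ], top_n)

nrow(strict)
nrow(top)
```

With DESeq2 itself, the thresholds can also be built into the test instead of applied afterwards, for example `results(dds, alpha = 0.01, lfcThreshold = 1)`, which tests against the fold-change threshold and is usually more conservative than post-hoc filtering.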
Thank you very much.
But one thing I want to ask: why can't I set my experimental labels like this:
If I don't have a metadata file, is it wrong to create the experimental labels for comparing the two conditions like this?
You can do it this way; however, you have to be 100% certain that this metadata is aligned perfectly to the expression data.
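One way to make that alignment check explicit is to give the sample table row names and compare them against the column names of the count matrix before building the DESeq2 object. This is a minimal sketch on made-up data: `counts`, `sample_names`, and `sample_table` are hypothetical names, and it assumes the columns really do alternate ctrl, treat, ctrl, treat.

```r
# Hypothetical count matrix; in the real analysis this would be
# countDataMatrix, and sample_names would be colnames(countDataMatrix).
set.seed(1)
sample_names <- paste0("S", 1:6)
counts <- matrix(
  rpois(5 * 6, lambda = 20), nrow = 5,
  dimnames = list(paste0("gene", 1:5), sample_names)
)

# Build the labels programmatically instead of typing them out.
# ASSUMPTION: samples strictly alternate ctrl/treat in column order.
sample_table <- data.frame(
  condition = factor(rep(c("ctrl", "treat"), times = ncol(counts) / 2),
                     levels = c("ctrl", "treat")),
  row.names = sample_names
)

# Fail loudly if the metadata and the count columns are not in the same order
stopifnot(identical(rownames(sample_table), colnames(counts)))

# Quick sanity check of group sizes
table(sample_table$condition)
```

Setting `levels = c("ctrl", "treat")` makes `ctrl` the reference level, so the reported log2 fold changes are treat relative to ctrl.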