How to get DESeq2 results for a list of genes
1
1
Entering edit mode
Lucía ▴ 10
@16997962
Last seen 21 days ago

Hi,

So I ran DESeq2 and I want to extract the results for a specific list of genes. This is my code, and this is what I did to get the DEGs that are statistically significant. However, I now want to specifically pull out the results for my set of genes of interest.

# AGA

dir <- "aga_counts"

#  file.path constructs filepath with the correct separator dependent of OS

#  paste0 appends (i.e. adds together) character objects
files <- paste0(file.path(dir, sample_metadata_aga$sample), ".this.htseq.out") files file.exists(files) all(file.exists(files)) # Running DESeq2 # Creates a new data table with the variables sampleName, fileName and condition sample_df_aga <- data.frame(sampleName = sample_metadata_aga$sample,
fileName = files,
condition = sample_metadata_aga$treatment) ddsHTSeq_aga <- DESeqDataSetFromHTSeqCount(sampleTable = sample_df_aga, design = ~ condition) # Set control condition using the relevel function ddsHTSeq_aga$condition <- relevel(ddsHTSeq_aga$condition, ref = "control") # Run DESeq2 dds_aga <- DESeq(ddsHTSeq_aga) # Filtering and data export resultsNames(dds_aga) res_aga <- results(dds_aga, name = "condition_infected_vs_control") head(res_aga) # Subset differentially expressed genes significant_res_aga <- subset(res_aga, padj < 0.05) write.csv(as.data.frame(significant_res_aga), file = "treat_vs_control_aga.csv")  This is what my list of genes that I want to extract the results for looks like: LOC115725515 LOC115725528 LOC115725529 LOC115725530 LOC115725532 LOC115725557 LOC115725562 LOC115725564 LOC115725566 LOC115725588 LOC115725594 LOC115725604 LOC115725608 LOC115725611 LOC115725612 LOC115725616 LOC115725636 LOC115725669  and so on Thanks! DESeq2 • 337 views ADD COMMENT 0 Entering edit mode ATpoint ★ 1.4k @atpoint-13662 Last seen 22 hours ago Germany Standard rules apply for filtering in R. If the genes are the rownames of the results object res then: res[my_ges,]  or if the column was called gene res[res$gene %in% my_gene,]


or tidyverse

res %>% filter(gene %in% mygene)

0
Entering edit mode

Hi, thank you!

Unfortunately that doesn't work. The first option gives me this error:

# Extracting for list
res_list <- res_inf[my_genes,]
Error in (function (classes, fdef, mtable)  :
unable to find an inherited method for function ‘NSBS’ for signature ‘"data.frame"’


where res_inf is my results object.

And I cannot use the second and third options because oddly enough the results object doesn't have a column for gene names. The rows are the gene names. This is what the results object looks like:

Log2 fold change (MLE): condition AGA vs PK
Wald test p-value: condition AGA vs PK
DataFrame with 29777 rows and 6 columns
baseMean log2FoldChange     lfcSE      stat    pvalue      padj
<numeric>      <numeric> <numeric> <numeric> <numeric> <numeric>
A5N79_gr01  20574.31161      -1.175559  0.892386 -1.317322  0.187731  0.332645
A5N79_gr02      0.43524       1.932421  3.999297  0.483190  0.628961        NA
A5N79_gr03   4339.88102      -0.582626  0.713301 -0.816801  0.414042  0.573369
A5N79_gt01      0.00000             NA        NA        NA        NA        NA
A5N79_gt02      0.00000             NA        NA        NA        NA        NA
...                 ...            ...       ...       ...       ...       ...
TRNAY-GUA_5     2.64171      -0.410521   1.83229 -0.224048   0.82272  0.890854
TRNAY-GUA_6     0.00000             NA        NA        NA        NA        NA
TRNAY-GUA_7     0.00000             NA        NA        NA        NA        NA
TRNAY-GUA_8     0.00000             NA        NA        NA        NA        NA
TRNAY-GUA_9     0.00000             NA        NA        NA        NA        NA

1
Entering edit mode

That's a problem related to your lack of R basics, neither DESeq2 or Bioconductor software/technical problem. Please learn R basics first. my_genes must be a vector of gene names for this filtering to work. So here, if my_genes is a data.frame you have to select the column with the genes first. Something like my_genes\$genes depending on how the gene column is called.

0
Entering edit mode

Wow, it costs zero dollars (or whatever currency you use) to be kind. It is not necessary nor productive to be unkind or shame users who are newer to bioinformatics, asking questions is literally how people learn. If the question irks you, don’t answer it, but the purpose of this forum is to ask questions (however basic you may deem them). Please reflect on this, because trying to make people feel unintelligent on a support forum ain’t it.

0
Entering edit mode

Wow, I give you a full solution including explanation of basics and why the error you see even come up, and you still complain. Are there not enough please in my comments, or is it just that people these days expect strangers on the internet to provide solutions with a lot of icing on top? I really don't get this attitude, it's beyond me how after receiving help that fully solves your most basic problem including background information on the error you can even post a comment like this.

0
Entering edit mode

I did thank you, I just gave you some feedback in addition to that, which btw is not the same as "complaining". I didn't even bring this up for my sake, but for everyone else's. Comments like yours can be harmful and deter people from learning. Again, shaming people has been proven not to be a good strategy to get people to learn, in fact it does the opposite. I am, of course, grateful for your help (like I stated in the first comment); however, all that shade was unnecessary, and btw not very subtle in your last comment. You know what is most basic? Thinking that knowing the answer to something gives you the right to look down on others.

0
Entering edit mode

Just to weigh in here, ATpoint does volunteer a lot of time helping new users on the Bioconductor support site, and I don't see his post above as unkind. Maybe slightly terse, but he does give you some pointers and asks that you please learn R basics.

There is some missing context perhaps which is that the Bioconductor support site is primarily designed for users to receive help for using Bioconductor packages in particular. At some point, questions which only involve Bioconductor tangentially, e.g. how to subset a table in R, make less sense on the Bioconductor support site vs. other settings for newcomers to R to learn how to work with R. There are also lots of online tutorials that may help bridge the gap. Maybe we can do a better job pointing users who are new to R to those resources.

Something special about the Bioconductor support site (and unlike other online forums) is that when you post, the site sends an email to the developers of the method (who are required to register their email when submitting a package to Bioconductor), so if every user coming to learn how to use R immediately posted to the support site, that would be an extra burden on developers, beyond providing technical package-specific support.

0
Entering edit mode

¯\_(ツ)_/¯

I think it's also important that people know this is a safe space to post. I don't regret stating that.