Subset data within DeSeq results object
aorvedahl • 0
@aorvedahl-13835
Last seen 4.1 years ago

I would like to subset data within a DESeqResults object, but have had no luck after much searching and experimenting. I've added a column for "Gene_ID", and I also have a separate data frame that contains a list of gene names in Set_A. I'm able to generate another column, let's call it "Gene_Set", with a qualifier for each gene (with values "Set_A", "Set_B", etc.) using the grep function on the gene names. I'd like to plot only Set_A in an MA plot and show the significant genes as usual (I want the statistics in the DESeqResults object to reflect all sets). I've tried things like:

res.sub <- res[ , res$Gene_ID %in% Set_A ]
plotMA(res.sub)

This is based on this post which is the closest thing I could find to what I'm trying to do, but unable to cannibalize it for my purposes.

DESeq2: multiple conditions design -- How to select subset comparisons from the DESeq object for PCA, ...

I also tried a slightly less elegant solution by filtering the DeSeqResults object with a 'merge' function between my subset and the results object, which generates a dataframe, but plotMA(my dataframe) gives me this result:

Error in .local(object, ...) :
When called with a data.frame, plotMA expects the data frame to have 3 columns, two numeric ones for mean and log fold change, and a logical one for significance.

Lastly, I tried to just color code each set with something like this, but still not working:

plotMA(res, ylim=c(-5,5), col = ifelse(res$Gene_set = "Set_A", "red", "blue"))

Error in .local(object, ...) :
argument 4 matches multiple formal arguments

Appreciate any and all help!

Anthony

deseq2 R rna-seq • 3.4k views
@mikelove
Last seen 1 day ago
United States

Here's a reference for subsetting data in R.

http://www.statmethods.net/management/subset.html

Generally, it's easy to get help if you provide both the code that you tried, and the errors that resulted.

With this command:

res$Gene_ID %in% Set_A

The problem is likely that Set_A is not defined in the environment. As you have described it, "Set_A" is a level in a data.frame. So you would have to use something like this (note you need to change the variables so they match what you are using; here I have df for data.frame):

res$Gene_ID %in% df$Gene_ID[df$Set == "Set_A"]
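
To make the suggestion concrete, here is a minimal base R sketch of the same subsetting on a mock table. It is only an illustration: res_df, df, the gene names, all numbers, and the 0.1 cutoff are invented stand-ins, and a plain data.frame is used in place of a real DESeqResults object so the snippet runs without DESeq2 installed. The points it tries to show are that the logical vector from %in% goes in the row position (before the comma), and how a three-column data frame of the shape named in the plotMA error message could be assembled.

```r
# Mock stand-in for a results table; every value here is invented for illustration.
res_df <- data.frame(
  Gene_ID        = c("geneA", "geneB", "geneC", "geneD"),
  baseMean       = c(120.5, 8.2, 560.0, 33.1),
  log2FoldChange = c(1.8, -0.3, -2.4, 0.1),
  padj           = c(0.001, 0.70, 0.0004, NA),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table assigning each gene to a set
# (plays the role of "df" in the answer above).
df <- data.frame(
  Gene_ID = c("geneA", "geneB", "geneC", "geneD"),
  Set     = c("Set_A", "Set_B", "Set_A", "Set_B"),
  stringsAsFactors = FALSE
)

# Logical vector with one entry per row of res_df: TRUE for genes in Set_A.
keep <- res_df$Gene_ID %in% df$Gene_ID[df$Set == "Set_A"]

# Genes are rows, so the index goes BEFORE the comma.
# res_df[ , keep] would try to select columns instead.
res_sub <- res_df[keep, ]

# Three-column layout described in the plotMA error message:
# two numeric columns (mean, log fold change) and one logical (significance).
# The 0.1 threshold is an assumed cutoff; NA padj is treated as not significant.
ma_df <- data.frame(
  mean = res_sub$baseMean,
  lfc  = res_sub$log2FoldChange,
  sig  = !is.na(res_sub$padj) & res_sub$padj < 0.1
)

print(res_sub)
print(ma_df)
```

With a real DESeqResults object the same row index should carry over, e.g. plotMA(res[keep, ], ylim=c(-5,5)). Because the adjusted p-values were computed before subsetting, the significance calls in the plot would still reflect all gene sets.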