Hello,
I'm analyzing RNA-seq data from two different cell types (ESDM and BMDM) with two treatment conditions (uninfected and infected). My problem is that I'm trying to compare two specific cell types and treatments that aren't in the results table shown after running the DESeq command. This is the code I'm using:
#read .csv file containing sample info read.csv("samples.csv",header=TRUE) #make sample data frame from .csv sample sheet coldata = with(samples, data.frame(shortname = I(shortname), countf = I(countf), cell_type = cell_type, biol_rep = biol_rep, treatment = treatment)) #make DESeq2 dataset object dds = DESeqDataSetFromHTSeqCount(sampleTable = coldata, directory = "/datadirectory/mydata", design = ~ cell_type + treatment + cell_type:treatment) #collapse technical replicates ddscollapsed <- collapseReplicates ( dds, groupby = dds$biol_rep) #relevel samples so that uninfected, BMDM are control samples ddscollapsed$cell_type <- relevel (ddscollapsed$cell_type, "BMDM") ddscollapsed$treatment <- relevel (ddscollapsed$treatment, "uninfected") #do differential expression analysis ddscollapsed <- DESeq(ddscollapsed) #final data frame: > as.data.frame(colData(ddscollapsed)) cell_type biol_rep treatment sizeFactor BMDM_I_1 BMDM BMDM_I_1 infected 0.9241779 BMDM_I_2 BMDM BMDM_I_2 infected 0.6117723 BMDM_I_3 BMDM BMDM_I_3 infected 0.3563606 BMDM_I_4 BMDM BMDM_I_4 infected 0.8114125 BMDM_U_1 BMDM BMDM_U_1 uninfected 0.8375975 BMDM_U_2 BMDM BMDM_U_2 uninfected 2.2902482 BMDM_U_3 BMDM BMDM_U_3 uninfected 1.9502123 BMDM_U_4 BMDM BMDM_U_4 uninfected 2.1769115 ESDM_I_1 ESDM ESDM_I_1 infected 0.4880022 ESDM_I_2 ESDM ESDM_I_2 infected 0.8463753 ESDM_I_3 ESDM ESDM_I_3 infected 1.1217204 ESDM_I_4 ESDM ESDM_I_4 infected 0.6988828 ESDM_U_1 ESDM ESDM_U_1 uninfected 1.7999121 ESDM_U_2 ESDM ESDM_U_2 uninfected 0.7995184 ESDM_U_3 ESDM ESDM_U_3 uninfected 1.6323795 ESDM_U_4 ESDM ESDM_U_4 uninfected 1.0969576 #display what the different results names are resultsNames(ddscollapsed) #At this point, I get the following: > resultsNames(ddscollapsed) [1] "Intercept" "cell_type_ESDM_vs_BMDM" [3] "treatment_infected_vs_uninfected" "cell_typeESDM.treatmentinfected"
What I would like to be able to do is identify genes that are differentially expressed ONLY between infected ESDM cells and infected BMDM cells.
I've also tried the design formula "~ cell_type + treatment " and get:
> resultsNames(ddscollapsed) [1] "Intercept" "cell_typeBMDM" "cell_typeESDM" [4] "treatmentuninfected" "treatmentinfected"
Here I've tried to implement the contrast list function, but I'm unsure of how to make it accept two terms for either the numerator or the denominator. I would use the matrix (numeric function) to input this data, but I'm unsure of how to indicate that "treatmentinfected" should be both +1 and -1.
Could you please let me know what to do so that I can make specific comparisons between samples that have two conditions and two groups?
Thanks for your input.
Erin
Hi Erin,
I have read your post and since it is the same situation of my experiment, so I'm wondering if now you have clearer idea than initial one after the Simon and Michael' s suggestion for the statistical analysis of your data with DESeq2.
I will appreciate if you will share with me the sequence of DESeq code you have used to answer your question:
"What I would like to be able to do is identify genes that are differentially expressed ONLY between infected ESDM cells and infected BMDM cells"
thanks
Annalisa