Question: DESeq2 for multiple tissues under 2 conditions
17 days ago, bioinf wrote:

Hi,

I have paired RNA-seq data from multiple samples, counted with featureCounts. I am now planning to use DESeq2 and am trying to set up the design. I have gone through the thread "DESEq2 comparison with mulitple cell types under 2 conditions"; however, I would like to confirm whether my design is correct.

Here is the sample coldata:

    tissue condition
sample1_WA1 WA1 Wild
sample2_WA2 WA2 Wild
sample3_WA3 WA3 Wild
sample4_WB1 WB1 Wild
sample5_WB2 WB2 Wild
sample6_WB3 WB3 Wild
sample7_WC1 WC1 Wild
sample8_WC2 WC2 Wild
sample9_WC3 WC3 Wild
sample10_MA1 MA1 Mutant
sample11_MA2 MA2 Mutant
sample12_MA3 MA3 Mutant
sample13_MB1 MB1 Mutant
sample14_MB2 MB2 Mutant
sample15_MB3 MB3 Mutant
sample16_MC1 MC1 Mutant
sample17_MC2 MC2 Mutant
sample18_MC3 MC3 Mutant
sample19_WE1 WE1 Wild
sample20_WE2 WE2 Wild
sample21_WE3 WE3 Wild
sample22_WD1 WD1 Wild
sample23_WD2 WD2 Wild
sample24_WD3 WD3 Wild

where A, B, C, D and E are five tissue types; D and E are from the wild condition only. 1, 2 and 3 are biological replicates.

I want to perform:

(i) comparison of differentially expressed genes between all tissue types in wild

(ii) comparison of differentially expressed genes between all tissue types in mutant

(iii) comparison of differentially expressed genes for all tissue types between wild and mutant

(iv) comparison of differentially expressed genes between D and E

How should I set up the design with replicates? Is this correct:

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ tissue) 
dds <- DESeq(dds)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
Warning message:
In checkForExperimentalReplicates(object, modelMatrix) :
  same number of samples and coefficients to fit,
  estimating dispersion by treating samples as replicates.
  read the ?DESeq section on 'Experiments without replicates'
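
The warning can be reproduced without any counts. In the coldata above, every sample carries its own `tissue` label (WA1, WA2, ...), so `~ tissue` asks for one coefficient per sample and leaves no replicates. A minimal base-R sketch, assuming the 24 labels exactly as listed in the coldata above:

```r
# Labels as they appear in the 'tissue' column of the coldata above.
labels <- c(paste0("WA", 1:3), paste0("WB", 1:3), paste0("WC", 1:3),
            paste0("MA", 1:3), paste0("MB", 1:3), paste0("MC", 1:3),
            paste0("WE", 1:3), paste0("WD", 1:3))
coldata <- data.frame(tissue = factor(labels))

# Design matrix implied by ~ tissue: one column per unique label.
mm <- model.matrix(~ tissue, data = coldata)
n_samples <- nrow(mm)
n_coef <- ncol(mm)

# Both numbers are equal, which is what the warning reports.
print(c(samples = n_samples, coefficients = n_coef))
```

Once the replicate number is split off from the tissue label, the same check should show far fewer coefficients than samples.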

Michael Love, please advise.

Thanks!


I'm guessing that, e.g., WA1 is replicate one of WA. To get your analysis started, you'll at least need to separate that out, using e.g. tidyr::extract(coldata, tissue, c("Tissue","Replicate"), "(..)([123])"). Then you'll probably need to clarify what exactly you mean by "between all tissue types": do you want a gene list for each pair of tissues, a single list of genes for which at least one tissue differs from the others, or comparisons against a common 'baseline' tissue? (iii) is even more ambiguous: I think you'll only be able to use the common tissues, but do you want three lists of tissue-specific mutant vs wt, or genes that have a consistent mutant vs wt effect size across all tissues, or ...

You should also make your replication structure clear.  Is there a 'batch' effect in that WA1 is closely related to WB1 (e.g. from the same individual - possible), and MB1 (unlikely from same individual, but maybe from 'batch 1').  Currently there's not really enough detail in your question to allow us to answer it.
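
As a rough illustration of the separation suggested above, here is a base-R sketch that splits each label into condition, tissue and replicate. It assumes every label has the form condition letter + tissue letter + replicate digit (as in the posted coldata, where "W" is wild and "M" is mutant); the column names `condition`, `tissue` and `replicate` are my own choice.

```r
# Labels in the same order as the posted coldata.
labels <- c(paste0("WA", 1:3), paste0("WB", 1:3), paste0("WC", 1:3),
            paste0("MA", 1:3), paste0("MB", 1:3), paste0("MC", 1:3),
            paste0("WE", 1:3), paste0("WD", 1:3))

# Assumption: character 1 = condition, character 2 = tissue, character 3 = replicate.
coldata <- data.frame(
  row.names = paste0("sample", seq_along(labels), "_", labels),
  condition = factor(ifelse(substr(labels, 1, 1) == "W", "wild", "mutant"),
                     levels = c("wild", "mutant")),  # wild is the reference level
  tissue    = factor(substr(labels, 2, 2)),
  replicate = factor(substr(labels, 3, 3))
)

# Cross-tabulation makes the layout explicit: D and E exist in wild only.
layout_tab <- table(coldata$tissue, coldata$condition)
print(layout_tab)
```

The cross-tabulation is a quick way to confirm which tissue-by-condition combinations actually have replicates before choosing a design.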

written 17 days ago by Gavin Kelly

@Gavin Kelly Yes, you are right: WA1 is replicate 1 of WA.

A,B,C,D,E are five different tissues, while 1, 2 and 3 indicate biological replicates.

Would you please elaborate on how I can rearrange coldata to make it more suitable for DESeqDataSetFromMatrix? This is the first time I am using DESeq2, and I am confused by the design step.

  • More details:

(i) First of all, I want a gene list and a clustered heatmap across all 24 samples at padj < 0.05, with gene names shown in the heatmap

(ii) Gene list and clustered heatmap (padj < 0.05) between all tissue types in wild (including the replicate information in the heatmap):   WA (1,2,3) vs. WB (1,2,3) vs. WC (1,2,3)

(iii) Gene list and clustered heatmap (padj < 0.05) between all tissue types in mutant (including the replicate information in the heatmap):   NA (1,2,3) vs. NB (1,2,3) vs. NC (1,2,3)

(iv) Gene list and clustered heatmap (padj < 0.05) between all tissue types in wild and mutant (including the replicate information in the heatmap):   WA (1,2,3) vs. WB (1,2,3) vs. WC (1,2,3) vs. NA (1,2,3) vs. NB (1,2,3) vs. NC (1,2,3)

(v) Gene list and clustered heatmap (padj < 0.05) between D and E tissue from wild (including the replicate information in the heatmap):   WD (1,2,3) vs. WE (1,2,3) 

  • WA, WB, WC (and their replicates) belong to pool 1, NA, NB, NC (and their replicates) belong to pool 2  and WD, WE (and their replicates) belong to pool 3. Each pool consists of 10 individuals.
  • Samples run:
Lanes Samples
L1 WA1 WA2 WA3 WB1 WB2 WB3 WC1 WC2 WC3
L2 NA1 NA2 NA3 NB1 NB2 NB3 NC1 NC2 NC3
L3 WD1 WD2 WD3 WE1 WE2 WE3 
written 17 days ago by bioinf
16 days ago, Michael Love (United States) wrote:

First you need to clean up your colData so that you have:

  • condition: "wild" or "mutant" (make wild the reference level of condition, see vignette for instructions)
  • tissue: "A", "B", "C", "D", or "E"

I can't tell exactly what you are looking for in (iii); maybe you can restate it. For the others, you can use a design of ~tissue + condition, and then simply use results() with an appropriate 'contrast', e.g.:

contrast=c("condition","mutant","wild")

or 

contrast=c("tissue","E","D")
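
Put together, the suggestion above might look like the sketch below. The count matrix is simulated purely as a stand-in for the featureCounts output (500 made-up genes), and the object names (`cts`, `res_condition`, `res_E_vs_D`) are illustrative. Note that `~ tissue + condition` assumes the mutant effect is the same in every tissue.

```r
library(DESeq2)

set.seed(1)
labels <- c(paste0("WA", 1:3), paste0("WB", 1:3), paste0("WC", 1:3),
            paste0("MA", 1:3), paste0("MB", 1:3), paste0("MC", 1:3),
            paste0("WE", 1:3), paste0("WD", 1:3))
coldata <- data.frame(
  row.names = labels,
  condition = factor(ifelse(substr(labels, 1, 1) == "W", "wild", "mutant"),
                     levels = c("wild", "mutant")),
  tissue    = factor(substr(labels, 2, 2))
)

# Simulated counts only: a placeholder for the real featureCounts matrix.
n_genes <- 500
mu <- 2^runif(n_genes, 4, 12)
cts <- matrix(rnbinom(n_genes * 24, mu = rep(mu, 24),
                      size = rep(1 / (0.1 + 4 / mu), 24)),
              nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)), labels))

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ tissue + condition)
dds <- DESeq(dds)

# Mutant vs wild, adjusted for tissue.
res_condition <- results(dds, contrast = c("condition", "mutant", "wild"))
# Tissue E vs tissue D (both only present in wild).
res_E_vs_D <- results(dds, contrast = c("tissue", "E", "D"))

# Gene list at padj < 0.05 (likely empty here, as the simulation has no true effects).
sig_condition <- subset(res_condition, !is.na(padj) & padj < 0.05)
```

With real data, `cts` would be the count matrix with columns in the same order as the rows of `coldata`.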

So I can use:

 

dds$condition <- factor(dds$condition, levels = c("wild","mutant"))

How can I also keep the replicate numbers during the analysis and in the heatmap, so that I can see whether there is any variation between replicates?

In (iii) I want to compare between different tissue types from mutant.

written 16 days ago by bioinf

I don't follow; what's the problem? The replicate numbers contribute nothing to the differential expression, unless you are saying that all the replicate 1's have something to do with each other. A heatmap will allow you to see the variation among replicates. We have example code for heatmaps in our documentation.
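
To show one way replicates can stay visible in a heatmap, here is a hedged sketch that labels each column with condition, tissue and replicate. It is not the documentation's own example: it uses base R `heatmap()` to avoid extra dependencies, simulated counts in place of real data, and an arbitrary choice of the 30 most variable genes.

```r
library(DESeq2)

set.seed(1)
labels <- c(paste0("WA", 1:3), paste0("WB", 1:3), paste0("WC", 1:3),
            paste0("MA", 1:3), paste0("MB", 1:3), paste0("MC", 1:3),
            paste0("WE", 1:3), paste0("WD", 1:3))
coldata <- data.frame(
  row.names = labels,
  condition = factor(ifelse(substr(labels, 1, 1) == "W", "wild", "mutant"),
                     levels = c("wild", "mutant")),
  tissue    = factor(substr(labels, 2, 2)),
  replicate = factor(substr(labels, 3, 3))  # kept for labelling only, not in the design
)

# Simulated counts only: a placeholder for the real featureCounts matrix.
n_genes <- 500
mu <- 2^runif(n_genes, 4, 12)
cts <- matrix(rnbinom(n_genes * 24, mu = rep(mu, 24),
                      size = rep(1 / (0.1 + 4 / mu), 24)),
              nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)), labels))

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ tissue + condition)

# Variance-stabilized values are better suited to clustering than raw counts.
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
mat <- assay(vsd)

# Keep the 30 most variable genes and centre each gene on its mean.
top <- head(order(apply(mat, 1, var), decreasing = TRUE), 30)
mat_top <- mat[top, ] - rowMeans(mat[top, ])

# Column labels carry the replicate number, so replicates can be compared by eye.
heatmap(mat_top, scale = "none",
        labCol = paste(coldata$condition, coldata$tissue, coldata$replicate, sep = "_"))
```

Replicates of the same tissue and condition that cluster apart from each other would be the thing to look for.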

"between different tissue types from mutant" : the model I'm suggesting above has a tissue effect which is independent of wild or mutant status. If you want to compare all possible combinations of tissue x condition, you should follow the advice in this first code chunk:

http://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions
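
Adapted to this experiment, the approach in that section (combining the factors of interest into a single grouping factor) might look like the sketch below. Counts are simulated placeholders, and the level names such as "mutantA" simply follow from pasting `condition` and `tissue` together; they are my naming, not anything fixed by DESeq2.

```r
library(DESeq2)

set.seed(1)
labels <- c(paste0("WA", 1:3), paste0("WB", 1:3), paste0("WC", 1:3),
            paste0("MA", 1:3), paste0("MB", 1:3), paste0("MC", 1:3),
            paste0("WE", 1:3), paste0("WD", 1:3))
condition <- ifelse(substr(labels, 1, 1) == "W", "wild", "mutant")
tissue <- substr(labels, 2, 2)
coldata <- data.frame(
  row.names = labels,
  condition = factor(condition, levels = c("wild", "mutant")),
  tissue    = factor(tissue),
  group     = factor(paste0(condition, tissue))  # e.g. "wildA", "mutantB"
)

# Simulated counts only: a placeholder for the real featureCounts matrix.
n_genes <- 500
mu <- 2^runif(n_genes, 4, 12)
cts <- matrix(rnbinom(n_genes * 24, mu = rep(mu, 24),
                      size = rep(1 / (0.1 + 4 / mu), 24)),
              nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)), labels))

# One coefficient per tissue-by-condition group (8 groups, 3 replicates each).
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ group)
dds <- DESeq(dds)

# Tissue B vs tissue A within mutant.
res_mutB_vs_mutA <- results(dds, contrast = c("group", "mutantB", "mutantA"))
# Mutant vs wild within tissue A.
res_A_mut_vs_wild <- results(dds, contrast = c("group", "mutantA", "wildA"))
# Tissue E vs tissue D within wild.
res_wildE_vs_wildD <- results(dds, contrast = c("group", "wildE", "wildD"))
```

Any pair of groups can be compared this way by changing the last two entries of `contrast`.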

 

 

written 16 days ago by Michael Love