Question: Paired samples in cell lines using DESeq2
13 days ago, Puks wrote:

Hi, I would like to use DESeq2 to process three pairs of bulk RNA-seq samples, but I am trying to figure out which model is valid here. I used tximport to import kallisto's transcript-level abundance estimates, summarized to gene level, for use with DESeq2.

In the paired samples, the treatment is overexpression of gene A. Sample information is as follows:

                    condition patient_id
           BT12CONT   Control        BT1
           BT12OE     OverExp        BT1
           BT53CONT   Control       BT53
           BT53OE     OverExp       BT53
           GBM5CONT   Control       GBM5
           GBM5OE     OverExp       GBM5

I am interested in looking at the condition effect while accounting for sample pairs so I thought a model like the following would be enough:

>   ~ condition + patient_id 

The PCA for these samples shows that the samples separate by patient_id.

Is this simple model enough to look at the condition/treatment effect?

Thanks! Puks


Your samples notably cluster by cell line, not by treatment, so it seems unfortunate to use them as biological replicates. From a biological standpoint this is quite normal for cell lines. During cell line establishment many things change inside the cells: particular clones start growing out, and the cells may acquire all kinds of alterations that help them grow. It is therefore not unexpected to see large differences between cell lines (or even between different clones of the same cell line). I do not think this setup is a good choice for getting the information you want. You should probably have used a single cell line and performed the overexpression study with that line in a replicated manner. This would give you the power to detect significant changes within the cell line. Comparing these results with the same experiment in the other two cell lines, also done in a replicated fashion, would then tell you how reproducible the findings are from a biological standpoint.

modified 13 days ago • written 13 days ago by ATpoint

Thanks ATpoint! You are correct, there should have been replicates for each cell line but unfortunately the person who performed the experiment did not do it.

written 12 days ago by Puks

I have to disagree with ATpoint here. It is actually a good design to use cell lines derived from multiple patients. This ensures that the list of differentially expressed genes that OP will find is not specific to one (arbitrarily chosen) patient but has some generality, and hence is likely to overlap well with the list one would find if one tried again with different patients.

The fact that the difference between patients is larger than that between treatment and control indicates that the treatment has only a small effect: either a small effect on many genes, or a large one on only a few genes. If the latter is the case, including "patient_id" in the model will allow you to find these genes (because DESeq2 will look at the differences between treatment and control within each sample pair).
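To illustrate the point about within-pair differences, here is a small base R sketch. It is only an analogy: it uses an ordinary linear model on made-up log-scale values for one hypothetical gene, whereas DESeq2 fits a negative binomial GLM to counts. The numbers and the column name `log_expr` are assumptions for illustration, not data from this thread.

```r
# Made-up log-scale expression for one hypothetical gene:
# large differences between patients, small consistent treatment shift.
samples <- data.frame(
  patient_id = factor(rep(c("BT1", "BT53", "GBM5"), each = 2)),
  condition  = factor(rep(c("Control", "OverExp"), times = 3),
                      levels = c("Control", "OverExp"))
)
samples$log_expr <- c(5.0, 5.6, 9.0, 9.5, 2.0, 2.7)

# Paired model: patient_id absorbs each patient's baseline.
paired_fit    <- lm(log_expr ~ patient_id + condition, data = samples)
paired_effect <- unname(coef(paired_fit)["conditionOverExp"])

# Mean of the within-pair differences (OverExp minus Control).
ctrl <- samples$log_expr[samples$condition == "Control"]
oe   <- samples$log_expr[samples$condition == "OverExp"]
mean_pair_diff <- mean(oe - ctrl)

# Unpaired model for comparison: patient differences end up in the noise.
unpaired_fit <- lm(log_expr ~ condition, data = samples)
paired_se   <- summary(paired_fit)$coefficients["conditionOverExp", "Std. Error"]
unpaired_se <- summary(unpaired_fit)$coefficients["conditionOverExp", "Std. Error"]

print(c(paired_effect = paired_effect, mean_pair_diff = mean_pair_diff))
print(c(paired_se = paired_se, unpaired_se = unpaired_se))
```

In this balanced toy example the condition coefficient of the paired model equals the mean within-pair difference, and its standard error is much smaller than in the model that ignores patient_id.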

If, however, the treatment changes genes only slightly, the experiment is underpowered with just three patients and will return nothing. By contrast, performing it with many replicates from the same patient will produce many hits, which may not be very useful.

written 11 days ago by Simon Anders
Answer: Paired samples in cell lines using DESeq2
12 days ago, Michael Love (United States) wrote:

Yes, that is the correct model: ~ patient_id + condition (in general it's good to put condition last; see the vignette for details).
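A minimal sketch of how this design could be set up, assuming DESeq2 is installed. Simulated counts from `makeExampleDESeqDataSet()` stand in for the real data here, so the results themselves are meaningless; only the sample layout mirrors the table in the question. With real tximport output, the same design formula would instead be passed to `DESeqDataSetFromTximport()` together with the sample table.

```r
library(DESeq2)

set.seed(1)
# Assumption: simulated counts (1000 genes, 6 samples) replace the real data.
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Sample annotation mirroring the question: three patients, two conditions.
dds$patient_id <- factor(rep(c("BT1", "BT53", "GBM5"), times = 2))
dds$condition  <- factor(rep(c("Control", "OverExp"), each = 3),
                         levels = c("Control", "OverExp"))  # Control = reference

# Paired design: patient first, variable of interest last.
design(dds) <- ~ patient_id + condition

dds <- DESeq(dds)

# With no further arguments, results() reports the last design variable,
# here OverExp vs Control, adjusted for patient.
res <- results(dds)
print(resultsNames(dds))
```

Putting condition last means the default call to `results()` already returns the comparison of interest; with a different ordering, the comparison would need to be requested explicitly via the `name` or `contrast` argument.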


Thanks Michael! I will change the order.

written 12 days ago by Puks