Question: About experimental design of RNA-seq
2.6 years ago by xie186 wrote:

I have RNA-seq data from three time courses, each with 7 time points: A0-A6, B0-B6 and C0-C6. Group A is the control; groups B and C are the treatments. The experiments were carried out in different chambers. A0, B0 and C0 should have the same expression pattern because no treatment had been applied at that point. B1 and C1 should also have the same expression pattern because they received the same treatment at this time point.

I extracted the FPKM values for each sample and constructed a gene co-expression network. For some of the modules, A0, B0 and C0 have distinct expression patterns, and so do B1 and C1.

So I think there are probably confounding factors that I need to remove. The ones I can think of are: 1) A, B and C were in different chambers; 2) the RNA-seq libraries were constructed at different times or by different technicians.

How can I remove the confounding factors so that A0, B0 and C0 have similar expression patterns, and B1 and C1 have similar expression patterns? I'm going to use DESeq2 to identify differentially expressed genes (DEGs).

Can anyone give me some suggestions for the data analysis? Thanks.





modified 2.6 years ago by Michael Love • written 2.6 years ago by xie186


If you have a DESeq2-specific question, feel free to make a new post. I'm removing the DESeq2 tag from this post.

written 2.6 years ago by Michael Love

Did you send these samples for sequencing in one batch? If so, it seems unlikely that they would be prepared at different times.

How 'distinct' are the expression patterns? Even a simple R^2 value might help quantify it.
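As a rough illustration of that kind of check, here is a minimal sketch that computes a squared Pearson correlation between the log-transformed FPKM profiles of two samples. The FPKM numbers, the pseudocount of 1, and the helper names `pearson_r2` and `log_fpkm` are made up for the example; they are not from the original data.

```python
import math


def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant vector: correlation is undefined")
    return (sxy * sxy) / (sxx * syy)


def log_fpkm(values, pseudocount=1.0):
    """log2(FPKM + pseudocount), so a few highly expressed genes do not dominate."""
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical FPKM values for five genes in the three time-zero samples.
fpkm = {
    "A0": [0.0, 12.5, 3.1, 250.0, 40.2],
    "B0": [0.2, 11.0, 2.7, 300.5, 38.9],
    "C0": [5.0, 1.0, 30.0, 20.0, 45.0],
}
logged = {name: log_fpkm(values) for name, values in fpkm.items()}

for a, b in [("A0", "B0"), ("A0", "C0"), ("B0", "C0")]:
    print(f"{a} vs {b}: R^2 = {pearson_r2(logged[a], logged[b]):.3f}")
```

With real data the vectors would hold every expressed gene, and comparing the R^2 between A0, B0 and C0 against the R^2 between later time points would give a sense of how large the baseline differences really are.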

It sounds to me like the variation between A0, B0 and C0 that you're observing is due to biological variation. If that's the case, you need to use a method that takes this variation into account when looking for DE genes, rather than trying to force the zero time points to look the same.

DESeq2 will do this fine, as long as you construct the design matrix properly. edgeR also supports this; look at the user guide, because it has a time series example very similar to your own.
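To make "construct the design matrix properly" more concrete, here is a minimal sketch of the columns that a model with group, time and group-by-time interaction terms expands to, using treatment (dummy) coding with A and time 0 as reference levels. This is plain Python for illustration only, not the DESeq2 or edgeR API; the replicate count of 2 and the function names are assumptions, since the thread does not say how many replicates exist. Note also that, as described in the question, chamber coincides with group, so a chamber effect could not be estimated separately from the group effect in this layout.

```python
from itertools import product

GROUPS = ["A", "B", "C"]   # A is the control (reference level)
TIMES = list(range(7))     # time points 0..6, 0 is the reference level
REPLICATES = 2             # hypothetical: not stated in the thread


def design_columns():
    """Column names for intercept + group + time + group:time."""
    cols = ["Intercept"]
    cols += [f"group{g}" for g in GROUPS[1:]]
    cols += [f"time{t}" for t in TIMES[1:]]
    cols += [f"group{g}:time{t}" for g in GROUPS[1:] for t in TIMES[1:]]
    return cols


def design_row(group, time):
    """One row of 0/1 indicators for a sample from `group` at `time`."""
    row = {col: 0 for col in design_columns()}
    row["Intercept"] = 1
    if group != GROUPS[0]:
        row[f"group{group}"] = 1
    if time != TIMES[0]:
        row[f"time{time}"] = 1
    if group != GROUPS[0] and time != TIMES[0]:
        row[f"group{group}:time{time}"] = 1
    return [row[col] for col in design_columns()]


# One entry per sequenced sample: (group, time, replicate number).
samples = list(product(GROUPS, TIMES, range(1, REPLICATES + 1)))
matrix = [design_row(g, t) for g, t, _ in samples]

print(len(matrix), "samples x", len(design_columns()), "coefficients")
```

In this coding the `groupB` and `groupC` columns absorb any baseline difference between A0, B0 and C0, while the interaction columns capture how each treatment's change from its own time 0 differs from the control's change. The model has 21 coefficients, so with only one sample per condition (21 samples) it would be saturated and leave no residual degrees of freedom.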

written 2.6 years ago by gabriel.rosser

Be cautious about building the networks: it is not recommended to use WGCNA with fewer than 20 samples, and 15 would be the minimum. At 15 samples for each of your 21 conditions (3 groups × 7 time points), that would mean you need at least 315 sequenced samples.

written 2.6 years ago by Lluís Revilla Sancho