How to judge whether DEG analysis of these samples can be done using limma::voom
Yang Shi ▴ 10
@ea61ff7a
Zheng Zhou

Dear Bio Communities,

DEG analysis was conducted with limma::voom on the combined expected count data from TCGA and GTEx. However, the MDS plot shows something that seems unreasonable to me. IMO, the heterogeneity of tumor tissue should be more obvious than that of normal tissue. Should I perform DEG analysis on these samples? How can I judge these samples from the MDS plot, and what are the criteria? Thanks in advance!

[Three images attached]

limma Clustering voom RNA-Seq
@gordon-smyth
WEHI, Melbourne, Australia

Combining TCGA and GTEx will naturally introduce a large batch effect into the data, which is probably what you are seeing. The first step that most people take is to mark points in the MDS plots by possible explanatory factors such as the source of the data. Then such factors can be included in the linear model to correct for batch effects. Having to correct for batch effects is more the rule than the exception in large-scale RNA-seq analyses.
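As a rough illustration of those two steps, here is a minimal sketch in R. It uses simulated counts, and the names `data_source` and `group` are hypothetical sample annotations, not anything from the original data. It assumes the edgeR and limma packages are installed, and that each data source contains both tumor and normal samples, which is what makes the batch term estimable.

```r
library(edgeR)  # loading edgeR also attaches limma

set.seed(1)
n_genes <- 200

# Hypothetical annotation: 12 samples, tumor and normal present in BOTH sources
data_source <- factor(rep(c("TCGA", "GTEx"), each = 6))
group <- factor(rep(c("tumor", "normal"), times = 6))

# Simulated counts standing in for the real expression matrix
counts <- matrix(rnbinom(n_genes * 12, mu = 100, size = 5), nrow = n_genes)
rownames(counts) <- paste0("gene", seq_len(n_genes))
colnames(counts) <- paste0("sample", seq_len(12))

dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)

# Step 1: label points by group and colour them by data source
plotMDS(dge, labels = as.character(group),
        col = c("blue", "red")[as.integer(data_source)])

# Step 2: include the data source in the linear model as a batch term
design <- model.matrix(~ data_source + group)
v <- voom(dge, design)
fit <- eBayes(lmFit(v, design))

# Tumor vs normal, adjusted for data source
top <- topTable(fit, coef = "grouptumor", number = 5)
print(top)
```

If the samples separate by colour rather than by label in the MDS plot, the data source is likely the dominant source of variation.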

Note also that you cannot make comparisons between tumor and normal if the tumors all come from TCGA and the normals all come from GTEx. In that case, the batch effect is completely confounded with the comparison you want to make. This has been discussed on Biostars and stackexchange:

Personally I would be reluctant to try to combine TCGA and GTEx. Both of those databases have lots of batch effects individually, even without trying to combine them.
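The confounding problem mentioned above can be checked directly from the design matrix before fitting anything. The sketch below uses base R only, with a made-up six-sample annotation (the names `data_source` and `group` are hypothetical). When every tumor is from TCGA and every normal is from GTEx, the batch column and the group column are identical, so the design matrix loses rank and the two effects cannot be separated.

```r
# Hypothetical annotation: every tumor from TCGA, every normal from GTEx
group <- factor(c("tumor", "tumor", "tumor", "normal", "normal", "normal"))
data_source <- factor(c("TCGA", "TCGA", "TCGA", "GTEx", "GTEx", "GTEx"))

# The cross-tabulation has empty cells: no TCGA normals, no GTEx tumors
print(table(data_source, group))

design_confounded <- model.matrix(~ data_source + group)
rank_confounded <- qr(design_confounded)$rank
rank_confounded          # 2, although the design has 3 columns

# For contrast: one normal sample now comes from TCGA as well
data_source_mixed <- factor(c("TCGA", "TCGA", "TCGA", "TCGA", "GTEx", "GTEx"))
design_mixed <- model.matrix(~ data_source_mixed + group)
rank_mixed <- qr(design_mixed)$rank
rank_mixed               # 3, so batch and group effects are both estimable
```

A rank lower than the number of columns means that no choice of normalization or batch-correction setting can recover the tumor versus normal effect from these samples alone.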


Thanks for your reply, sir! IMO, the batch effect cannot be completely removed by arithmetic methods alone, even though the batch-corrected datasets were used (RSEM expected_count, https://xenabrowser.net/datapages/?dataset=TcgaTargetGtex_gene_expected_count&host=https%3A%2F%2Ftoil.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443). The reason GTEx normal tissues were used as the counterpart to TCGA tumors is the limited number of normal samples in TCGA. So I wonder whether there is something I can do to make it more reasonable to use these combined datasets, such as setting the "normalize.method" parameter of the voom function? Thanks very much!
