Question

How to judge whether DEG analysis of these samples can be done using limma::voom


Yang Shi ▴ 10

@ea61ff7a

Last seen 13 months ago

Zheng Zhou

Dear Bio Communities,

I ran a DEG analysis with limma::voom on the combined expected-count data from TCGA and GTEx. However, the MDS plot shows something that seems unreasonable to me: in my opinion, tumor samples should be more heterogeneous than normal tissue. Should I perform DEG analysis on these samples? How can I judge the samples from the MDS plot, and what are the criteria? Thanks in advance!

[Image: MDS plot]

limma Clustering voom RNA-Seq • 2.1k views

3.5 years ago • Yang Shi ▴ 10

score 2 · Accepted Answer · 2022-07-24

Combining TCGA and GTEx will naturally introduce a large batch effect into the data, which is probably what you are seeing. The first step that most people take is to mark points in the MDS plots by possible explanatory factors such as the source of the data. Then such factors can be included in the linear model to correct for batch effects. Having to correct for batch effects is more the rule than the exception in large-scale RNA-seq analyses.
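To illustrate why including the data source in the linear model matters, here is a minimal sketch in plain Python. It is not limma code, and the sample layout, effect sizes and the helper `solve_least_squares` are all made up for illustration. One noise-free "gene" is simulated with a true tumor effect of 2 and a source (batch) effect of 3, in a layout where tumor samples are over-represented in one source. A model without the source term then overestimates the tumor effect, while a model with it recovers the true value.

```python
from fractions import Fraction


def solve_least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Uses exact rational arithmetic and assumes X has full column rank.
    """
    n_coef = len(X[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [Fraction(sum(row[i] * row[j] for row in X)) for j in range(n_coef)]
        + [Fraction(sum(row[i] * yi for row, yi in zip(X, y)))]
        for i in range(n_coef)
    ]
    # Gauss-Jordan elimination.
    for col in range(n_coef):
        pivot = next(r for r in range(col, n_coef) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n_coef):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Hypothetical layout: 0/1 codes for condition (normal/tumor) and source (A/B).
# Source A has 3 normals and 1 tumor; source B has 1 normal and 3 tumors.
condition = [0, 0, 0, 1, 0, 1, 1, 1]
source = [0, 0, 0, 0, 1, 1, 1, 1]

# Noise-free expression: baseline 5, tumor effect 2, source effect 3.
y = [5 + 2 * c + 3 * s for c, s in zip(condition, source)]

# Model without the source term: intercept + condition.
naive = solve_least_squares([[1, c] for c in condition], y)

# Model with the source term: intercept + condition + source.
adjusted = solve_least_squares([[1, c, s] for c, s in zip(condition, source)], y)

print("tumor effect, source ignored: ", naive[1])     # 7/2
print("tumor effect, source included:", adjusted[1])  # 2
```

The adjusted model can only separate the two effects because both sources contain both conditions in this made-up layout.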

Note also that you cannot make comparisons between tumor and normal if the tumors all come from TCGA and the normals all come from GTEx. In that case, the batch effect is completely confounded with the comparison you want to make. This has been discussed on Biostars and Stack Exchange.
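The confounding can be made concrete by looking at the rank of the design matrix. The following is a sketch in plain Python rather than limma code, with hypothetical six-sample layouts and a helper `column_rank` written only for this illustration. When every tumor is from one source and every normal from the other, the condition and source columns carry the same information, so the design has fewer independent columns than coefficients and the tumor effect cannot be estimated separately from the source effect.

```python
from fractions import Fraction


def column_rank(matrix):
    """Rank of a matrix by Gaussian elimination with exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue  # this column depends on the earlier ones
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def design(condition, source):
    """Design matrix with columns: intercept, condition, source."""
    return [[1, c, s] for c, s in zip(condition, source)]


# Coding (assumed for this sketch): condition 1 = tumor, 0 = normal;
# source 0 = TCGA, 1 = GTEx.

# Fully confounded: all tumors from TCGA, all normals from GTEx.
confounded = design([1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1])

# Hypothetical alternative: one normal sample also comes from TCGA.
crossed = design([1, 1, 1, 0, 0, 0], [0, 0, 0, 0, 1, 1])

print(column_rank(confounded))  # 2: fewer than the 3 coefficients
print(column_rank(crossed))     # 3: all coefficients estimable
```

A full-rank design only shows that the effects are separable in principle; with very few normals in one source, the estimates would still rest on little data.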

Personally I would be reluctant to try to combine TCGA and GTEx. Both of those databases have lots of batch effects individually, even without trying to combine them.