Question

RNA-Seq normalization for co-expression analysis

0

Lin ▴ 50

@lin-19103

Last seen 5.1 years ago

Hi all,

This is my first time working with RNA-seq data, and I need to run both a differential expression analysis and a co-expression network analysis on it. I read in the pipeline [RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR] that, for differential expression analysis, the counts are first filtered (by CPM) and then normalized with TMM. For the co-expression analysis, however, I would like to use normalized data but a different filtering method. So my question is: can I also apply TMM normalization to the unfiltered data and then filter the normalized data afterwards? Or do you see any problem with this, or have other suggestions?

Thanks in advance!

edger WGCNA limma • 2.7k views

ADD COMMENT • link updated 6.3 years ago by Gordon Smyth 53k • written 6.3 years ago by Lin ▴ 50

0

You should take a look at point 4 in the WGCNA FAQ: https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/faq.html

ADD REPLY • link 6.3 years ago Kevin Blighe ★ 4.0k

0

Hi Kevin, thanks for your answer! But my question was rather whether there is a problem with applying TMM normalization FIRST (to unfiltered data) and filtering the data afterwards (because I want to use another pipeline in which the whole filtering step is already implemented).

ADD REPLY • link 6.3 years ago Lin ▴ 50

1

I see. Gordon has already answered. Based on your logic, you have CPM counts, and then you apply TMM to those?

ADD REPLY • link 6.3 years ago Kevin Blighe ★ 4.0k

0

Yes, exactly, that is what I had in mind... I need normalized data, but would like to use a different filtering procedure during the co-expression analysis. If I applied the CPM filter beforehand as well, I would lose too many transcripts (and would filter twice).

ADD REPLY • link 6.3 years ago Lin ▴ 50

0

Typically, we filter the raw counts, then normalise, and then make statistical inferences on the normalised counts. After that, we may apply a further transformation on the normalised counts for the purposes of conducting downstream analyses.
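
As a rough illustration of the last two steps (normalise, then transform), here is a minimal sketch in plain Python. It assumes the normalisation factors have already been computed elsewhere (for example TMM factors from edgeR) and only shows how they would enter a log2-CPM transformation through the effective library size. The function name `log_cpm`, the pseudo-count of 1 and the toy numbers are hypothetical and not taken from any package.

```python
import math

def log_cpm(counts, norm_factors, pseudo=1.0):
    """Return log2(CPM + pseudo) for every gene and sample.

    counts: dict mapping gene -> list of raw counts, one per sample.
    norm_factors: one scaling factor per sample, computed elsewhere.
    The effective library size is library size * normalisation factor.
    Assumes every sample has a non-zero library size.
    """
    n_samples = len(norm_factors)
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    eff_sizes = [lib * f for lib, f in zip(lib_sizes, norm_factors)]
    return {
        gene: [math.log2(c / eff * 1e6 + pseudo) for c, eff in zip(row, eff_sizes)]
        for gene, row in counts.items()
    }

# Hypothetical toy data: 3 genes, 2 samples, second sample sequenced twice as deep.
toy_counts = {"geneA": [500, 1000], "geneB": [300, 600], "geneC": [200, 400]}
toy_factors = [1.0, 1.0]  # made-up values; real factors would come from TMM
toy_logcpm = log_cpm(toy_counts, toy_factors)
```

With these toy values the two samples differ only in sequencing depth, so each gene gets the same log2-CPM value in both samples.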

ADD REPLY • link 6.3 years ago Kevin Blighe ★ 4.0k

score 1 · Answer 1 · 2019-07-14

1

Gordon Smyth 53k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

Yes, you can apply TMM to the unfiltered counts, although it is not quite as robust as applying it to the filtered counts.

But filtering out genes that are not expressed to a meaningful degree in any sample would still be sensible as a first step, as this is needed by both co-expression analyses and edgeR.
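
A minimal sketch of such a first-step expression filter, written in plain Python for illustration only. The rule used here (keep a gene if its CPM reaches a threshold in at least a given number of samples) and the default values `min_cpm=1.0` and `min_samples=2` are assumptions chosen for the example; in edgeR, `filterByExpr` is the usual tool and applies its own rule. The function name `filter_by_cpm` and the toy counts are hypothetical.

```python
def filter_by_cpm(counts, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM is >= min_cpm in at least min_samples samples.

    counts: dict mapping gene -> list of raw counts, one per sample.
    Library sizes are taken from the unfiltered table and are assumed
    to be non-zero.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    kept = {}
    for gene, row in counts.items():
        cpms = [c / lib * 1e6 for c, lib in zip(row, lib_sizes)]
        if sum(v >= min_cpm for v in cpms) >= min_samples:
            kept[gene] = row
    return kept

# Hypothetical toy data: 3 samples with 2,000,000 reads each.
toy_counts = {
    "geneA": [1_000_000, 1_200_000, 900_000],
    "geneB": [999_999, 799_999, 1_100_000],
    "geneC": [1, 1, 0],  # 0.5 CPM at most, so it fails the default rule
}
kept = filter_by_cpm(toy_counts)
```

Only genes that are essentially unexpressed in every sample are dropped here, so a stricter or different filter can still be applied later in the co-expression pipeline.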

ADD COMMENT • link 6.3 years ago Gordon Smyth 53k