BCV increases with an increasing counts per million in RNAseq (edgeR)
gowtham ▴ 210
@gowtham-5301
Last seen 10.2 years ago
Hi everyone,

I am analysing my current RNA-seq data set (two groups, each with two replicates) using classic edgeR. I see a couple of strange results that I am trying to make sense of, and I would really appreciate any help from the list.

1) After filtering out tags with low read counts (minimum of 1 CPM in each of the 4 samples: dge[rowSums((cpm.dge > 1)) >=4, ]) and normalizing (calcNormFactors), I create the BCV plot (attached: norm_filt_bcv.png). I see the BCV going up along with CPM. But when I don't filter and don't normalize, I see a traditional BCV plot (attached: nonorm_nofilt_bcv.png). Any idea why this is the case? It is especially puzzling because the normalization factors are close to 1 (0.9747020, 0.9756064, 0.9769463, 1.0764226), and requiring a minimum of 1 CPM in all samples removed only 800 genes out of 8000.

2) Most of the genes seem to have a dispersion lower than the common dispersion. Aren't they supposed to be distributed on either side of it (which is the case for the unfiltered, non-normalized data)?

3) Similarly, I see different MDS plots for the filtered (and normalized) and the unfiltered (non-normalized) data sets (attached). I am wondering what is going on.

Any suggestions or comments will be very helpful.

Thanks a lot in advance,
Gowthaman

PS: The calculated common dispersion is rather high: Disp = 0.14757, BCV = 0.3841.

--
Gowthaman
Bioinformatics Systems Programmer.
SBRI, 307 West lake Ave N Suite 500
Seattle, WA. 98109-5219
Phone : LAB 206-256-7188 (direct).

Attachments (scrubbed by the mailing list archive):
nonorm_nofilt_bcv.png (image/png, 57366 bytes): https://stat.ethz.ch/pipermail/bioconductor/attachments/20120615/90ee9a92/attachment.png
norm_filt_bcv.png (image/png, 54540 bytes): https://stat.ethz.ch/pipermail/bioconductor/attachments/20120615/90ee9a92/attachment-0001.png
nonorm_nofilt_mds.png (image/png, 12447 bytes): https://stat.ethz.ch/pipermail/bioconductor/attachments/20120615/90ee9a92/attachment-0002.png
norm_filt_mds.png (image/png, 11921 bytes): https://stat.ethz.ch/pipermail/bioconductor/attachments/20120615/90ee9a92/attachment-0003.png
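For readers who want to reproduce the comparison described in the question, the following is a minimal sketch of the classic edgeR workflow on simulated counts. It is not the poster's actual script: the simulated data, the object names (counts, dge_raw, dge_filt), the group labels, and the simulation dispersion of 0.15 are all assumptions made for illustration. Only the filter rule and the calcNormFactors step come from the question. It assumes the edgeR package is installed.

```r
# Illustrative sketch only: simulated data, hypothetical object names.
library(edgeR)

set.seed(1)
n_genes <- 8000

# Hypothetical counts: 4 libraries, negative binomial with dispersion 0.15.
# rep(..., 4) gives every gene the same mean in all four columns.
mu <- rep(2^runif(n_genes, 0, 10), 4)
counts <- matrix(rnbinom(n_genes * 4, mu = mu, size = 1 / 0.15), ncol = 4)
group <- factor(c("A", "A", "B", "B"))

# Unfiltered, non-normalized analysis (norm factors stay at 1)
dge_raw <- DGEList(counts = counts, group = group)
dge_raw <- estimateCommonDisp(dge_raw)
dge_raw <- estimateTagwiseDisp(dge_raw)

# Filtered and normalized analysis, using the rule from the question:
# keep tags with more than 1 CPM in all 4 samples
keep <- rowSums(cpm(dge_raw) > 1) >= 4
dge_filt <- DGEList(counts = counts[keep, ], group = group)
dge_filt <- calcNormFactors(dge_filt)
dge_filt <- estimateCommonDisp(dge_filt)
dge_filt <- estimateTagwiseDisp(dge_filt)

# Common BCV is the square root of the common dispersion
sqrt(dge_raw$common.dispersion)
sqrt(dge_filt$common.dispersion)

# Side-by-side BCV plots, then the two MDS plots
par(mfrow = c(1, 2))
plotBCV(dge_raw, main = "No filter, no normalization")
plotBCV(dge_filt, main = "Filtered and normalized")
plotMDS(dge_raw)
plotMDS(dge_filt)
```

Running both branches on the same count matrix makes it easier to see whether a change in the BCV trend comes from the filter, from the normalization factors, or from neither. The simulated data here are not expected to reproduce the pattern in the attached plots.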
RNASeq Normalization edgeR • 1.2k views
gowtham ▴ 210
@gowtham-5301
Last seen 10.2 years ago
Hi All, I am resending this email with fewer figures (in .gif format) because my previous message was held for moderation due to its size, and I could not cancel that message either (the link was broken). Sorry about this.
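As a quick consistency check on the two numbers quoted in the question's postscript: in edgeR the biological coefficient of variation is the square root of the negative binomial dispersion. The symbol \phi for the common dispersion is notation introduced here, not taken from the post.

```latex
\mathrm{BCV} = \sqrt{\phi}, \qquad \phi = 0.14757 \;\Rightarrow\; \mathrm{BCV} = \sqrt{0.14757} \approx 0.3841
```

So the reported Disp and BCV agree with each other. Read this way, a BCV of about 0.38 corresponds to roughly 38% variation in true expression between replicates.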