edgeR for methylation analysis when there are no replicates
annasoudi • 0
@b4dd6e51
United States

Hello, thanks for the awesome tool. We are using edgeR for differential methylation analysis of RRBS data from multi-regional tumours (multiple regions collected from each tumour). One of our aims is to find out whether the methylation pattern differs between regions within a particular tumour. In this case I have no replicates (one library for region A and one library for region B in a tumour). I see you have some suggestions for running edgeR when there are no replicates, but I am wondering how this should be done for methylation data rather than gene expression. For example, what would be an ideal BCV value?

I managed to create the DGEList shown below. I thought I could assign group 1 to region A and group 2 to region B and run exactTest(y, dispersion=bcv^2), but I do not think that is correct because I have methylated and unmethylated read counts for each region.

I would really appreciate your help: how should I proceed to find out whether methylation differs between two or more regions in a tumour when there are no replicates (one library per region)? Thanks


> y
An object of class "DGEList"
$counts
        SRC159-T1-Me SRC159-T1-Un SRC159-T2-Me SRC159-T2-Un
1-10525           55            9           23            9
1-10526           72           21           61            7
1-10542           33           31           20           12
1-10543           44           49           51           17
1-10563           30           34           17           15
9470247 more rows ...

$samples
             group lib.size norm.factors
SRC159-T1-Me     1 57103934            1
SRC159-T1-Un     1 44889934            1
SRC159-T2-Me     2 33037087            1
SRC159-T2-Un     2 22688158            1

$genes
        Chr Locus
1-10525   1 10525
1-10526   1 10526
1-10542   1 10542
1-10543   1 10543
1-10563   1 10563
9470247 more rows ...

RRBSdata edgeR • 330 views
@gordon-smyth
WEHI, Melbourne, Australia

The correct way to analyse methylation data is explained in this article https://f1000research.com/articles/6-2055/v2 and in the edgeR User's Guide. You must use glmFit and glmLRT rather than exactTest.

Our advice for analysing data without replicates is the same for RRBS BS-seq as it is for RNA-seq and is explained in the edgeR User's Guide. You need to input a preset value for the dispersion, but I cannot tell you what that value should be: it depends on your previous experience with the same sort of data. If you have no idea, then try the values suggested in the User's Guide. Or try treating your two samples as a single group with n=2 and see what dispersion value you get, then use that value for the real analysis.
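
The "single group with n=2" suggestion might be sketched as follows. This is only one possible reading of that advice for methylation counts, not code from the User's Guide: the counts are simulated rather than real, the names `design0` and `y0` are hypothetical, and the null design (one sample effect per region plus a single methylation effect shared by both regions) is an assumption about how "one group" translates to paired methylated/unmethylated columns.

```r
library(edgeR)

set.seed(1)
n <- 500
# Simulated (NOT real) counts; columns mimic T1-Me, T1-Un, T2-Me, T2-Un
counts <- matrix(rnbinom(4 * n, mu = 30, size = 10), ncol = 4,
                 dimnames = list(paste0("locus", seq_len(n)),
                                 c("T1-Me", "T1-Un", "T2-Me", "T2-Un")))
y <- DGEList(counts)

# Give the Me and Un columns of each sample the same (total) library size
total <- y$samples$lib.size[c(1, 3)] + y$samples$lib.size[c(2, 4)]
y$samples$lib.size <- rep(total, each = 2)

# Assumed null design: sample effects plus ONE methylation effect shared
# by both regions, which leaves 1 residual degree of freedom per locus
design0 <- cbind(T1 = c(1, 1, 0, 0),
                 T2 = c(0, 0, 1, 1),
                 Me = c(1, 0, 1, 0))

# Common dispersion estimated under the null design
y0 <- estimateGLMCommonDisp(y, design0)
y0$common.dispersion        # candidate preset dispersion
sqrt(y0$common.dispersion)  # the corresponding BCV
```

A dispersion obtained this way will tend to be too large if the regions really do differ at many loci, so it is probably best read as a conservative starting value.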

Thanks for your prompt reply. OK, I see your point about using glmFit and glmLRT. My main confusion is how to assign samples to groups. As I mentioned in my example above, for each region we have methylated and unmethylated reads. I put all reads belonging to region 1 (methylated or unmethylated) into group 1 and all reads belonging to region 2 (methylated or unmethylated) into group 2. Is this the right approach?

The correct approach for assigning samples to groups is completely laid out in the documentation that I referred you to. The methylated and unmethylated reads for each sample are treated as paired in the analysis. The section called "a very small example" in the article I cited above considers in detail a small dataset without replicates exactly like yours. You can follow the example almost exactly.
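
For illustration, a minimal sketch of that paired setup applied to the five loci printed in the question. It is written in the spirit of the cited example rather than copied from it: the design matrix is built by hand, the shortened column names are hypothetical, and `bcv <- 0.2` is only a placeholder that should be replaced by a value chosen as described above.

```r
library(edgeR)

# The five loci printed in the question (column names shortened)
counts <- matrix(c(55,  9, 23,  9,
                   72, 21, 61,  7,
                   33, 31, 20, 12,
                   44, 49, 51, 17,
                   30, 34, 17, 15),
                 ncol = 4, byrow = TRUE,
                 dimnames = list(c("1-10525", "1-10526", "1-10542",
                                   "1-10543", "1-10563"),
                                 c("T1-Me", "T1-Un", "T2-Me", "T2-Un")))
y <- DGEList(counts)

# Me and Un columns of the same sample share one total library size
total <- y$samples$lib.size[c(1, 3)] + y$samples$lib.size[c(2, 4)]
y$samples$lib.size <- rep(total, each = 2)

# Columns 1-2: one effect per sample, pairing its Me and Un counts.
# Columns 3-4: methylated-vs-unmethylated log-ratio within each region.
design <- cbind(T1    = c(1, 1, 0, 0),
                T2    = c(0, 0, 1, 1),
                T1.Me = c(1, 0, 0, 0),
                T2.Me = c(0, 0, 1, 0))

bcv <- 0.2  # placeholder only, not a recommendation
fit <- glmFit(y, design, dispersion = bcv^2)

# Test whether the methylation level differs between T2 and T1
lrt <- glmLRT(fit, contrast = c(0, 0, -1, 1))
topTags(lrt)
```

Here no group labels are needed at all: the design matrix, not the `group` column of the DGEList, encodes which columns belong to the same sample and which are methylated counts.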
