Question

CNV parameter not working - MEDIPS

0

Entering edit mode

anupriyaverma1408 • 0

@anupriyaverma1408-13011

Last seen 7.7 years ago

I ran following sets of commands for getting differential methylation profiles & CNVs :

library(MEDIPS)
library(BSgenome.Mmusculus.UCSC.mm9)

BSgenome="BSgenome.Mmusculus.UCSC.mm9"

uniq=1e-3

extend=300

shift=0

ws=100

WT_Input = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/3-WT-Input_S3.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T)

WT_5mC = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/1-WT-5mC_S1.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T)

LshKO_Input = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/4-LshKO-Input_S4.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T)

LshKO_5mC = MEDIPS.createSet(file = "/Medip-seq_project/Alignment_BAMs_and_BEDs/2-LshKO-5mC_S2.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select=NULL, paired = T) #chr.select=NULL doesn't affect further steps.

CS1 = MEDIPS.couplingVector(pattern = "CG", refObj = WT_Input)

mr.edgeR = MEDIPS.meth(MSet1 = WT_5mC, MSet2 = LshKO_5mC, CSet = CS1, ISet1 = WT_Input , ISet2 = LshKO_Input, p.adj = "bonferroni", diff.method = "edgeR", MeDIP = T, CNV = T, minRowSum = 10)

I got this error :

Creating results table...
Adding differential coverage results...
Adding CNV results...
Error in cnv.combined[i, 1]:cnv.combined[i, 2] : NA/NaN argument
In addition: Warning messages:
1: In MEDIPS.rms(MSet1[[i]], CSet, ccObj = ccObj) : NaNs produced
2: In MEDIPS.rms(MSet2[[i]], CSet, ccObj = ccObj) : NaNs produced

What does this error means? If I run function MEDIPS.meth with "CNV = F" jsut to get differntial methylation profiles then it's coming fine. But how to get CNVs?

Please help me out. Thank you :)

medips medip-seq error lukas • 2.4k views

ADD COMMENT • link updated 8.7 years ago by Matthias Lienhard ▴ 280 • written 8.7 years ago by anupriyaverma1408 • 0

score 1 · Answer 1 · 2017-05-15

1

Entering edit mode

Matthias Lienhard ▴ 280

@matthias-lienhard-6292

Last seen 2.0 years ago

Max Planck Institute for molecular Gene…

Hi Anupriya Verma,

MEDIPS.createSet calculates CNVs based on the window size selected for the primary MeDIP seq analysis. In most cases, the input coverage is not sufficient for CNV detection at this resolution. Maybe, that this somehow leads to the error message you got. The suggested way would be to set CNV=F in this step, and then add CNV information using MEDIPS.addCNV. This function allows the user to specify a different window size. See also the notes on the CNV parameter at the MEDIPS.meth help page

> ?MEDIPS.meth

Furthermore, the error might be somehow related to the nonchoromosomal conigs in the mm9 reference. You could restrict the analysis on the chromosomes, e.g. by setting

> chr.select=paste0("chr",c(1:19,"X",Y"))

By the way, our new analysis package "QSEA" provides far more options to estimate and deal with CNV (among other benefits). In particular, it allows using the CNV within the normalization, and thereby considers CNV information in the detection of differentially methylated regions, rather than just providing this information in the output table. For MeDIP seq analysis it can completely replace the MEDIPS package -except from some QC metrics, that are not implemented in QSEA.

Best, Matthias

ADD COMMENT • link 8.7 years ago Matthias Lienhard ▴ 280

0

Entering edit mode

Hi Matthias,
Thanks a lot. It worked the way you told me. Sub-setting the chromosome set using -> chr.select=paste0("chr",c(1:19,"X",Y")) was the thing that was needed. Now MEDIPS.meth is also working with 'CNV = T'.

Can MEDIPS be used to get CNVs for one MSet say MSet1 ? :

mr.edgeR_LshKO = MEDIPS.meth(MSet1 = LshKO_5mC, CSet = CS1, ISet1 = LshKO_Input, p.adj = "bonferroni", diff.method = "edgeR", MeDIP = T, CNV = T, minRowSum = 10)

Running this above commands throws error : Cannot perform CNV analysis- please specify two groups of INPUT SETs!

Probably I'm not getting how CNVs' log2 ratios are calculated.

Regards,
Anupriya

ADD REPLY • link 8.7 years ago anupriyaverma1408 • 0

1

Entering edit mode

Hi Anupriya,

within MEDIPS, the calculation of CNV works by comparing the local coverage of the input sample(s) that contain CNVs (ISet2) to the control input sample(s) (ISet1). Therefore it is required to specify both sets.

However, probably the read coverage is not sufficient to run the CNV detection on 100 base windows. Typical values for the CNV analysis would be 100000 to 1000000 base windows. In MEDIPS, this would be done by initially specifying CNV=FALSE, and than use MEDIPS.addCNV().

I again want to encourage you to have a look at the QSEA package, which is has a much more integrated and convenient way of inferring, using, and depicting CNV information.

ADD REPLY • link 8.7 years ago Matthias Lienhard ▴ 280

0

Entering edit mode

Hi Matthias,

I've one more doubt. After getting annotations with MEDIPS.getAnnotation function I'm getting some annotations that are having only Ensembl enzyme/protein/transcript ID (no geneIDs) :

chr   start   stop   chr.1   start.1   stop.1   CF   X1.WT.5mC_S1.bam.counts   X2.LshKO.5mC_S2.bam.counts   X1.WT.5mC_S1.bam.rpkm   X2.LshKO.5mC_S2.bam.rpkm   X1.WT.5mC_S1.bam.rms   X2.LshKO.5mC_S2.bam.rms   X3.WT.Input_S3.bam.counts   X4.LshKO.Input_S4.bam.counts   X3.WT.Input_S3.bam.rpkm   X4.LshKO.Input_S4.bam.rpkm   MSets1.counts.mean   MSets1.rpkm.mean   MSets1.rms.mean   MSets2.counts.mean   MSets2.rpkm.mean   MSets2.rms.mean   edgeR.logFC   edgeR.logCPM   edgeR.p.value   edgeR.adj.p.value   CNV.log2.ratio   DMR_ID   1_id   1_id   2_id   1_id   2_id   3_id   4_id   5_id   6_id
chrX   130684601   130684700   chrX   130684601   130684700   1   146   194   177.27   317.53   0.55   0.62   38   32   5.85   9.77   146   177.27   0.55   194   317.53   0.62   -0.7990199711   5.3553704235   7.23115947184283E-07   0.0214613582   -0.0736   ID_90   NA   NA   NA   ENSMUSE00000546399   NA   NA   NA   NA   NA
chrX   130684701   130684800   chrX   130684701   130684800   2   147   196   178.48   320.8   0.55   0.61   39   31   6   9.47   147   178.48   0.55   196   320.8   0.61   -0.8039704999   5.3682293686   4.81168180090305E-07   0.0142805904   -0.0736   ID_90   NA   NA   NA   ENSMUSE00000546399   NA   NA   NA   NA   NA

chr9 3033801 3033900 chr9 3033801 3033900 2 3094 1929 3756.66 3157.3 0.85 0.83 9978 5311 1535.81 1622.26 3094 3756.66 0.85 1929 3157.3 0.83 0.2920778945 9.1580107642 3.16711714802772E-10 9.39968698363148E-06 -0.0316 ID_49 NA TSS_ENSMUST00000170073 NA NA NA NA NA NA NA

Commands used:
anno.mart.gene = MEDIPS.getAnnotation(host="may2012.archive.ensembl.org",dataset=c("mmusculus_gene_ensembl")[1],annotation=c("TSS","EXON","GENE"),tssSz=c(-1000,500),chr=chr.select1) #for mm9 annotations.

roi_new = MEDIPS.setAnnotation(regions = rois, annotation = anno.mart.gene)

How is it possible not to get geneIDs associated with transcripts/protein IDs?

As mentioned by you, I'll be trying QSEA package after I'm done with MEDIPS.

Thank you,
Anupriya

ADD REPLY • link 8.7 years ago anupriyaverma1408 • 0

1

Entering edit mode

Unfortunately, MEDIPS.getAnnotation does not fetch gene names, and the function cannot be configured, such that it does fetch them. You could either modify the function, or match the ids and change the resulting output tables manually.

ADD REPLY • link 8.7 years ago Matthias Lienhard ▴ 280