Question: SAAP-BS data analysis (RRBS) - BiSeq use and Coverage Histogram
4.6 years ago, maria.maqueda (European Union) wrote:

Hello all,

I am struggling with the DMR analysis of some RRBS data that were already processed with the SAAP-BS program. I do not have the SAM/BAM files, just a merged CSV file with the methylC, totalC and Ratio values per sample and CpG site (around 2.5M sites).

I am new to this type of analysis (I previously worked with expression analysis) and I have decided to use BiSeq to obtain DMRs. My first question: since I have to create the BSraw object by hand, is "totalReads" equivalent to "totalC" and "methReads" equivalent to "methC"?
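For reference, a minimal sketch of how a BSraw object might be assembled by hand, assuming totalC maps to totalReads and methC maps to methReads. The toy data frame, its column names (chr, pos, totalC_1, methC_1, ...), the sample names and the group labels are hypothetical stand-ins for the merged SAAP-BS CSV, not its actual layout; the BSraw() arguments follow the constructor as documented in the BiSeq manual.

```r
library(BiSeq)
library(GenomicRanges)  # GRanges, IRanges and DataFrame

# Hypothetical stand-in for the merged SAAP-BS CSV (column names are assumptions).
saap <- data.frame(
  chr      = c("chr21", "chr21", "chr21"),
  pos      = c(1000L, 1010L, 1025L),
  totalC_1 = c(12L, 30L, 8L),
  methC_1  = c(6L, 27L, 0L),
  totalC_2 = c(15L, 22L, 10L),
  methC_2  = c(3L, 20L, 1L)
)

# One row per CpG site, one column per sample, stored as integer matrices.
total <- as.matrix(saap[, c("totalC_1", "totalC_2")])
meth  <- as.matrix(saap[, c("methC_1", "methC_2")])
colnames(total) <- colnames(meth) <- c("sample_1", "sample_2")
storage.mode(total) <- "integer"
storage.mode(meth)  <- "integer"

# Assumed mapping: totalC -> totalReads, methC -> methReads.
rrbs <- BSraw(
  metadata   = list(Source = "SAAP-BS"),
  rowRanges  = GRanges(seqnames = saap$chr,
                       ranges = IRanges(start = saap$pos, width = 1)),
  colData    = DataFrame(group = c("case", "control"),
                         row.names = c("sample_1", "sample_2")),
  totalReads = total,
  methReads  = meth
)
```

With real data the data frame would come from read.csv() on the merged file, and the column selection would need to match whatever naming SAAP-BS actually uses.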

On the other hand, I have assumed that "totalC" in the SAAP-BS report is equivalent to the coverage mentioned in RRBS/WGBS publications. Assuming this is right (if not, please correct me!), I have found some very high coverage values (up to a maximum of 16k, located on chromosome 21; around 4k CpG sites have coverage > 200). Is this to be expected?

If you could shed some light on this, I would really appreciate it. Many thanks in advance!

Maria

coverage biseq rrbs saap-bs • 1.0k views
Answer: SAAP-BS data analysis (RRBS) - BiSeq use and Coverage Histogram
4.6 years ago, Katja Hebestreit (United States) wrote:
Hi Maria,

I don't know SAAP-BS, but from the column names I suppose that you are right: totalC corresponds to totalReads and methC to methReads.

CpG sites with very high coverage are to be expected. At least I have never seen an RRBS data set without them. In the BiSeq pipeline it is suggested to limit the coverage, e.g., to the 0.9 quantile (page 15): http://bioconductor.org/packages/release/bioc/vignettes/BiSeq/inst/doc/BiSeq.pdf

Cheers,
Katja
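To illustrate the idea of limiting coverage to the 0.9 quantile, here is a small base-R sketch on toy count matrices. The numbers are made up, and the rescaling of the methylated counts (so that the methylation ratio is preserved) is an assumption about what a sensible cap should do; it is not a copy of the BiSeq implementation.

```r
# Toy counts: rows are CpG sites, columns are samples (values are made up).
total <- matrix(c(10L, 0L, 250L, 16000L,
                  12L, 8L,  40L,  9000L), ncol = 2)
meth  <- matrix(c( 5L, 0L, 125L,  8000L,
                   6L, 2L,  10L,   900L), ncol = 2)

# 0.9 quantile of the coverage, computed over covered positions only.
covered <- total > 0
quant <- quantile(total[covered], 0.9)
max_cov <- as.integer(round(quant))

# Cap the coverage and rescale the methylated counts to keep the ratio.
over <- total > max_cov
fraction <- meth[over] / total[over]
total[over] <- max_cov
meth[over]  <- as.integer(round(fraction * max_cov))

# With a BSraw object, the BiSeq vignette does the equivalent in one call:
# limitCov(object, maxCov = quant)
```

Only sites above the cap are changed; everything at or below the quantile keeps its original counts.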

Hi Katja,

Many thanks for your quick answer! Yes, I saw the filtering of very high coverage in the BiSeq vignette; thanks for pointing me to it.

Cheers

Maria

written 4.6 years ago by maria.maqueda
Answer: SAAP-BS data analysis (RRBS) - BiSeq use and Coverage Histogram
4.5 years ago, maria.maqueda (European Union) wrote:

Hi again Katja,

Following your suggestion to limit the coverage as indicated in the BiSeq vignette, I obtained the following error message when running the limitCov function:

"Error in methReads(object)[indCov] <- as.integer(round(fraction * maxCov)) :
  NAs are not allowed in subscripted assignments"

My "clust.unlim" object contains missing values (the percentage of NAs ranges between 0.13% and 7.7%), and I do not know whether this is normal for a sample size of 20.

I have tried to smooth the data without limiting the coverage and have seen that this "NA issue" gets worse. Could this be due to the prediction of the methylation levels? Should I remove these NA readings before constructing the CpG clusters?

Many thanks in advance for any advice!!

Maria
Hi Maria,

It sounds like you have NAs at the CpG sites that are not covered. Please replace the NAs with 0 in the totalReads and methReads slots. Does that help?

Cheers,
Katja
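A minimal sketch of that replacement, shown on toy matrices in base R so that it runs without any data at hand. The numbers are made up. The commented lines at the end show how the same idea might look on the BSraw object from the post (clust.unlim) using the BiSeq accessors totalReads() and methReads(); treat those lines as an untested assumption.

```r
# Toy count matrices with NA where a sample has no reads at a CpG site.
total <- matrix(c(10L, NA, 250L, 12L, 8L, NA), ncol = 2)
meth  <- matrix(c( 5L, NA, 125L,  6L, 2L, NA), ncol = 2)

# Share of missing entries per sample, useful as a check before replacing.
na_share <- colMeans(is.na(total))

# Treat "not covered" as zero reads in both matrices.
total[is.na(total)] <- 0L
meth[is.na(meth)]   <- 0L

# Assumed equivalent on a BSraw object such as clust.unlim:
# totalReads(clust.unlim)[is.na(totalReads(clust.unlim))] <- 0L
# methReads(clust.unlim)[is.na(methReads(clust.unlim))]   <- 0L
```

Once the NAs are gone, a logical index such as total > maxCov contains no missing values, which is what a subscripted assignment needs.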
written 4.5 years ago by Katja Hebestreit

Hi Katja,

Yes, this helps! Many thanks for answering so quickly! I am smoothing the methylation data right now. Let's see what I get!

Thanks again,

Cheers

Maria

written 4.5 years ago by maria.maqueda