Hello!
Could you help me with my challenges?
I have paired-end data obtained from an RRBS experiment.
I did the adapter and quality trimming using trim_galore.
Then I did the read alignment and the methylation extraction using Bismark.
After that I worked with the coverage files Bismark produced. Is that all right?
==============================================
After the smoothing step (BSmooth(My.bsseq.data, , mc.cores = 1, verbose = TRUE)),
I see a lot of warnings (more than 50) that look like this:
... 50: In lfproc(x, y, weights = weights, cens = cens, base = base,
... : procv: parameters out of bounds
Is it all right?
==============================================
My data consist of 8 normal samples and 8 tumour samples from cancer patients.
The average CpG coverage on a given chromosome
(for example, chromosome 7) is 2.4 (from 0.4 to 4.6 depending on the sample).
The number of CpGs on this chromosome is 1 247 716.
The number of CpGs covered by at least 1 read in all 16 samples is 18 543.
The number of CpGs with 0 coverage in all samples is 0.
I see something similar on all chromosomes:
there are no CpGs with 0 coverage in all samples.
Is there something wrong with my data? Is it possible to continue the analysis?
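For reference, the numbers above can be computed along the following lines. This is only a sketch in base R on a simulated matrix: `cov` stands in for the CpG-by-sample coverage matrix (with bsseq it would come from something like getCoverage(My.bsseq.data, type = "Cov")), and the Poisson toy data and the names `per_sample_mean`, `covered_in_all` and `covered_in_none` are assumptions for illustration, not my real data.

```r
# Toy coverage matrix: rows are CpGs, columns are samples.
# Assumption: with real data this would be the matrix returned by
# getCoverage(<BSseq object>, type = "Cov").
set.seed(1)
n_cpg <- 1000
n_samples <- 16
cov <- matrix(rpois(n_cpg * n_samples, lambda = 2.4),
              nrow = n_cpg, ncol = n_samples,
              dimnames = list(NULL, paste0("S", seq_len(n_samples))))

# Mean coverage per sample and over all samples
per_sample_mean <- colMeans(cov)
overall_mean <- mean(cov)

# For each CpG, the number of samples covering it with at least one read
n_covering <- rowSums(cov >= 1)

covered_in_all  <- sum(n_covering == n_samples)  # >= 1 read in every sample
covered_in_none <- sum(n_covering == 0)          # 0 reads in every sample

print(round(range(per_sample_mean), 2))
print(c(all = covered_in_all, none = covered_in_none))
```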
==============================================
Following the document "Analyzing WGBS with the bsseq package",
I want to remove CpGs with little or no coverage before computing t-statistics, to avoid finding false positive DMRs.
In your example with 3+3=6 samples you recommended the following:
keepLoci.ex <-
which(rowSums(BS.cov[, BS.cancer.ex$Type == "cancer"] >= 2) >= 2 &
rowSums(BS.cov[, BS.cancer.ex$Type == "normal"] >= 2) >= 2)
But what is appropriate for my data (8+8=16 samples)?
I tried this command with the following parameters (coverage >= 1 in at least 2 samples per group):
myLoci1 <-
which(rowSums(BS.cov[, BS.cancer.ex$Type == "cancer"] >= 1) >= 2 &
rowSums(BS.cov[, BS.cancer.ex$Type == "normal"] >= 1) >= 2)
Then length(myLoci1) is 320 019; the average coverage of the remaining CpGs is 8.89 (from 0.5 to 17.7 per sample!)
I tried this command with other parameters (coverage >= 1 in at least 5 samples per group):
myLoci2 <-
which(rowSums(BS.cov[, BS.cancer.ex$Type == "cancer"] >= 1) >= 5 &
rowSums(BS.cov[, BS.cancer.ex$Type == "normal"] >= 1) >= 5)
Then length(myLoci2) is 136 610; the average coverage of the remaining CpGs is 18.59 (from 0.7 to 39.2 per sample!)
I also tried this command with other parameters (coverage >= 2 in at least 6 samples per group):
myLoci3 <-
which(rowSums(BS.cov[, BS.cancer.ex$Type == "cancer"] >= 2) >= 6 &
rowSums(BS.cov[, BS.cancer.ex$Type == "normal"] >= 2) >= 6)
Then length(myLoci3) is 68 594; the average coverage of the remaining CpGs is 31.08 (from 0.9 to 70.0 per sample!)
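The three filters above differ only in two numbers: the minimum coverage and the minimum number of samples per group. A sketch like the one below makes it easier to compare settings side by side. It is base R on simulated data; `filter_loci` is a hypothetical helper (not part of bsseq), and `cov` and `group` stand in for BS.cov and the sample type vector.

```r
# Hypothetical helper (not a bsseq function): keep loci covered by at least
# `min_cov` reads in at least `min_samples` samples of EVERY group.
filter_loci <- function(cov, group, min_cov, min_samples) {
  keep <- rep(TRUE, nrow(cov))
  for (g in unique(group)) {
    in_group <- cov[, group == g, drop = FALSE]
    keep <- keep & rowSums(in_group >= min_cov) >= min_samples
  }
  which(keep)
}

# Simulated stand-in for an 8 + 8 design (assumption: Poisson toy coverage)
set.seed(1)
group <- rep(c("cancer", "normal"), each = 8)
cov <- matrix(rpois(2000 * 16, lambda = 2.4), ncol = 16)

# The three settings tried above: (1, 2), (1, 5) and (2, 6)
settings <- data.frame(min_cov = c(1, 1, 2), min_samples = c(2, 5, 6))
settings$n_loci <- NA_integer_
settings$mean_cov <- NA_real_
for (i in seq_len(nrow(settings))) {
  idx <- filter_loci(cov, group, settings$min_cov[i], settings$min_samples[i])
  settings$n_loci[i] <- length(idx)
  # Mean coverage of the loci that survive the filter
  settings$mean_cov[i] <-
    if (length(idx) > 0) mean(cov[idx, , drop = FALSE]) else NA_real_
}
print(settings)
```

With two groups, filter_loci(cov, group, 1, 2) selects the same rows as the myLoci1 expression, so the helper is only a more compact way of writing the same filter.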
My data are far from ideal, but I still need to process them somehow and find DMRs. What parameters should I use?
Thank you!
Excuse me, yes, it's true.