MEDIPS
@paolo-kunderfranco-5158
Dear Lukas Chavez,

Should repeat masking of the reference genome assembly be considered before aligning my samples, or might I lose information?

Thanks,
Paolo

2013/2/19 Lukas Chavez <lukas.chavez.mailings@googlemail.com>

> Dear Paolo,
>
> > why in the example referred to in the manual is there only one INPUT.SET for two conditions?
>
> Currently, MEDIPS allows for only one control, one treatment, and one combined Input data set. However, there is obviously a desperate need for considering replicates per group as well as individual Input data sets. Therefore (and because of many other issues), I have extensively revised the MEDIPS package, which will allow for processing replicates per condition as well as two groups of Input data. I intend to update the MEDIPS package as soon as possible, especially in advance of the next Bioconductor release. Nevertheless, it is not clear how you designed your experiments and your analysis strategy. The MEDIPS update will be helpful, e.g. in case you are comparing two groups of IP-seq samples and you want to consider two corresponding groups of Input samples in order to identify genomic variants that influence the IP enrichments.
>
> > I followed MEDUSA protocol (...)
>
> I greatly appreciate that MEDIPS has been incorporated into other analysis pipelines. However, please excuse that I can only comment on issues and functionalities of the MEDIPS package.
>
> > (...) when I filter out for non-unique reads. Roughly 90 % are discarded (...)
>
> This issue may relate to amplification and oversequencing problems, and there are different opinions about unique reads. However, the fraction of non-unique reads in your sequencing data is an issue that goes beyond what I can discuss here. Currently, MEDIPS allows for considering all reads or for replacing all non-unique reads (or maybe better: reads that map to the same genomic position) by one representative. However, you can pre-filter your input files by any estimate of global or local thresholds for non-unique reads and continue using MEDIPS by considering all given mapping results.
>
> > Is it possible that such a low number of reads is sufficient to generate a saturated and reproducible methylation profile?
>
> This depends on the methylation status of your reference genome. In case you are studying the methylation status of a small and only barely methylated genome, your results might be reasonable.
>
> All the best,
> Lukas
>
> Dear All,
>
> I will now start to analyze some MeDIP-seq data with the MEDIPS Bioconductor package. I have read through the whole MEDIPS manual.
>
> I have to compare the methylation profiles of two cell lines, and I have the Input for both of them. Why in the example referred to in the manual is there only one INPUT.SET for two conditions?
>
> CONTROL.SET, TREAT.SET, and INPUT.SET
>
> Any suggestions?
>
> Thanks,
> Paolo
>
> On Fri, Feb 15, 2013 at 5:33 AM, Paolo Kunderfranco <paolo.kunderfranco@gmail.com> wrote:
>
>> Dear Lukas Chavez,
>>
>> I followed the MEDUSA protocol to filter out not properly paired, low mapping quality, and non-unique sequences from my alignment files, in order to use MEDIPS for further analysis of DMRs.
>>
>> For example, one mC sample started with 100 million reads. 80 % mapped, and 70 % of them were properly mapped with high quality (mapQ>40). The problem arises when I filter out non-unique reads. Roughly 90 % are discarded, leading to a final number of 2-4 million reads. All my mC samples behave in the same way.
>>
>> Maybe the DNA starting material was not properly quantified (2-3 ng instead of 5 ng were used for the generation of the libraries). We didn't observe the same problem for the Input DNA (correctly quantified) or for 2 samples out of 4 for 5-hydroxy-mC.
>>
>> Could the high number of non-unique reads be due to a technical problem or a biological problem? Have you ever experienced a similar problem? How do you think I should proceed with the analysis? Is it absolutely necessary to remove non-unique reads for MEDIPS analysis?
>>
>> It is the first time I have dealt with this kind of analysis, and I would like to understand which is the best approach to follow.
>>
>> I tried to run MEDIPS.saturationAnalysis with the following sample and the correlation looks fine:
>>
>> $numberReads
>> [1] 1890528
>>
>> $maxEstCor
>> [1] 1.890528e+06 9.997250e-01
>>
>> $maxTruCor
>> [1] 9.452640e+05 9.994605e-01
>>
>> Is it possible that such a low number of reads is sufficient to generate a saturated and reproducible methylation profile?
>>
>> Thank you very much for your time,
>> Paolo
Sequencing Alignment MEDIPS
Lukas Chavez
@lukas-chavez-5781
USA/La Jolla/UCSD
Dear Paolo,

although this is a very general question and not really specific to my MEDIPS package, here is my comment: by using the non-masked reference genome you might be able to cover fractions of repetitive DNA, depending on your read length and on whether you use paired-end or single-end sequencing. In my opinion, there are no major advantages to using the masked reference genome as the reference for mapping.

Lukas

On Wed, Apr 3, 2013 at 1:15 AM, Paolo Kunderfranco <paolo.kunderfranco@gmail.com> wrote:

> Dear Lukas Chavez,
>
> Should repeat masking of the reference genome assembly be considered before aligning my samples, or might I lose information?
>
> Thanks,
> Paolo
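The thread mentions two ways of handling reads that map to the same genomic position: replacing them by one representative, or pre-filtering the input with a threshold of your own choosing. The following is only a rough Python sketch of that idea, not the MEDIPS implementation. The function name `cap_reads_per_position`, the `(chromosome, start, strand)` tuple layout, and the example reads are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def cap_reads_per_position(reads, max_per_position=1):
    """Keep at most `max_per_position` reads per (chromosome, start, strand).

    `reads` is an iterable of (chromosome, start, strand) tuples; this layout
    is an assumption made for the sketch, not a MEDIPS data structure.
    With max_per_position=1, every stack of identical positions is reduced
    to a single representative read.
    """
    counts = Counter()
    kept = []
    for chrom, start, strand in reads:
        key = (chrom, start, strand)
        if counts[key] < max_per_position:
            counts[key] += 1
            kept.append(key)
    return kept


# Hypothetical toy input: three reads stacked on chr1:100 (+ strand),
# one read on the opposite strand, and one read on chr2.
example_reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 100, "-"),
    ("chr2", 250, "+"),
]

one_per_position = cap_reads_per_position(example_reads)
two_per_position = cap_reads_per_position(example_reads, max_per_position=2)
print(len(one_per_position), len(two_per_position))
```

Raising `max_per_position` above 1 corresponds to a simple global threshold. A local threshold, for example one that depends on the coverage in the surrounding region, would need additional logic that is not shown here.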