minfi release 1.8
0
0
Entering edit mode
@swarmqqgmailcom-6572
Last seen 9.7 years ago
Thanks Tim for sharing this. I think it is very useful. But I kind of stuck at generating genomeRatioSet, it has error message. *> gRatioSet <- mapToGenome(ratioSet, mergeManifest = TRUE)Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : solving row 1: range cannot be determined from the supplied arguments (too many NAs)*I am not sure what's wrong? If you happen to know, could you please share it with me? Thanks a lot for the help. Emma On Saturday, February 1, 2014 2:06:59 PM UTC-5, Tim Triche, Jr. wrote: > > There are at least 3 different ways you could go about this; Tiffany > Morris' probe variant annotation package accompanying ChAMP is probably the > most comprehensive, while (I claim) the UCSC common variant track is the > most trivial and the SNPlocs packages are the most intensive. Let's assume > your data is in a GenomicRatioSet named grSet for all examples. > > Writing and testing out the code for this took a little while so I do hope > you'll work through each example. > > All things considered, I'd recommend option #1, but it would be even > better if the maintainers tweaked a few settings so that > bsseq::data.frame2GRangesprobe.450K.VCs.af) was more trivial. (Also: > consider merging $diVC.com1pop.F and $diVC.com1pop.R, making everything > that can reasonably be so boolean, etc... GRanges are so much easier to > work with than data.frames when dealing with genomic coordinates) > > You'll need to supply your own grSet, but if you didn't have one, I don't > imagine you'd have asked :-) > I have provided the dimensions (rows x columns) of the filtered grSets for > comparison in each case. You will have to decide what degree of > conservatism is most appropriate for your experiment. Read the docs! > > > 1) Use the ProbeVariants package alliuded to above. This is > comprehensive, but a bit of a PITA. > > require(minfi) > grSet > ## class: GenomicRatioSet > ## dim: 485512 34 > > require(Illumina450ProbeVariants.db) > ?probe.450K.VCs.af ## read the docs! > dataprobe.450K.VCs.af) > > commonSnpInProbe <- rownamesprobe.450K.VCs.af)[ probe.450K.VCs.af$probe50VC.com1pop > > 0 ] > grSet.noCommonSNPsInProbes <- grSet[ > setdiff(rownames(grSet), commonSnpInProbe), ] > grSet.noCommonSNPsInProbes > ## class: GenomicRatioSet > ## dim: 308901 34 > > commonVariantsEitherStrand <- !is.naprobe.450K.VCs.af$diVC.com1pop.F)|! > is.naprobe.450K.VCs.af$diVC.com1pop.R) > CpGsWithCommonVariants <- rownamesprobe.450K.VCs.af)[ commonVariantsEitherStrand > ] > grSet.noCommonVariantsAtCpGs <- grSet[ > setdiff(rownames(grSet),CpGsWithCommonVariants), ] > grSet.noCommonVariantsAtCpGs > ## class: GenomicRatioSet > ## dim: 445502 34 > > Consult the documentation for finer control over the process (e.g. > specific populations, etc.). > > ?probe.450K.VCs.af ## read the docs again > > It would not break my heart if the maintainers turned the data.frame here > into a GRanges; a few small changes followed by data.frame2GRanges() would > do the trick. But still, it's a very handy compilation. > See below for the reason I prefer GRanges (in short, because I'm impatient > and don't like debugging). > > > 2) Use the FDb package (> 1% MAF across all populations, dbSNP build 135). > This is a one-liner. > > require(minfi) > grSet > ## class: GenomicRatioSet > ## dim: 485512 34 > > require(FDb.UCSC.snp135common.hg19) > commonSNPs <- features(FDb.UCSC.snp135common.hg19) > > ## With a GRanges object, the previous rigamarole becomes a one- liner: > grSet.noCommonSNPsAtCpGs <- grSet[ grSet %outside% commonSNPs, ] > > grSet.noCommonSNPsAtCpGs > ## class: GenomicRatioSet > ## dim: 478385 34 > > This was my workaround from 2-3 years ago to permit masking of TCGA Level > 3 methylation data. It still works and it still works well, but it's been > superseded (IMHO) by more flexible approaches like #1. > > > 3) use SNPlocs (all SNPs in dbSNP; mildly annoying complication with 'ch' > vs. 'chr' seqlevels) > > require(minfi) > grSet > ## class: GenomicRatioSet > ## dim: 485512 34 > > ## work around the annoyance: > chroms <- seqlevels(grSet) > names(chroms) <- chroms > chroms <- gsub('chr', 'ch', chroms) > require(SNPlocs.Hsapiens.dbSNP.20120608) > SNPs.byChr <- GRangesList(lapply(chroms, getSNPlocs, as.GRanges=TRUE)) > ## time passes... > seqlevels(SNPs.byChr) <- gsub('ch', 'chr', seqlevels(SNPs.byChr)) ## back > to normal > genome(SNPs.byChr) <- 'hg19' ## GRCh37.p5 coordinates are identical to > hg19 save for chrMT > > ## once the above hoops have been jumped through, it's back to one- liners: > grSet.noSNPsAtCpGs <- grSet[ grSet %outside% SNPs.byChr, ] > grSet.noSNPsAtCpGs > ## class: GenomicRatioSet > ## dim: 444722 34 > > This used to be documented in the minfi code/manual somewhere, though I > don't know if it still is. > > > Statistics is the grammar of science. > Karl Pearson <http: en.wikipedia.org="" wiki="" the_grammar_of_science=""> > > > On Fri, Jan 31, 2014 at 1:06 PM, C T <off... at="" gmail.com="" <javascript:="">>wrote: > >> Any tutorial on how to remove probes that contains SNPs? >> >> On Tuesday, November 12, 2013 7:12:46 PM UTC-5, Kasper Hansen wrote: >> > >> > As part of Bioconductor 2.13, we have released minfi 1.8.x. Due to a >> > number of last minute errors, the recommended version is 1.8.3 (or >> bigger). >> > >> > Users may find that their old objects cannot be linked to annotation. >> > Please run >> > OBJECT = updateObject(OBJECT) >> > to fix this. >> > >> > Highlights include >> > * preprocessingQuantile(): an independent implementation of the same >> ideas >> > as in Tost et al. >> > * bumphunter() for finding DMRs >> > * blockFinder() for finding large hypo-methylated blocks on the 450k >> array. >> > * estimateCellCounts() for estimating cell type composition for whole >> > blood samples. The function can be extended to work on other types of >> > cells, provided suitable flow sorted data is available. >> > * the annotation now includes SNP annotation for dbSNP v132, 135 and >> 137, >> > independently annotated at JHU. >> > * getSex(): you can now get sex repeatedly, irrespective of relationship >> > status. >> > * minfiQC: find and remove outlier samples based on a sample QC criteria >> > we have found effective. >> > >> > Unfortunately, none of these handy changes are yet detailed in the >> > vignette; we are working on this. >> > >> > A manuscript is in review detailing most of these functions. >> > >> > Full NEWS below >> > >> > Best, >> > Kasper D Hansen >> > >> > o Added getMethSignal(), a convenience function for programming. >> > >> > o Changed the argument name of "type" to "what" for >> getMethSignal(). >> > >> > o Added the class "RatioSet", like "GenomicRatioSet" but without >> the >> > genome information. >> > >> > o Bugfixes to the "GenomicRatioSet()" constructor. >> > >> > o Added the method ratioConvert(), for converting a "MethylSet" >> to a >> > "RatioSet" or a "GenomicMethylSet" to a "GenomicRatioSet". >> > >> > o Fixed an issue with GenomicMethylSet() and GenomicRatioSet() >> caused >> > by a recent change to a non-exported function in the >> GenomicRanges >> > package (Reported by Gustavo Fernandez Bayon <gba... at="" gmail.com="">> <javascript:> >> > >). >> > >> > o Added fixMethOutliers for thresholding extreme observations in >> the >> > [un]methylation channels. >> > >> > o Added getSex, addSex, plotSex for estimating sex of the samples. >> > >> > o Added getQC, addQC, plotQC for a very simple quality control >> > measure. >> > >> > o Added minfiQC for a one-stop function for quality control >> measures. >> > >> > o Changed some verbose=TRUE output in various functions. >> > >> > o Added preprocessQuantile. >> > >> > o Added bumphunter method for "GenomicRatioSet". >> > >> > o Handling signed zero in minfi:::.digestMatrix which caused unit >> > tests to fail on Windows. >> > >> > o addSex and addQC lead to sampleNames() being dropped because of >> a >> > likely bug in cbind(DataFrame, DataFrame). Work-around has been >> > implemented. >> > >> > o Re-ran the test data generator. >> > >> > o Fixed some Depends and Imports issues revealed by new features >> of R >> > CMD check. >> > >> > o Added blockFinder and cpgCollapse. >> > >> > o (internal) added convenience functions for argument checking. >> > >> > o Exposed and re-wrote getAnnotation(). >> > >> > o Changed getLocations() from being a method to a simple function. >> > Arguments have been removed (for example, now the function >> always >> > drops non-mapping loci). >> > >> > o Implemented getIslandStatus(), getProbeType(), getSnpInfo() and >> > addSnpInfo(). The two later functions retrieve pre- computed SNP >> > overlaps, and the new annotation object includes SNPs based on >> > dbSNP 137, 135 and 132. >> > >> > o Changed the IlluminaMethylatioAnnotation class to now include >> > genomeBuild information as well as defaults. >> > >> > o Added estimateCellCounts for deconvolution of cell types in >> whole >> > blood. Thanks to Andrew Jaffe and Andres Houseman. >> > >> >> _______________________________________________ >> Bioconductor mailing list >> Biocon... at r-project.org <javascript:> >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> >
SNP Annotation GO probe SNPlocs PROcess minfi bumphunter ChAMP SNP Annotation GO probe • 1.5k views
ADD COMMENT

Login before adding your answer.

Traffic: 545 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6