minfi release 1.8

0

Entering edit mode

swarmqq@gmail.com ▴ 10

@swarmqqgmailcom-6572

Last seen 9.7 years ago

Thanks Tim for sharing this. I think it is very useful. But I kind of stuck at generating genomeRatioSet, it has error message. *> gRatioSet <- mapToGenome(ratioSet, mergeManifest = TRUE)Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : solving row 1: range cannot be determined from the supplied arguments (too many NAs)*I am not sure what's wrong? If you happen to know, could you please share it with me? Thanks a lot for the help. Emma On Saturday, February 1, 2014 2:06:59 PM UTC-5, Tim Triche, Jr. wrote: > > There are at least 3 different ways you could go about this; Tiffany > Morris' probe variant annotation package accompanying ChAMP is probably the > most comprehensive, while (I claim) the UCSC common variant track is the > most trivial and the SNPlocs packages are the most intensive. Let's assume > your data is in a GenomicRatioSet named grSet for all examples. > > Writing and testing out the code for this took a little while so I do hope > you'll work through each example. > > All things considered, I'd recommend option #1, but it would be even > better if the maintainers tweaked a few settings so that > bsseq::data.frame2GRangesprobe.450K.VCs.af) was more trivial. (Also: > consider merging $diVC.com1pop.F and $diVC.com1pop.R, making everything > that can reasonably be so boolean, etc... GRanges are so much easier to > work with than data.frames when dealing with genomic coordinates) > > You'll need to supply your own grSet, but if you didn't have one, I don't > imagine you'd have asked :-) > I have provided the dimensions (rows x columns) of the filtered grSets for > comparison in each case. You will have to decide what degree of > conservatism is most appropriate for your experiment. Read the docs! > > > 1) Use the ProbeVariants package alliuded to above. This is > comprehensive, but a bit of a PITA. > > require(minfi) > grSet > ## class: GenomicRatioSet > ## dim: 485512 34 > > require(Illumina450ProbeVariants.db) > ?probe.450K.VCs.af ## read the docs! > dataprobe.450K.VCs.af) > > commonSnpInProbe <- rownamesprobe.450K.VCs.af)[ probe.450K.VCs.af$probe50VC.com1pop > > 0 ] > grSet.noCommonSNPsInProbes <- grSet[ > setdiff(rownames(grSet), commonSnpInProbe), ] > grSet.noCommonSNPsInProbes > ## class: GenomicRatioSet > ## dim: 308901 34 > > commonVariantsEitherStrand <- !is.naprobe.450K.VCs.af$diVC.com1pop.F)|! > is.naprobe.450K.VCs.af$diVC.com1pop.R) > CpGsWithCommonVariants <- rownamesprobe.450K.VCs.af)[ commonVariantsEitherStrand > ] > grSet.noCommonVariantsAtCpGs <- grSet[ > setdiff(rownames(grSet),CpGsWithCommonVariants), ] > grSet.noCommonVariantsAtCpGs > ## class: GenomicRatioSet > ## dim: 445502 34 > > Consult the documentation for finer control over the process (e.g. > specific populations, etc.). > > ?probe.450K.VCs.af ## read the docs again > > It would not break my heart if the maintainers turned the data.frame here > into a GRanges; a few small changes followed by data.frame2GRanges() would > do the trick. But still, it's a very handy compilation. > See below for the reason I prefer GRanges (in short, because I'm impatient > and don't like debugging). > > > 2) Use the FDb package (> 1% MAF across all populations, dbSNP build 135). > This is a one-liner. > > require(minfi) > grSet > ## class: GenomicRatioSet > ## dim: 485512 34 > > require(FDb.UCSC.snp135common.hg19) > commonSNPs <- features(FDb.UCSC.snp135common.hg19) > > ## With a GRanges object, the previous rigamarole becomes a one- liner: > grSet.noCommonSNPsAtCpGs <- grSet[ grSet %outside% commonSNPs, ] > > grSet.noCommonSNPsAtCpGs > ## class: GenomicRatioSet > ## dim: 478385 34 > > This was my workaround from 2-3 years ago to permit masking of TCGA Level > 3 methylation data. It still works and it still works well, but it's been > superseded (IMHO) by more flexible approaches like #1. > > > 3) use SNPlocs (all SNPs in dbSNP; mildly annoying complication with 'ch' > vs. 'chr' seqlevels) > > require(minfi) > grSet > ## class: GenomicRatioSet > ## dim: 485512 34 > > ## work around the annoyance: > chroms <- seqlevels(grSet) > names(chroms) <- chroms > chroms <- gsub('chr', 'ch', chroms) > require(SNPlocs.Hsapiens.dbSNP.20120608) > SNPs.byChr <- GRangesList(lapply(chroms, getSNPlocs, as.GRanges=TRUE)) > ## time passes... > seqlevels(SNPs.byChr) <- gsub('ch', 'chr', seqlevels(SNPs.byChr)) ## back > to normal > genome(SNPs.byChr) <- 'hg19' ## GRCh37.p5 coordinates are identical to > hg19 save for chrMT > > ## once the above hoops have been jumped through, it's back to one- liners: > grSet.noSNPsAtCpGs <- grSet[ grSet %outside% SNPs.byChr, ] > grSet.noSNPsAtCpGs > ## class: GenomicRatioSet > ## dim: 444722 34 > > This used to be documented in the minfi code/manual somewhere, though I > don't know if it still is. > > > Statistics is the grammar of science. > Karl Pearson <http: en.wikipedia.org="" wiki="" the_grammar_of_science=""> > > > On Fri, Jan 31, 2014 at 1:06 PM, C T <off... at="" gmail.com="" <javascript:="">>wrote: > >> Any tutorial on how to remove probes that contains SNPs? >> >> On Tuesday, November 12, 2013 7:12:46 PM UTC-5, Kasper Hansen wrote: >> > >> > As part of Bioconductor 2.13, we have released minfi 1.8.x. Due to a >> > number of last minute errors, the recommended version is 1.8.3 (or >> bigger). >> > >> > Users may find that their old objects cannot be linked to annotation. >> > Please run >> > OBJECT = updateObject(OBJECT) >> > to fix this. >> > >> > Highlights include >> > * preprocessingQuantile(): an independent implementation of the same >> ideas >> > as in Tost et al. >> > * bumphunter() for finding DMRs >> > * blockFinder() for finding large hypo-methylated blocks on the 450k >> array. >> > * estimateCellCounts() for estimating cell type composition for whole >> > blood samples. The function can be extended to work on other types of >> > cells, provided suitable flow sorted data is available. >> > * the annotation now includes SNP annotation for dbSNP v132, 135 and >> 137, >> > independently annotated at JHU. >> > * getSex(): you can now get sex repeatedly, irrespective of relationship >> > status. >> > * minfiQC: find and remove outlier samples based on a sample QC criteria >> > we have found effective. >> > >> > Unfortunately, none of these handy changes are yet detailed in the >> > vignette; we are working on this. >> > >> > A manuscript is in review detailing most of these functions. >> > >> > Full NEWS below >> > >> > Best, >> > Kasper D Hansen >> > >> > o Added getMethSignal(), a convenience function for programming. >> > >> > o Changed the argument name of "type" to "what" for >> getMethSignal(). >> > >> > o Added the class "RatioSet", like "GenomicRatioSet" but without >> the >> > genome information. >> > >> > o Bugfixes to the "GenomicRatioSet()" constructor. >> > >> > o Added the method ratioConvert(), for converting a "MethylSet" >> to a >> > "RatioSet" or a "GenomicMethylSet" to a "GenomicRatioSet". >> > >> > o Fixed an issue with GenomicMethylSet() and GenomicRatioSet() >> caused >> > by a recent change to a non-exported function in the >> GenomicRanges >> > package (Reported by Gustavo Fernandez Bayon <gba... at="" gmail.com="">> <javascript:> >> > >). >> > >> > o Added fixMethOutliers for thresholding extreme observations in >> the >> > [un]methylation channels. >> > >> > o Added getSex, addSex, plotSex for estimating sex of the samples. >> > >> > o Added getQC, addQC, plotQC for a very simple quality control >> > measure. >> > >> > o Added minfiQC for a one-stop function for quality control >> measures. >> > >> > o Changed some verbose=TRUE output in various functions. >> > >> > o Added preprocessQuantile. >> > >> > o Added bumphunter method for "GenomicRatioSet". >> > >> > o Handling signed zero in minfi:::.digestMatrix which caused unit >> > tests to fail on Windows. >> > >> > o addSex and addQC lead to sampleNames() being dropped because of >> a >> > likely bug in cbind(DataFrame, DataFrame). Work-around has been >> > implemented. >> > >> > o Re-ran the test data generator. >> > >> > o Fixed some Depends and Imports issues revealed by new features >> of R >> > CMD check. >> > >> > o Added blockFinder and cpgCollapse. >> > >> > o (internal) added convenience functions for argument checking. >> > >> > o Exposed and re-wrote getAnnotation(). >> > >> > o Changed getLocations() from being a method to a simple function. >> > Arguments have been removed (for example, now the function >> always >> > drops non-mapping loci). >> > >> > o Implemented getIslandStatus(), getProbeType(), getSnpInfo() and >> > addSnpInfo(). The two later functions retrieve pre- computed SNP >> > overlaps, and the new annotation object includes SNPs based on >> > dbSNP 137, 135 and 132. >> > >> > o Changed the IlluminaMethylatioAnnotation class to now include >> > genomeBuild information as well as defaults. >> > >> > o Added estimateCellCounts for deconvolution of cell types in >> whole >> > blood. Thanks to Andrew Jaffe and Andres Houseman. >> > >> >> _______________________________________________ >> Bioconductor mailing list >> Biocon... at r-project.org <javascript:> >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> >

SNP Annotation GO probe SNPlocs PROcess minfi bumphunter ChAMP SNP Annotation GO probe • 1.5k views

ADD COMMENT • link 9.9 years ago swarmqq@gmail.com ▴ 10

Login before adding your answer.