removing SNP probes from Drosophila 2 microarray
1
0
Entering edit mode
Eli Moss ▴ 20
@eli-moss-4154
Last seen 9.6 years ago
Hello all, I'm interested in finding probes with lower expression level due to a SNP mismatch and removing them, hopefully giving me a better measure of expression level even in those genes with sequences modified by a hybridization experiment. I'm thinking of doing this by finding the changes between probe intensities between control and experimental arrays, which should be normally distributed, and removing all those probes outside some number of standard deviations from the mean. My question is this: is there a package that does something like this, or do I need to write it myself? Thank you, Eli -- Eli Moss Brown University, Computational Biology +1 (413) 658-4227 Eli_Moss at Brown.edu
probe probe • 718 views
ADD COMMENT
0
Entering edit mode
@herve-pages-1542
Last seen 16 hours ago
Seattle, WA, United States
Hi Eli, On 07/06/2010 08:09 AM, Eli Moss wrote: > Hello all, > > I'm interested in finding probes with lower expression level due to a SNP mismatch and removing them, hopefully giving me a better measure of expression level even in those genes with sequences modified by a hybridization experiment. I'm thinking of doing this by finding the changes between probe intensities between control and experimental arrays, which should be normally distributed, and removing all those probes outside some number of standard deviations from the mean. My question is this: is there a package that does something like this, or do I need to write it myself? > This sounds like a complicated question to me and I'm not sure I understand the method you are suggesting (I'm not a statistician). But FWIW here is a simple method for determining the probes or probe sets that hit at least a SNP in dbSNP. It probably doesn't really help for your particular problem because I guess you want to know the probes that hit a SNP in your sample, not in dbSNP, but it's a good illustration of the kind of things that can easily be achieved by combining tools from the GenomicFeatures, Biostrings, BSgenome data packages and SNPlocs packages. Unfortunately there is no SNPlocs package for Drosophila at the moment so I'm using Human. The idea is to match the probes against the Human transcriptome, then to alter the transcriptome by injecting the SNPs from dbSNP in it and to match the probes against the altered transcriptome. The probes that have less matches the 2nd time are those that hit at least 1 SNP. ### Load your probes and store them in a PDict object for fast ### matching. library(hgu95av2probe) library(Biostrings) probes <- DNAStringSet(hgu95av2probe) pdict <- PDict(probes) ### Retrieve the exon/transcript locations for hg19 from UCSC. library(GenomicFeatures) txdb <- makeTranscriptDbFromUCSC("hg19") ### Extract the transcriptome. library(BSgenome.Hsapiens.UCSC.hg19) tx_seqs <- extractTranscriptsFromGenome(Hsapiens, txdb) ### Use vwhichPDict() from the Biostrings package to find the ### probes that hit each transcript. tx2probe <- vwhichPDict(pdict, tx_seqs) names(tx2probe) <- names(tx_seqs) ### Reverse the tx2probe map. reverseTx2ProbeMap <- function(tx2probe, nprobe) { vals <- rep(names(tx2probe), elementLengths(tx2probe)) keys <- unlist(tx2probe) tmp <- split(vals, keys) probe2tx <- vector("list", length=nprobe) probe2tx[as.integer(names(tmp))] <- tmp probe2tx } probe2tx <- reverseTx2ProbeMap(tx2probe, length(probes)) ### Repeat the above but now with the SNPs from dbSNP injected in the ### reference genome as IUPAC ambiguity letters. library(SNPlocs.Hsapiens.dbSNP.20100427) SNP_Hsapiens <- injectSNPs(Hsapiens, "SNPlocs.Hsapiens.dbSNP.20100427") SNP_tx_seqs <- extractTranscriptsFromGenome(SNP_Hsapiens, txdb) SNP_tx2probe <- vwhichPDict(pdict, SNP_tx_seqs) names(SNP_tx2probe) <- names(SNP_tx_seqs) SNP_probe2tx <- reverseTx2ProbeMap(SNP_tx2probe, length(probes)) ### A sanity check. all(elementLengths(SNP_probe2tx) <= elementLengths(probe2tx)) ### Index of rows in hgu95av2probe that hit at least a SNP in dbSNP. idx <- which(elementLengths(SNP_probe2tx) < elementLengths(probe2tx)) ### Corresponding probes. probes[idx] ### Corresponding probeset ids. unique(hgu95av2probe$Probe.Set.Name[idx]) Cheers, H. > Thank you, > Eli > > > -- > Eli Moss > Brown University, Computational Biology > +1 (413) 658-4227 > Eli_Moss at Brown.edu > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
ADD COMMENT

Login before adding your answer.

Traffic: 974 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6