How can I remove control probesets from the ExpressionSet object in gene expression analysis with the Affy Human Gene 1.0 ST microarray?
@virginia-garcia-4706
Dear list,

I am quite new to R as well as to microarray analysis. I am working on a gene expression analysis performed on Affymetrix Human Gene 1.0 ST microarrays.

So far, I have learned how to filter data with the nsFilter function from the genefilter package.

Now I would like to know how to remove from the ExpressionSet object all the control probesets (~4000) that Affymetrix includes on the microarray (for quality assessment, normalization, background correction, etc.). However, none of the aforementioned functions worked for me.

How can I recognize those probesets and remove them? I would like to filter them out before the statistical analysis with the limma package.

Thank you very much in advance for your help.

Virginia.
Microarray Normalization genefilter limma
@james-w-macdonald-5106
United States
Hi Virginia,

On 6/21/2011 6:46 AM, Virginia Garcia wrote:
> How can I recognize those probesets and remove them? I would like to filter
> them out before statistical analysis with limma package.

How much do you like database stuff? Lots? Great, I have some fun for you.

I'll assume you have pd.hugene.1.0.st.v1 installed (I have 1.1 installed, but the queries will be the same).

> library(pd.hugene.1.1.st.v1)

First, get a connection to the database:

> con <- db(pd.hugene.1.1.st.v1)

Now, what's in this thing?

> dbListTables(con)
[1] "chrom_dict" "core_mps"   "featureSet" "level_dict" "pmfeature"
[6] "table_info" "type_dict"

OK, let's dig.

> dbGetQuery(con, "select * from pmfeature limit 5;")
      fid  fsetid atom   x    y
1  704656 7892501    1 765  711
2 1060101 7892501    2 800 1070
3 1046459 7892501    3  28 1057
4  403586 7892501    4 655  407
5  473527 7892502    5 306  478

Boring.

> dbGetQuery(con, "select * from featureSet limit 5;")
   fsetid strand start stop transcript_cluster_id exon_id crosshyb_type level
1 7892501     NA     0    0                     0       0             0    NA
2 7892502     NA     0    0                     0       0             0    NA
3 7892503     NA     0    0                     0       0             0    NA
4 7892504     NA     0    0                     0       0             0    NA
5 7892505     NA     0    0                     0       0             0    NA
  chrom type
1    NA    6
2    NA    7
3    NA    7
4    NA    7
5    NA    7

Maybe more interesting. What's this 'type' business?

> dbGetQuery(con, "select * from type_dict limit 5;")
  type                   type_id
1    1                      main
2    2             control->affx
3    3             control->chip
4    4 control->bgp->antigenomic
5    5     control->bgp->genomic

Now that looks like some reasonable info. What different types are there?

> dbGetQuery(con, "select * from type_dict;")
  type                   type_id
1    1                      main
2    2             control->affx
3    3             control->chip
4    4 control->bgp->antigenomic
5    5     control->bgp->genomic
6    6            normgene->exon
7    7          normgene->intron
8    8  rescue->FLmRNA->unmapped

So it looks like pretty much everything but type 1 is a control of some type.

> tab <- dbGetQuery(con, "select * from featureSet;")
> table(tab$type)

     1      2      4      6      7
253002     57     45   1195   2904

So that's about 4200 control probesets (types 2, 4, 6 and 7). How to subset from here depends on the package you are using for analysis (oligo, affy, xps), so I won't go into that. But you can now get the IDs of the probesets you care about and use them to filter.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
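The reply stops once the control IDs are available, so here is a minimal sketch of the filtering step itself, in base R only. It is not from the original answer: the small `tab` and `expr` objects below are made-up stand-ins for the real `featureSet` query result and the real expression matrix, and the sketch assumes the row names of the expression data are `fsetid` values. Depending on the summarization level you used, the row names may instead be transcript cluster IDs, in which case the IDs would need to be mapped first.

```r
# Toy stand-in (hypothetical values) for:
#   tab <- dbGetQuery(con, "select fsetid, type from featureSet;")
tab <- data.frame(
  fsetid = c(7892501L, 7892502L, 7896736L, 7896738L, 7896740L),
  type   = c(6L, 7L, 1L, 1L, 1L)
)

# Toy expression matrix: rows are probesets, columns are arrays.
# Assumption: row names are fsetid values stored as character strings.
expr <- matrix(
  seq_len(10), nrow = 5,
  dimnames = list(as.character(tab$fsetid), c("array1", "array2"))
)

# IDs of every probeset whose type is known and is not 1 ("main")
is_control  <- !is.na(tab$type) & tab$type != 1L
control_ids <- as.character(tab$fsetid[is_control])

# Logical row index: TRUE for rows that are not controls
keep <- !(rownames(expr) %in% control_ids)

# Drop the control rows; drop = FALSE keeps the result a matrix
expr_main <- expr[keep, , drop = FALSE]

rownames(expr_main)
```

With a real ExpressionSet the same kind of logical index should carry over, building `keep` from `featureNames(eset)` and subsetting with `eset[keep, ]`, but check first that the IDs in your object really are the ones stored in the `fsetid` column.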