remove.dupEntrez from nsFilter{genefilter}
Guest User ★ 13k
@guest-user-4897
Dear List,

I seem to have a problem with the nsFilter function. For genes that are represented by more than one probe, it should keep the probe with the highest IQR and delete the others. In the example below, this does not seem to be the case. Any ideas why?

Best,
Klemens

    require(genefilter)
    con <- url('http://rdf.ait.ac.at/attachments/download/102/test.es')
    load(con)
    close(con)

    # The ExpressionSet test.es contains two probes for the AKT3 gene,
    # where A_23_P160354 is the one with the highest IQR.
    exprs(test.es)[fData(test.es)$ENTREZ %in% 10000, ]
    apply(exprs(test.es)[fData(test.es)$ENTREZ %in% 10000, ], 1, IQR)

    # However, if I apply nsFilter, the other probe is kept
    remDup <- nsFilter(test.es, var.filter=FALSE)$eset
    featureNames(remDup)[fData(remDup)$ENTREZ %in% 10000]
    exprs(remDup)[fData(remDup)$ENTREZ %in% 10000, ]

-- output of sessionInfo():

    > sessionInfo()
    R version 2.13.2 (2011-09-30)
    Platform: x86_64-pc-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
    [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
    [5] LC_TIME=English_United Kingdom.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] HsAgilentDesign026652.db_2.5.0 org.Hs.eg.db_2.5.0
    [3] RSQLite_0.10.0                 DBI_0.2-5
    [5] AnnotationDbi_1.16.11          genefilter_1.34.0
    [7] Biobase_2.12.2

    loaded via a namespace (and not attached):
    [1] annotate_1.30.1 IRanges_1.10.6  splines_2.13.2  survival_2.36-9
    [5] tools_2.13.2    xtable_1.6-0

--
Sent via the guest posting facility at bioconductor.org.
@james-w-macdonald-5106
Hi Klemens,

On 5/29/2012 1:45 PM, Klemens Vierlinger [guest] wrote:
> Dear List,
> I seem to have a problem with the nsFilter function. For genes which are represented by more than one probe, it should keep the probe with the highest IQR and delete the others. It seems to me that in the example below this is not the case.
>
> Any ideas why?

It has to do with how the IQR is calculated. Below, x is the IQR as computed by genefilter's internal rowIQRs function, and y is the IQR as computed by the IQR function with its defaults:

    > x <- genefilter:::rowIQRs(exprs(test.es))
    > names(x) <- featureNames(test.es)
    > y <- apply(exprs(test.es), 1, IQR)
    > names(y) <- featureNames(test.es)
    > z <- get("AKT3", revmap(HsAgilentDesign026652SYMBOL))
    > data.frame(x[z], y[z], row.names = z)
                     x.z.     y.z.
    A_23_P160354 2.162743 2.096952
    A_24_P110983 2.177948 2.071818

So as far as genefilter is concerned, A_24_P110983 has a higher IQR.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
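To illustrate how two IQR definitions can disagree about which probe varies more, here is a minimal sketch using base R only. The matrix and the probe names probeA and probeB are made-up toy data, not values from test.es, and quantile type 3 is used only as an example of a non-default definition; it is not claimed to be what rowIQRs does in general.

```r
# Toy expression matrix: two hypothetical probes measured on five samples.
expr <- rbind(
  probeA = c(1, 2, 3, 4, 10),
  probeB = c(0, 0.5, 2, 2.75, 3)
)

# IQR with R's default quantile definition (type = 7).
iqr_type7 <- apply(expr, 1, IQR)

# IQR with a different quantile definition (type = 3).
iqr_type3 <- apply(expr, 1, IQR, type = 3)

# Side-by-side comparison, in the same layout as the table above.
print(data.frame(type7 = iqr_type7, type3 = iqr_type3))

# The probe with the largest IQR depends on the definition used.
names(which.max(iqr_type7))
names(which.max(iqr_type3))
```

With these toy values the default definition ranks probeB higher, while type 3 ranks probeA higher, which is the same kind of reversal seen for the two AKT3 probes.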
@martin-morgan-1513
On 05/29/2012 10:45 AM, Klemens Vierlinger [guest] wrote:
>
> Dear List,
> I seem to have a problem with the nsFilter function. For genes which are represented by more than one probe, it should keep the probe with the highest IQR and delete the others. It seems to me that in the example below this is not the case.
>
> Any ideas why?

Hi Klemens, nice question.

When I look at ?nsFilter and selectMethod(nsFilter, "ExpressionSet"), I see IQRs calculated as

    > iqr = genefilter:::rowIQRs(exprs(test.es))
    > iqr[fData(test.es)$ENTREZ %in% 10000]
    [1] 2.162743 2.177948

When I look at ?IQR and ?quantile, I see that there are 9 quantile types, and hence 9 types of IQR, from which I could choose. With type=3 I get

    > apply(exprs(test.es)[fData(test.es)$ENTREZ %in% 10000, ], 1, IQR, type=3)
    A_23_P160354 A_24_P110983
        2.162743     2.177948

which matches the rowIQRs values above. Apparently the recommended type is 8, and R defaults to 7. I could use

    > remDup <- nsFilter(test.es, var.filter=FALSE, var.func=IQR)$eset
    > featureNames(remDup)[fData(remDup)$ENTREZ %in% 10000]
    [1] "A_23_P160354"

or

    var.func = function(x) IQR(x, type=8)

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024
Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
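The effect of swapping the variance function can be sketched without genefilter. The helper keep_top_probe below is hypothetical and only imitates the idea of keeping one probe per gene; it is not nsFilter's implementation, and the matrix, probe names, and gene IDs are toy values chosen for illustration.

```r
# Toy data: three hypothetical probes; the first two map to the same gene ID.
expr <- rbind(
  probeA = c(1, 2, 3, 4, 10),
  probeB = c(0, 0.5, 2, 2.75, 3),
  probeC = c(5, 5, 6, 7, 8)
)
gene_ids <- c("10000", "10000", "207")

# Hypothetical helper: for each gene, keep the probe whose rows score
# highest under var_func (the IQR with R's defaults unless told otherwise).
keep_top_probe <- function(expr, gene_ids, var_func = IQR) {
  stat <- apply(expr, 1, var_func)        # one statistic per probe
  by_gene <- split(stat, gene_ids)        # names (probe IDs) are preserved
  keep <- vapply(by_gene,
                 function(s) names(s)[which.max(s)],
                 character(1))
  expr[unname(keep), , drop = FALSE]
}

# Default IQR (quantile type 7) versus an explicitly chosen type.
kept_default <- rownames(keep_top_probe(expr, gene_ids))
kept_type3   <- rownames(keep_top_probe(expr, gene_ids,
                                        function(x) IQR(x, type = 3)))
print(kept_default)
print(kept_type3)
```

In this toy case the probe retained for gene "10000" changes with the function passed in, which is why stating var.func explicitly, as suggested above, makes the filtering result match the IQR definition used when checking it by hand.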