filtering on external genelist
Dion Lepp
@dion-lepp-2369
Oleg Moskvin <ovm at ...> writes:

> Colleagues,
>
> I think this should be pretty simple task but I cannot find an appropriate
> package for that.
> I need to generate a subset of eSet object which contains certain probesets
> indicated in an external genelist (outside R environment).
>
> I.e. this procedure should look like this:
>
> mylist <- read.table .....
> filtered.eset <- someFunction(eSet, mylist)
>
> Probably this is already implemented somewhere.
> Any hints will be appreciated.
>
> All the best,
>
> Oleg

I have the exact same question. I am working with 2-color data in limma, however. I'd like to be able to make a table of M-values corresponding to a list of geneIDs from an external table. Any help is appreciated.

Thanks,

D
@martin-morgan-1513
"James W. MacDonald" <jmacdon at med.umich.edu> writes:

> D wrote:
>> I have the exact same question. I am working with 2-color data in limma
>> however. I'd like to be able to make a table of Mvalues corresponding to a list
>> of geneIDs from an external table. Any help is appreciated.
>
> That is not the same question, really. Your question should be easily
> answered by reading 'An Introduction to R', as that is a simple
> subsetting problem.

It may be helpful to know that MALists contain, or can be made to contain (e.g., when reading in the original data files), whatever information the manufacturer might provide in terms of additional annotations. You might then do something like the following (the details depend entirely on how the MAList object was created):

  > idx <- ma$genes$Labels %in% c("EST1", "Actin")
  > ma1 <- ma[idx,]

This creates a logical index and then uses it for subsetting.

> The answer to the original question is also pretty simple. I don't know
> if this is documented somewhere, but I think the principle of least
> surprise applies here:
>
> mylist <- read.table("my_external_list")
> filtered.eset <- original.eset[mylist,]
>
> As an example:
>
> > library(fibroEset)
> > data(fibroEset)
> > thenames <- featureNames(fibroEset)[sample(1:12625, 300)]
> > subsetted.eset <- fibroEset[thenames,]
> > subsetted.eset
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 300 features, 46 samples
>   element names: exprs
> phenoData
>   sampleNames: 1, 2, ..., 46 (46 total)
>   varLabels and varMetadata:
>     samp: sample code
>     species: h: human, b: bonobo, g: gorilla
> featureData
>   rowNames: 37599_at, 34494_at, ..., 36333_at (300 total)
>   varLabels and varMetadata: none
> experimentData: use 'experimentData(object)'
>   pubMedIds: 12840040
> Annotation [1] "hgu95av2"

It is a bit trickier when thenames are not probesets. One can use the maps in the annotation package to get there, though, e.g., from SYMBOL:

  > library(hgu95av2)
  > rmap <- l2e(reverseSplit(as.list(hgu95av2SYMBOL)))
  > head(ls(rmap))
  [1] "2'-PDE" "3.8-1" "76P" "AADAC" "AAK1" "AAMP"
  > rmap[["AADAC"]]
  [1] "36512_at"
  > thenames <- head(ls(rmap)) # the symbols we're looking for?
  > mget(thenames, rmap)
  $`2'-PDE`
  [1] "38144_at"

  $`3.8-1`
  [1] "34934_at"

  $`76P`
  [1] "40985_g_at" "40986_s_at" "40984_at"

  $AADAC
  [1] "36512_at"

  $AAK1
  [1] "34949_at" "40628_at" "39456_at" "40572_at" "39463_at"

  $AAMP
  [1] "38434_at"

  > idx <- unique(unlist(mget(thenames, rmap), use.names=FALSE))
  > fibroEset[idx,]
  ExpressionSet (storageMode: lockedEnvironment)
  assayData: 12 features, 46 samples
    element names: exprs
  phenoData
    sampleNames: 1, 2, ..., 46 (46 total)
    varLabels and varMetadata description:
      samp: sample code
      species: h: human, b: bonobo, g: gorilla
  featureData
    featureNames: 38144_at, 34934_at, ..., 38434_at (12 total)
    fvarLabels and fvarMetadata description: none
  experimentData: use 'experimentData(object)'
    pubMedIds: 12840040
  Annotation: hgu95av2

This will be a bit simpler in the forthcoming release, where the AnnotationDbi package provides 'revmap'.

Martin

--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org
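The symbol-to-probeset lookup described in this reply can be sketched in base R, without loading an annotation package. This is only an illustration under stated assumptions: `probe2symbol` and `expr` are made-up stand-ins for `as.list(hgu95av2SYMBOL)` and for the expression data, the probe `1000_at` and the symbol `NOTAGENE` are invented for the example, and the expression values are random.

```r
# Hypothetical probe -> symbol map (stand-in for as.list(hgu95av2SYMBOL)).
probe2symbol <- c("36512_at"   = "AADAC",
                  "40985_g_at" = "76P",
                  "40986_s_at" = "76P",
                  "40984_at"   = "76P",
                  "38434_at"   = "AAMP",
                  "1000_at"    = NA)      # invented probe with no symbol

# Reverse the map: symbol -> character vector of probes.
# split() drops the NA group, so unannotated probes disappear here.
symbol2probe <- split(names(probe2symbol), probe2symbol)

# Toy expression matrix with probes as row names (random values).
set.seed(1)
expr <- matrix(rnorm(length(probe2symbol) * 3),
               nrow = length(probe2symbol),
               dimnames = list(names(probe2symbol), paste0("s", 1:3)))

# Symbols as they might come from the external list;
# "NOTAGENE" is deliberately not in the map.
wanted <- c("AADAC", "76P", "NOTAGENE")

# Keep only symbols present in the map, then flatten to unique probe IDs.
known <- intersect(wanted, names(symbol2probe))
idx <- unique(unlist(symbol2probe[known], use.names = FALSE))

# Subset rows by probe ID; drop = FALSE keeps a matrix even for one row.
sub <- expr[idx, , drop = FALSE]
```

With a real ExpressionSet the last step would be `fibroEset[idx,]`, as in the post. Filtering through `intersect()` first is a design choice that keeps an unknown symbol from stopping the lookup with an error.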
@james-w-macdonald-5106
D wrote:

> I have the exact same question. I am working with 2-color data in limma
> however. I'd like to be able to make a table of Mvalues corresponding to a list
> of geneIDs from an external table. Any help is appreciated.

That is not the same question, really. Your question should be easily answered by reading 'An Introduction to R', as that is a simple subsetting problem.

The answer to the original question is also pretty simple. I don't know if this is documented somewhere, but I think the principle of least surprise applies here:

  mylist <- read.table("my_external_list", stringsAsFactors = FALSE)
  # read.table returns a data frame; this assumes the probeset IDs are in its first column
  filtered.eset <- original.eset[mylist[[1]], ]

As an example:

  > library(fibroEset)
  > data(fibroEset)
  > thenames <- featureNames(fibroEset)[sample(1:12625, 300)]
  > subsetted.eset <- fibroEset[thenames,]
  > subsetted.eset
  ExpressionSet (storageMode: lockedEnvironment)
  assayData: 300 features, 46 samples
    element names: exprs
  phenoData
    sampleNames: 1, 2, ..., 46 (46 total)
    varLabels and varMetadata:
      samp: sample code
      species: h: human, b: bonobo, g: gorilla
  featureData
    rowNames: 37599_at, 34494_at, ..., 36333_at (300 total)
    varLabels and varMetadata: none
  experimentData: use 'experimentData(object)'
    pubMedIds: 12840040
  Annotation [1] "hgu95av2"

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
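For the "simple subsetting problem" itself, here is a minimal sketch of reading an external ID list and subsetting by row name. The file contents and the matrix `m` are invented for illustration; `m` stands in for whatever matrix holds the values (for example the M-values of a limma object), and the ID `not_on_array` is deliberately absent from it.

```r
# Write a small stand-in for the external gene list (one ID per line).
listfile <- tempfile(fileext = ".txt")
writeLines(c("37599_at", "34494_at", "not_on_array"), listfile)

# Read the IDs as a plain character vector rather than a data frame.
ids <- readLines(listfile)
ids <- unique(ids[nzchar(ids)])   # drop blank lines and duplicates

# Toy matrix with feature IDs as row names (random values).
set.seed(1)
m <- matrix(rnorm(12), nrow = 4,
            dimnames = list(c("37599_at", "34494_at", "36333_at", "1000_at"),
                            c("s1", "s2", "s3")))

# Indexing a matrix by a row name that is absent raises
# "subscript out of bounds", so keep only IDs that are present.
present <- intersect(ids, rownames(m))
absent  <- setdiff(ids, rownames(m))

# Rows come back in the order given by the external list.
filtered <- m[present, , drop = FALSE]
```

Reporting `absent` to the user is usually worthwhile, since IDs that silently fail to match often indicate a mismatch between the list and the array annotation. Character indexing of this kind is what `fibroEset[thenames,]` does in the example above.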