ExpressionSet subsetting problem
@iain-gallagher-2532
United Kingdom
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20080411/22e145e2/attachment.pl
@martin-morgan-1513
United States
Hi IAIN --

IAIN GALLAGHER <iaingallagher at btopenworld.com> writes:

> Hi Everyone.
>
> I'm having a problem subsetting an ExpressionSet. After reading my
> cel files in and summarizing with MAS5 I assign a new
> AnnotatedDataFrame to describe the data. This is a tab delimited
> text file in the following format:
[snip]
> pheno <- read.AnnotatedDataFrame('covdesc.txt', sep='\t')
> phenoData(mas_data) <- pheno

Probably the problem is here, where your new AnnotatedDataFrame has
samples ordered differently from mas_data. Try
validObject(mas_data). Here's a reproducible example

  > data(sample.ExpressionSet)
  > obj <- sample.ExpressionSet
  > pd <- phenoData(obj)
  > newPd <- pd[sample(sampleNames(pd)),]
  > phenoData(obj) <- newPd
  > validObject(obj)
  Error in validObject(obj) :
    invalid class "ExpressionSet" object: sampleNames differ between assayData and phenoData

If I were to have newPd, and wanted to make sure the assignment were
correct, I might

  > data(sample.ExpressionSet)
  > obj <- sample.ExpressionSet
  > phenoData(obj) <- newPd[sampleNames(obj),]
  > validObject(obj)

The reason for this dangerous behavior traces back to the need to
sometimes create transiently invalid objects in the process of
transforming from one ExpressionSet to another.

Martin

> This seems to go well.
>
> I now create an index to pull out only those subjects with 'Pancreas' under 'Site'.
>
> panc_index <- which(phenoData(mas_data)$Site == 'Pancreas')
>
> This returns a vector of numbers
>
> 1 3 4 15 23 28 29
>
> Now I subset my data with this
>
> kept_data <- mas_data[,panc_index]
>
> This is where I'm running into problems
>
>> head(exprs(panc_pts))
>             F100.CEL   F105.CEL   F106.CEL   F45.CEL    F57.CEL    F97.CEL
> 1007_s_at 1853.75910 2834.19034 1865.65600 869.44930 1307.60507 2006.37103
> 1053_at    811.05343  517.32617  519.08446 490.94832  582.09189  544.34508
> 117_at      78.34070   26.91147   93.21263 129.14469  241.32762   31.05214
> 121_at     419.79056  494.92934  685.06496 478.36533  661.30741  591.22300
> 1255_g_at   84.53744   18.25635   76.71271  44.79287   69.42122   99.33932
> 1294_at    329.38568  447.23030  529.64516 369.30509  487.00975  339.38840
>              F99.CEL
> 1007_s_at 1168.56112
> 1053_at    425.16363
> 117_at      18.87988
> 121_at     511.47964
> 1255_g_at   54.36606
> 1294_at    372.36992
>
> looks ok but whilst subjects 1,3 & 4 are pulled out appropriately
> (F100, F105 and F106 respectively) the next two subjects are not.
> F45 is sample number 14 not 15 and F57 is sample number 22 not 23.
> The last two samples (F97 and F99) are pulled out properly.
>
> Could anyone explain why this is? I'd be most grateful.
>
> Thanks
>
> iain
>
>> sessionInfo()
> R version 2.6.2 (2008-02-08)
> i486-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines tools stats graphics grDevices utils datasets
> [8] methods base
>
> other attached packages:
> [1] simpleaffy_2.14.05 gcrma_2.10.0 matchprobes_1.10.0
> [4] genefilter_1.16.0 survival_2.34 hgu133plus2cdf_2.0.0
> [7] affy_1.16.0 preprocessCore_1.0.0 affyio_1.6.1
> [10] Biobase_1.16.2
>
> loaded via a namespace (and not attached):
> [1] annotate_1.16.1 AnnotationDbi_1.0.6 DBI_0.2-4
> [4] rcompgen_0.1-17 RSQLite_0.6-7
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793
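To see why a positional index built from misordered annotation picks the wrong columns, the following base-R sketch may help. It does not use Biobase: a plain matrix stands in for exprs() and a data.frame for the phenotype table, and the sample names and Site values are made up for illustration (they are not the data from the question). It only mirrors the idea in the reply above of reordering the annotation by sample name before relying on it.

```r
# Toy expression matrix: 3 probes x 5 samples; column names are the sample names.
# (Hypothetical values, standing in for exprs() of an ExpressionSet.)
expr <- matrix(
  seq_len(15), nrow = 3,
  dimnames = list(
    paste0("probe", 1:3),
    c("F100.CEL", "F105.CEL", "F45.CEL", "F57.CEL", "F99.CEL")
  )
)

# Hypothetical annotation table whose rows are NOT in the same order
# as the columns of the expression matrix.
pheno <- data.frame(
  Site = c("Pancreas", "Liver", "Pancreas", "Liver", "Pancreas"),
  row.names = c("F100.CEL", "F45.CEL", "F105.CEL", "F99.CEL", "F57.CEL")
)

# Diagnosis: the two objects describe the same samples, but in a different order.
same_set   <- setequal(rownames(pheno), colnames(expr))
same_order <- identical(rownames(pheno), colnames(expr))

# A positional index computed from the misordered table points at the wrong columns.
wrong_idx     <- which(pheno$Site == "Pancreas")
wrong_samples <- colnames(expr)[wrong_idx]

# Fix: reorder the annotation rows by sample name first, then build the index.
pheno_aligned <- pheno[colnames(expr), , drop = FALSE]
right_idx     <- which(pheno_aligned$Site == "Pancreas")
right_samples <- colnames(expr)[right_idx]

print(wrong_samples)
print(right_samples)
```

With a real ExpressionSet, the corresponding checks would presumably compare sampleNames() of the new AnnotatedDataFrame with sampleNames() of the object before assigning, as in the newPd[sampleNames(obj),] line of the reply.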