How to create a MAList or ExprSet object from a matrix


swang ▴ 120

@swang-1798

Last seen 9.6 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060815/ d40b174e/attachment.pl


ADD COMMENT • link updated 17.7 years ago by Francois Pepin ★ 1.3k • written 17.7 years ago by swang ▴ 120


Martin Morgan 25k

@martin-morgan-1513

Last seen 3 hours ago

United States

swang <swang2000 at gmail.com> writes:

> Dear List:
>
> I got a file like the following, I guess the data is M ( log2 expression
> ratio) from microarray:
>
> 56071 1052 1062 3061 3081 8052 8072 10061 10062 10072 1415670_at
> 8.430148 8.899385 8.625973 8.708319 8.759182 8.281378 8.905347 8.625347
>
> the rows are Affymetrix probe and columns are different mice number (arrays)
> I need to do a category analysis using category package, so I need to
> generate a MAList or ExprSet object.

Starting with a data matrix

> samples <- 3
> sampleNames <- letters[1:samples]
> features <- 1000
> ## raw data
> exprMatrix <- matrix(0, ncol=samples,
+                      nrow=features,
+                      dimnames=list(1:features, sampleNames))

To create an old-style exprSet (not sure what an ExprSet is, or which package you mean by Category ;):

> ## phenoData for exprSet
> pd2 <- new("phenoData",
+            pData=data.frame(1:samples,
+                             row.names=sampleNames),
+            varLabels=list(id="sample identifier"))
> new("exprSet",
+     phenoData=pd2,
+     exprs=exprMatrix)
Expression Set (exprSet) with
        1000 genes
        3 samples
         phenoData object with 1 variables and 3 cases
         varLabels
                id: sample identifier

To create an ExpressionSet object (using this will require different commands from the vignette that comes with Category):

> ## phenoData for ExpressionSet
> pd1 <- new("AnnotatedDataFrame",
+            data=
+            data.frame(sampleId=1:samples,
+                       row.names=sampleNames),
+            varMetadata=
+            data.frame(labelDescription=I(c("Sample numeric identifier")),
+                       row.names=c("sampleId")))
> new("ExpressionSet",
+     phenoData=pd1, exprs=exprMatrix)
Instance of ExpressionSet

assayData
  Storage mode: lockedEnvironment
  featureNames: 1, 2, 3, ..., 999, 1000 (1000 total)
  Dimensions:
        exprs
Rows     1000
Samples     3

phenoData
  sampleNames: a, b, c
  varLabels and descriptions:
    sampleId: Sample numeric identifier

Experiment data
  Experimenter name:
  Laboratory:
  Contact information:
  Title:
  URL:
  PMIDs:
  No abstract available.

Annotation character(0)

Much of the functionality of exprSet and ExpressionSet comes from associating phenoData with expression values; the skeletons above do not have any meaningful phenoData. Typically you might incorporate this by reading phenotypic data from a spreadsheet or tab-delimited file (e.g., using read.table) into data.frames, and then incorporating the data.frame into an ExpressionSet as outlined above.

> sessionInfo()
Version 2.3.1 Patched (2006-06-20 r38364)
x86_64-unknown-linux-gnu

attached base packages:
[1] "tools"     "methods"   "stats"     "graphics"  "grDevices" "utils"
[7] "datasets"  "base"

other attached packages:
 Biobase
"1.10.1"

Martin
--
Bioconductor

ADD COMMENT • link 17.7 years ago Martin Morgan 25k
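
As an illustration of the last paragraph of Martin's reply, here is a minimal sketch of attaching phenotypic data read with read.table to an expression matrix. The phenotype columns (strain, treatment), their values, and the inline text standing in for a tab-delimited file are made up for the example. The ExpressionSet() and AnnotatedDataFrame() constructor calls assume a current Biobase rather than the 1.10.1 release shown above, and that step is skipped if Biobase is not installed.

```r
## Toy expression matrix: 4 features x 3 samples (values are placeholders)
exprMatrix <- matrix(rnorm(12, mean = 8), nrow = 4,
                     dimnames = list(paste0("probe", 1:4), c("a", "b", "c")))

## Hypothetical phenotype table; with a real file use read.table("yourfile.txt", ...)
phenoText <- "sample\tstrain\ttreatment
a\tB6\tcontrol
b\tB6\ttreated
c\tDBA\tcontrol"
pheno <- read.table(text = phenoText, header = TRUE, sep = "\t",
                    row.names = "sample", stringsAsFactors = FALSE)

## Rows of the phenotype table must match the matrix columns, in the same order
pheno <- pheno[colnames(exprMatrix), , drop = FALSE]
stopifnot(identical(rownames(pheno), colnames(exprMatrix)))

## Assumes a current Biobase; skipped when the package is unavailable
if (requireNamespace("Biobase", quietly = TRUE)) {
  eset <- Biobase::ExpressionSet(
    assayData = exprMatrix,
    phenoData = Biobase::AnnotatedDataFrame(pheno))
  print(eset)
}
```

Reordering the phenotype rows by the matrix column names before building the object is the important step: a mismatch between the two is the usual reason this kind of construction fails.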


Further to Martin's email, this code might be useful to you for what looks like your probe set information after normalization.

Marcus

tmp <- scan(what=character(0))
56071 1052 1062 3061 3081 8052 8072 10061 10062 10072
1415670_at 8.430148 8.899385 8.625973 8.708319 8.759182 8.281378 8.905347 8.625347 9.029528
1415671_at 9.039655 9.244914 9.121714 9.002296 8.97237 8.599152 9.004381 9.267188 9.115415
1415672_at 8.86041 8.998826 9.077138 8.994297 8.885136 8.918512 9.087072 8.867808 8.841663
1415673_at 6.565344 6.384893 6.856466 6.17951 5.786523 6.507357 6.371563 5.886887 6.42499
1415674_a_at 7.877212 8.038635 8.120319 8.067843 7.56546 7.846677 7.921398 7.629843 7.787807
1415675_at 7.524559 7.496189 7.718928 7.164805 7.102158 7.331314 7.226036 7.424044 7.368011
1415676_a_at 9.315694 9.134394 9.224642 8.821193 8.886963 8.702572 8.883647 9.028728 8.921372

# The blank line above ends the scan() input
# Remove the first 10 entries (they look like slide IDs); use placeholder IDs for the 9 columns
SlideIDs <- LETTERS[1:9]
tmp <- tmp[-(1:10)]
# Index and get the annotation
IDindex <- seq(1, length(tmp), by=10)
probeIDs <- tmp[ IDindex ]
# Construct a matrix of expressions
expressions <- matrix(as.numeric(tmp[(!seq(tmp)%in%IDindex)]), ncol=10-1, byrow=TRUE)
# Check names ok
rownames(expressions) <- probeIDs
pd <- new("phenoData", pData=data.frame(Slide=1:(10-1), row.names=SlideIDs),
          varLabels=list(Slide="Slide identifiers"))
eset <- new("exprSet", phenoData=pd, exprs=expressions)

ADD REPLY • link 17.7 years ago Marcus Davy ▴ 680
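
The same layout can also be parsed without pasting into an interactive scan() prompt. The sketch below is one way to do it in base R. It assumes, as Marcus's code does, that the first line holds slide IDs and can be skipped, since it has ten entries for only nine value columns. The inline text stands in for a file on disk and uses just two of the rows from the post.

```r
## Two rows of the posted data; with a real file, pass its path instead of `text =`
raw <- "56071 1052 1062 3061 3081 8052 8072 10061 10062 10072
1415670_at 8.430148 8.899385 8.625973 8.708319 8.759182 8.281378 8.905347 8.625347 9.029528
1415671_at 9.039655 9.244914 9.121714 9.002296 8.97237 8.599152 9.004381 9.267188 9.115415"

## Skip the ID line; the first column (probe IDs) becomes the row names
tab <- read.table(text = raw, skip = 1, header = FALSE, row.names = 1)
expressions <- as.matrix(tab)

## Placeholder sample names, mirroring the LETTERS[1:9] choice above
colnames(expressions) <- LETTERS[seq_len(ncol(expressions))]

dim(expressions)        # 2 probes x 9 samples
rownames(expressions)
```

Which nine of the ten header IDs belong to the nine value columns cannot be recovered from the post, so the real sample names would have to come from whoever produced the file.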


Francois Pepin ★ 1.3k

@francois-pepin-1012

Last seen 9.6 years ago

Hi Shiliang,

As a note, I would point out that your data are almost certainly not expression ratios (although they do look like log2 values). Affy chips give absolute intensity levels, as opposed to the ratios that one generally gets from two-color arrays. Expression ratios (M values) would be centered around 0 for the majority of genes, which are not differentially expressed.

The steps that Martin Morgan gave you to create an exprSet give you the object you want, but you should really make sure that you properly understand what your data are. Otherwise, you will likely have serious problems interpreting the results at the end.

Francois

On Tue, 2006-08-15 at 14:07 -0400, swang wrote:
> Dear List:
>
> I got a file like the following, I guess the data is M ( log2 expression
> ratio) from microarray:
>
> 56071 1052 1062 3061 3081 8052 8072 10061 10062 10072 1415670_at
> 8.430148 8.899385 8.625973 8.708319 8.759182 8.281378 8.905347 8.625347
> 9.029528 1415671_at 9.039655 9.244914 9.121714 9.002296 8.97237 8.599152
> 9.004381 9.267188 9.115415 1415672_at 8.86041 8.998826 9.077138 8.994297
> 8.885136 8.918512 9.087072 8.867808 8.841663 1415673_at 6.565344 6.384893
> 6.856466 6.17951 5.786523 6.507357 6.371563 5.886887 6.42499 1415674_a_at
> 7.877212 8.038635 8.120319 8.067843 7.56546 7.846677 7.921398 7.629843
> 7.787807 1415675_at 7.524559 7.496189 7.718928 7.164805 7.102158 7.331314
> 7.226036 7.424044 7.368011 1415676_a_at 9.315694 9.134394 9.224642 8.821193
> 8.886963 8.702572 8.883647 9.028728 8.921372
>
> the rows are Affymetrix probe and columns are different mice number (arrays)
> I need to do a category analysis using category package, so I need to
> generate a MAList or ExprSet object.
> Is there anybody who can tell me how to do it?
>
> thanks
>
> Best
>
> Shiliang

ADD COMMENT • link 17.7 years ago Francois Pepin ★ 1.3k
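
Francois's point can be checked numerically. The sketch below uses the first two probes from the post: log2 ratios from a two-color array would scatter around 0, whereas these values sit between roughly 8 and 9.3, which is consistent with log2 intensities. The last lines show one way to derive M-like values from single-channel data by subtracting a reference array. Using the first column as that reference is an arbitrary assumption for illustration only, not a recommended analysis.

```r
## First two probes from the posted data
x <- rbind(
  "1415670_at" = c(8.430148, 8.899385, 8.625973, 8.708319, 8.759182,
                   8.281378, 8.905347, 8.625347, 9.029528),
  "1415671_at" = c(9.039655, 9.244914, 9.121714, 9.002296, 8.97237,
                   8.599152, 9.004381, 9.267188, 9.115415))

## Log2 ratios (M values) would be centred near 0; these are not
median(x)
range(x)

## M-like values relative to an arbitrary reference array (column 1):
## each row has its own column-1 value subtracted from the other columns
M <- x[, -1] - x[, 1]
round(apply(M, 1, median), 3)
```

After the subtraction the per-probe medians fall close to 0, which is the behaviour Francois describes for genuine ratios.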
