Limma: Nimblegen array data import?


Dave Berger ▴ 30

@dave-berger-2444

Last seen 9.6 years ago

Previously, I have used limma for two-colour array analysis. I now have a new data set from NimbleGen arrays for which RMA normalization has already been completed, and I wish to identify differentially expressed genes from the allcalls.txt file, which is a single table of expression values for all the treatments.

Question: if I wish to do a single-channel analysis in limma, how should I import this data, i.e. how do I convert the data to an "exprSet" object?

Thanks,
Dave Berger

limma convert • 1.2k views

updated 16.4 years ago by Martin Morgan 25k • written 16.4 years ago by Dave Berger ▴ 30


Martin Morgan 25k

@martin-morgan-1513

Last seen 7 hours ago

United States

Mark, Dave,

It's true that lmFit works with a basic matrix, but it's not too hard to create an ExpressionSet.

1. Read in the expression data (you'll have to do this anyway):

    > dataDirectory <- system.file("extdata", package="Biobase")
    > exprsFile <- file.path(dataDirectory, "exprsData.txt")
    > exprs <- as.matrix(read.table(exprsFile, header=TRUE, sep="\t",
    +                               row.names=1,
    +                               as.is=TRUE))

then create an ExpressionSet:

    > library(Biobase)
    > mySet1 <- new("ExpressionSet", exprs=exprs)

That's it. Why bother? Because you can now start to let the software do your work for you, reducing errors and improving reproducibility. For instance, almost all experiments have data that describe the phenotypes of the samples ('phenotypes' broadly defined to mean characteristics, phenotypic, genetic, or otherwise). Here's some phenotypic data that we can use to capture the sample description as an AnnotatedDataFrame.

2. Read in phenotypic data:

    > pDataFile <- file.path(dataDirectory, "pData.txt")
    > pData <- read.table(pDataFile,
    +                     row.names=1, header=TRUE, sep="\t")

3. Create an annotated data frame:

    > phenoData <- new("AnnotatedDataFrame", data=pData)

We can then add that to the existing ExpressionSet

    > phenoData(mySet1) <- phenoData

or, if we've thought ahead, create an ExpressionSet directly

    > mySet2 <- new("ExpressionSet", exprs=exprs, phenoData=phenoData)

Why is this helpful? It coordinates the sample and phenotype information, so, e.g., if we subset the samples, we also subset the relevant phenoData:

    > dim(mySet2)
    Features  Samples
         500       26
    > dim(mySet2[,mySet2$gender=="Male"])
    Features  Samples
         500       15

It also helps us to avoid, e.g., mismatches between sample and phenotype data:

    > badData <- pData[sample(rownames(pData), nrow(pData)),]
    > badPhenoData <- new("AnnotatedDataFrame", data=badData)
    > mySet3 <- new("ExpressionSet", exprs=exprs, phenoData=badPhenoData)
    Error in validObject(.Object) :
      invalid class "ExpressionSet" object: sampleNames differ between assayData and phenoData

Coordinating expression and phenotype data, and avoiding subtle errors, seem like good reasons to start down the ExpressionSet road. There's a more comprehensive introduction in the Biobase vignette 'An introduction to Biobase and ExpressionSets':

    > openVignette()
    Please select a vignette:

    1: Biobase - An introduction to Biobase and ExpressionSets
    2: Biobase - Bioconductor Overview
    3: Biobase - esApply Introduction
    4: Biobase - Notes for eSet developers
    5: Biobase - Notes for writing introductory 'how to' documents
    6: Biobase - quick views of eSet instances
    7: limma - Limma Vignette

    Selection: 1

Martin
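For a single tab-delimited table such as the allcalls.txt file mentioned in the question, the same steps might look like the sketch below. The file layout (probe IDs in the first column, one column per array), the sample names, and the `treatment` column are assumptions made up for illustration, not the actual NimbleGen format. The Biobase step only runs if the package is installed.

```r
# Hypothetical stand-in for a table of RMA-normalized values, written to a
# temporary file so the example is self-contained (layout is assumed).
exprs_file <- tempfile(fileext = ".txt")
writeLines(c(
  "ProbeID\tctrl_1\tctrl_2\ttrt_1\ttrt_2",
  "probe_A\t7.1\t7.3\t9.0\t9.2",
  "probe_B\t5.0\t5.1\t5.2\t4.9",
  "probe_C\t8.4\t8.2\t6.1\t6.3"
), exprs_file)

# Read the table into a numeric matrix: rows are probes, columns are arrays.
exprs_mat <- as.matrix(read.table(exprs_file, header = TRUE, sep = "\t",
                                  row.names = 1, as.is = TRUE))

# Hypothetical sample annotation; row names must match the matrix columns.
pheno <- data.frame(
  treatment = c("control", "control", "treated", "treated"),
  row.names = c("ctrl_1", "ctrl_2", "trt_1", "trt_2")
)

# Same consistency check that ExpressionSet validity enforces.
stopifnot(identical(colnames(exprs_mat), rownames(pheno)))

if (requireNamespace("Biobase", quietly = TRUE)) {
  eset <- Biobase::ExpressionSet(
    assayData = exprs_mat,
    phenoData = Biobase::AnnotatedDataFrame(pheno)
  )
  print(dim(Biobase::exprs(eset)))
}
```

Keeping the annotation in a data frame whose row names equal the matrix column names means a shuffled or incomplete sample sheet is caught before any model is fitted.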

16.4 years ago by Martin Morgan 25k


Mark Robinson ★ 1.1k

@mark-robinson-2171

Last seen 9.6 years ago

How about just running limma on the table of normalized expression values?

According to ?lmFit, the object it operates on doesn't have to be an exprSet.

M.
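As a rough illustration of this suggestion, the sketch below passes a plain numeric matrix straight to lmFit. The expression values are simulated, and the sample names and two-group layout are assumptions for the example only. The limma calls run only if the package is installed.

```r
set.seed(1)

# Simulated log2 expression values: 100 probes x 6 arrays (illustrative only).
y <- matrix(rnorm(600, mean = 8), nrow = 100,
            dimnames = list(paste0("probe_", 1:100),
                            c("ctrl_1", "ctrl_2", "ctrl_3",
                              "trt_1", "trt_2", "trt_3")))

# One entry per column of y, in the same order as the columns.
group <- factor(rep(c("control", "treated"), each = 3))

# Intercept plus a treated-vs-control coefficient named "grouptreated".
design <- model.matrix(~ group)

if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(y, design)   # a numeric matrix is accepted directly
  fit <- limma::eBayes(fit)        # moderated t-statistics
  top <- limma::topTable(fit, coef = "grouptreated", number = 5)
  print(top)
}
```

The only link between the matrix and the design is column order, so it is worth checking that `nrow(design)` equals `ncol(y)` and that the group labels are listed in the same order as the arrays.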

16.4 years ago by Mark Robinson ★ 1.1k


Mark Robinson ★ 1.1k

@mark-robinson-2171

Last seen 9.6 years ago

Dave,

I encourage you to read the limma documentation. This is cut and pasted from a not-so-recent version of the limma User's Guide (it may be the same in the current one; I haven't checked):

    > design <- model.matrix(~ -1+factor(c(1,1,1,2,2,3,3,3)))
    > colnames(design) <- c("group1", "group2", "group3")
    > fit <- lmFit(eset, design)

In your case, just replace 'eset' with your table of intensities and modify the factor in the model.matrix call so that it matches the columns in your situation. You could probably define a targets file as well.

Cheers,
Mark

On 29/11/2007, at 3:12 PM, Dave Berger wrote:

> Hi Mark
> how do I define for limma which column is which treatment? Do I need to define this using a targets file?
> thanks
> Dave
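One way to make the column-to-treatment mapping explicit is to build the design from a small targets-style table rather than typing the group codes inline. This is a minimal sketch in base R: the `SampleName` and `Treatment` columns and the array names are hypothetical, and the table is assumed to list arrays in the same order as the columns of the expression table.

```r
# Hypothetical targets table: one row per array, in the same order as the
# columns of the expression table.
targets <- data.frame(
  SampleName = paste0("array_", 1:8),
  Treatment  = c("group1", "group1", "group1",
                 "group2", "group2",
                 "group3", "group3", "group3"),
  stringsAsFactors = FALSE
)

treatment <- factor(targets$Treatment)

# No-intercept design: one indicator column per treatment group.
design <- model.matrix(~ 0 + treatment)
colnames(design) <- levels(treatment)
rownames(design) <- targets$SampleName

print(design)
```

The resulting matrix has the same entries as the `model.matrix(~ -1+factor(c(1,1,1,2,2,3,3,3)))` design quoted above, but the group membership now comes from a table that can be kept in a file alongside the data.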

16.4 years ago by Mark Robinson ★ 1.1k
