Are you using expresso that way because it's what you want, or because that is what is in the help page? Normally you would just use rma instead, as it's the de facto method for Affy arrays and is arguably the best method to use. If you plan to publish, saying you did something else is problematic because the reviewers are going to wonder why you didn't use rma, which you will then have to explain/defend.
Anyway, a couple of things. If you find yourself using the @ operator, you should rethink what you are doing. I mean it works, but it is not meant to be used directly like that. The main idea behind the various S4 methods in Bioconductor is to abstract away all the details of the underlying object so you don't have to mess with low-level accessors or understand the underlying data structure. Usually there is a public method that uses the @ accessor for you, and which is meant to allow the underlying representation to change without requiring that you understand all the details. In this case, if you really need to add featureData to your ExpressionSet, you can simply use the featureData accessor.
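As a rough sketch of what that looks like, here is a toy ExpressionSet built from a made-up matrix, with the annotation attached through the featureData replacement method rather than the @ slot. The probeset IDs, sample names, and gene symbols are all invented for illustration; the only requirement I'm assuming is that the row names of the annotation data.frame match featureNames(eset).

```r
library(Biobase)  # ExpressionSet, AnnotatedDataFrame, featureData<-, fData

# Toy expression matrix: 3 probesets x 2 samples (values are made up)
exprs_mat <- matrix(
  c(7.1, 7.3,
    5.2, 5.0,
    9.8, 9.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ps1", "ps2", "ps3"), c("sampleA", "sampleB"))
)
eset <- ExpressionSet(assayData = exprs_mat)

# Hypothetical annotation table; row names must match featureNames(eset)
annot <- data.frame(
  SYMBOL = c("GENE1", "GENE2", "GENE3"),
  row.names = c("ps1", "ps2", "ps3"),
  stringsAsFactors = FALSE
)

# Public replacement method, instead of eset@featureData <- ...
featureData(eset) <- AnnotatedDataFrame(annot)

# Read the annotation back through the public accessor
fData(eset)
```

Going through the replacement method also means the object gets validity-checked, so mismatched feature names are reported instead of silently producing a broken object.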
It's also probably easier to build an annotation package for your array, which will then allow you to annotate easily. You need the annotation CSV file, which I assume you already have.
> library(AnnotationForge) ## provides makeChipPackage
> df <- read.csv("Xcel.na36.annot.csv", comment.char = "#")
> probF <- df[,c("Probe.Set.ID","Entrez.Gene")]
> names(probF) <- c("probes","genes")
> makeChipPackage("xcel", probF, "org.Hs.eg.db", "0.0.1", "me <me@mine.org>", "me", ".", "9606", "Homo", "sapiens")
Populating genes table:
Populating map_metadata table:
probes table filled and indexed
table metadata filled
table metadata filled
Creating package in ./xcel.db
Now deleting temporary database file
[1] "./xcel.db"
> install.packages("xcel.db", repos = NULL, type = "source") ## only need type argument if on Windows, as I am
> library(xcel.db)
> library(affy) ## provides ReadAffy and rma
> abatch <- ReadAffy() ## here I am using some Xcel files I got from GEO
> eset <- rma(abatch)
> library(affycoretools)
> eset <- annotateEset(eset, xcel.db)
> head(fData(eset))
                PROBEID ENTREZID SYMBOL
200000_s_at 200000_s_at    10594  PRPF8
200001_at     200001_at      826 CAPNS1
200002_at     200002_at    11224  RPL35
200003_s_at 200003_s_at     6158  RPL28
200004_at     200004_at     1982 EIF4G2
200005_at     200005_at     8664  EIF3D
                                                        GENENAME
200000_s_at                         pre-mRNA processing factor 8
200001_at                                calpain small subunit 1
200002_at                                  ribosomal protein L35
200003_s_at                                ribosomal protein L28
200004_at     eukaryotic translation initiation factor 4 gamma 2
200005_at   eukaryotic translation initiation factor 3 subunit D
Unless you show code it's impossible to help you.
But in general you should not be making separate AnnotatedDataFrames and trying to put them into an AffyBatch. You might want to annotate after running RMA, but usually not before.
raw.data <- ReadAffy(celfile.path = celpath, phenoData = AnnotatedDataFrame)
raw.data@featureData <- new("AnnotatedDataFrame", data = annotate)
eset <- expresso(raw.data, bgcorrect.method = "rma", normalize = "constant", summary.method = "avgdiff", pmcorrect.method = "pmonly")
This is the code I have been using. Sorry for not including it before.
Sorry, I should have said I am also using the makecdfenv package, as per:
make.cdf.package("Xcel", packagename = "xcelcdf", cdf.path = "path", package.path = , species = "Homo_sapiens")
cdf <- read.cdffile("path")
In addition, this runs fine with no errors, but it does not give me gene-level expression data, only probe-level expression data. What am I doing wrong, and how do I get the probesets summarised into gene-level data?