Preprocessing of Human Gene 2.0 ST microarrays with oligo R package and annotation options
svlachavas ▴ 830
@svlachavas-7225
Germany/Heidelberg/German Cancer Resear…

Dear Community,

I am currently analyzing in R a small number of CEL files (6 samples: 2 conditions, 3 biological replicates per condition) from Affymetrix Human Gene 2.0 ST arrays. This is my first time working with this type of gene chip. A relevant subset of my code is the following:

library(oligo)
library(affycoretools)
library(hugene20sttranscriptcluster.db)

library(limma)

library(pd.hugene.2.0.st)

setwd(mydir)

pdat <- read.table("pdat.project.txt",header=TRUE,stringsAsFactors = FALSE) # phenotype info

celfiles = list.celfiles()

affy.cels <- read.celfiles(celfiles)

identical(colnames(affy.cels),rownames(pdat)) # must be TRUE before attaching the phenotype info

pd <- AnnotatedDataFrame(data= pdat)
phenoData(affy.cels) <- pd
celfiles.rma <- rma(affy.cels, target="probeset")

Thus, my main questions are the following:

1) For the rma function, which is the most appropriate choice of the target argument for Gene ST arrays: "probeset" or "core"?

2) To remove the control probesets, can I use the function getMainProbes?

3) To annotate my probesets/transcripts with gene symbols in later limma steps (i.e. after topTable), should I first set:

annotation(eset.rma) <- "hugene20sttranscriptcluster.db"

and then query the above db with functions such as select?

 

Thank you in advance!

 

 

microarray oligo affycoretools pd.hugene.2.0.st hugene20sttranscriptcluster.db
@james-w-macdonald-5106
United States

1.) For the vast majority of users, the 'core' argument is the way to go. The Gene ST arrays are intended to measure transcript abundances, and the ability to summarize at the probeset level is really just due to the fact that they are based on the Exon ST platform.

2.) Yes. Why do you have doubts?

3.) If you are using affycoretools, it's easier to do

library(hugene20sttranscriptcluster.db)
eset.rma <- annotateEset(eset.rma, hugene20sttranscriptcluster.db)

And then your topTable will automatically contain annotation data.
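Putting the three answers above together, the preprocessing might be sketched roughly as follows. This is only an outline under assumptions: the helper name `preprocess_hugene20` and its argument are hypothetical, the CEL file paths are your own, and the package calls are namespace-qualified so that defining the function does not require the packages to be attached.

```r
# Hypothetical helper (not part of any package): summarize Human Gene 2.0 ST
# CEL files at the transcript level, drop control probesets, attach annotation.
preprocess_hugene20 <- function(cel_paths) {
  # Fail early with a readable message if a Bioconductor package is missing.
  needed <- c("oligo", "affycoretools", "hugene20sttranscriptcluster.db")
  ok <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
  if (!all(ok)) {
    stop("Install from Bioconductor first: ",
         paste(needed[!ok], collapse = ", "))
  }

  raw <- oligo::read.celfiles(cel_paths)

  # 1) Transcript-level summarization, as recommended for Gene ST arrays.
  eset <- oligo::rma(raw, target = "core")

  # 2) Keep only the main probesets (controls are removed).
  eset <- affycoretools::getMainProbes(eset)

  # 3) Add the Affymetrix-based annotation to the featureData.
  affycoretools::annotateEset(
    eset,
    hugene20sttranscriptcluster.db::hugene20sttranscriptcluster.db
  )
}

# Usage (assumes the CEL files are in the working directory):
# cel_paths <- list.files(pattern = "\\.CEL$", ignore.case = TRUE)
# eset.rma  <- preprocess_hugene20(cel_paths)
# head(Biobase::fData(eset.rma))
```

Because the annotation ends up in the featureData of the ExpressionSet, a limma fit built from `eset.rma` carries it along, which is why topTable shows it without a separate lookup.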


Dear James,

Thank you for your confirmation! Actually, I accidentally tried the argument target="probeset" with rma, and after getMainProbes I ended up with only ~1700 features, which concerned me and perhaps explains your comment about the probeset level (this is not the case when I use the "core" option). So, if I understood correctly, the annotation data returned by annotateEset are gene symbols for the matched transcripts, correct?


The results are the Entrez Gene ID, symbol, and gene name, based on the Affy annotations for that array. We just take what Affy says each probeset measures, and then convert to a useful format without doing anything to check that what they say is correct in any sense.

Also, do note that there is a hugene20stprobeset.db package that annotates the probeset IDs, and that is what you would use to annotate if you summarize at the probeset level.
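As a rough illustration of matching the annotation package to the summarization level, and of the manual select() lookup asked about in the original question, something like the following could be used. The helper names `annotation_pkg_for` and `lookup_ids` are hypothetical, and the sketch assumes the relevant annotation package is installed when `lookup_ids` is called.

```r
# Hypothetical helper: pick the annotation package that matches the
# 'target' used in rma().
annotation_pkg_for <- function(target = c("core", "probeset")) {
  target <- match.arg(target)
  switch(target,
         core     = "hugene20sttranscriptcluster.db",
         probeset = "hugene20stprobeset.db")
}

# Hypothetical helper: query the annotation package directly with
# AnnotationDbi::select() instead of using annotateEset().
lookup_ids <- function(ids, target = "core") {
  pkg <- annotation_pkg_for(target)
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Install from Bioconductor first: ", pkg)
  }
  # The annotation object is exported under the same name as the package.
  db <- getExportedValue(pkg, pkg)
  AnnotationDbi::select(db,
                        keys    = as.character(ids),
                        columns = c("ENTREZID", "SYMBOL", "GENENAME"),
                        keytype = "PROBEID")
}

# Usage (assumes 'eset.rma' was summarized with target = "core"):
# annot <- lookup_ids(Biobase::featureNames(eset.rma), target = "core")
```

Note that select() can return more than one row per ID when an ID maps to several genes, so the result may need collapsing before it is merged with a topTable.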


Thank you again for your explanation. I will follow your advice: use the "core" argument, remove the control "transcripts", and annotate the expression set. Perhaps the very small number of probesets I get when I first use "probeset" in rma and then getMainProbes has to do with the design of the array.


Hi James,

I get the error: could not find function "annotateEset"


Any time you see an error saying 'could not find function', it means you haven't loaded the package that contains that function yet. Or, it may mean that you are using an old version of R/Bioconductor where the function was not yet part of the package. You don't give the output of sessionInfo(), so I can't say for sure; try A) loading affycoretools first, or B) using the current version of R/Bioconductor.
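A quick way to tell these cases apart might look like the sketch below. The helper `diagnose_missing_function` is hypothetical and uses only base R, so it runs whether or not affycoretools is installed.

```r
# Hypothetical helper: report why a function might not be found.
diagnose_missing_function <- function(fn, pkg) {
  # Is the package installed at all? (This loads, but does not attach, it.)
  installed <- requireNamespace(pkg, quietly = TRUE)
  # Has it been attached with library()?
  attached <- paste0("package:", pkg) %in% search()
  # Does the installed version actually export the function?
  exported <- installed && fn %in% getNamespaceExports(pkg)
  list(installed = installed, attached = attached, exported = exported)
}

str(diagnose_missing_function("annotateEset", "affycoretools"))

# Include this output when asking for help, so versions can be checked.
sessionInfo()
```

If `installed` and `exported` are TRUE but `attached` is FALSE, calling library(affycoretools) should be enough. If `exported` is FALSE for an installed package, the installed version probably predates the function and R/Bioconductor needs updating.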
