Question: Importing and Extracting Annotation
3.7 years ago, atiqahrahman wrote:

Hi all!

I'm new to R and am currently working on data from ZebGene-1_0-st arrays. However, I am having problems with the annotation: first, there is no annotation package in Bioconductor, and second, the sample workflow that I found for the array does not return TRUE for the sanity check (identical). I realised that the workflow below does not extract and reorder the annotations to match my probes. Any advice on overcoming this problem would help! Thank you in advance :)


# Import the annotations
dat <- read.csv(file.path(metaDir, "ZebGene-1_0-st-v1.na33.3.zv9.transcript.csv"), comment.char = "#", stringsAsFactors=FALSE, na.strings = "---")
dat <- col2rownames(dat, "probeset_id")
#extract and reorder to match the array features
dat <- dat[row.names(fData(affyNorm.batch)),]
dat <- dat[,c("probeset_id", "seqname", "strand", "start", "stop", "gene_assignment", "mrna_assignment")]
dat <- as.matrix(dat)
# parse mrna_assignments
headercol <- "mrna_assignment"
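mrnas <- t(sapply(strsplit(dat[, headercol], " /// "), function(x) {
  # one row per assignment, one column per " // "-separated field
  dat.probe.df <- as.data.frame(do.call(rbind, strsplit(x, " // ")), stringsAsFactors=FALSE)
  bestrna <- dat.probe.df[1,1]
  rnas <- paste(dat.probe.df[,1], collapse=",")
  c(bestrna, rnas)
}))
mrnas <- as.data.frame(mrnas)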
names(mrnas) <- c("best.mrna", "mrnas")
# parse gene assignments
headercol <- "gene_assignment"
genes <- t(sapply(strsplit(dat[, headercol], " /// "), function(x) {
    # probesets without a gene assignment come through as NA
    if (all(is.na(x))) {
      out <- rep("NA", 6)
    } else {
      dat.probe.mat <- as.matrix(as.data.frame(do.call(rbind, strsplit(x, " // "))))
      bestgene <- as.character(dat.probe.mat[1,1])
      dat.probe.vec <- apply(dat.probe.mat, 2, function(y) {
        paste(unique(y), collapse=",")
      })
      out <- as.character(c(bestgene,dat.probe.vec))
    }
    out
}))
genes <- as.data.frame(genes[,c(1,2,3,4,6)])
names(genes) <- c("bestgene", "accessions", "symbols", "descriptions", "entrezIDs")
genes <- rownames2col(genes, "probeids")
# combine mrna and gene assignments
gene.annots <- cbind(genes, mrnas)
Answer: Importing and Extracting Annotation
3.7 years ago, James W. MacDonald (United States) wrote:

This isn't really a good question for this site, as it is only tangentially related to Bioconductor packages, and has more to do with R coding and whatnot. And that sort of thing is IMO better learned by seeing how others have tackled similar problems and emulating what you think is reasonable.

So please note that I have very similar functionality in the devel version of affycoretools that you can see here (you want to look at .dataFromNetaffx). I would also point out a couple of things. First, the pdInfoPackage already comes with a parsed version of the annotation csv file that you can access using getNetAffx from the oligo package (which will already be loaded and available to you). Second, if you put the results into the featureData slot of your ExpressionSet, you can run validObject to make sure things line up correctly. That's a good validity check, plus the featureData slot will propagate through the limma package and end up in your topTable if you analyze your data using limma (which IMO you should).
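As a rough illustration of the two suggestions above, here is a minimal sketch. It assumes `eset` is an ExpressionSet produced by oligo's rma() summarized at the transcript level, and that the matching pdInfo package for the array is installed. The helper name `attach_netaffx` is made up for this example and is not part of any package.

```r
# Sketch only: `attach_netaffx` is a hypothetical helper, and `eset` is
# assumed to be an ExpressionSet from oligo::rma() (transcript-level summary).
attach_netaffx <- function(eset, type = "transcript") {
  # getNetAffx() returns the parsed NetAffx annotation shipped with the
  # pdInfo package, as an AnnotatedDataFrame
  annot <- oligo::getNetAffx(eset, type = type)

  # Sanity check: annotation rows must be in the same order as the features
  stopifnot(identical(rownames(Biobase::pData(annot)),
                      Biobase::featureNames(eset)))

  # Store the annotation in the featureData slot
  Biobase::featureData(eset) <- annot

  # validObject() raises an error if the slots do not line up
  methods::validObject(eset)
  eset
}
```

Calling `eset <- attach_netaffx(eset)` would then leave the annotation columns available through `fData(eset)`. If the array was summarized at the probeset level instead, `type = "probeset"` would be the value to try.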

