I'm new to R and genetics and I am trying to process some gene expression data. I can download .cel.gz files and run each one through a pipeline to get a gene expression file for it. I am using a high-performance computing cluster, so it isn't easy to install different versions of R or Bioconductor (especially newer ones); I am working with our latest versions, which are R/3.3.2 and Bioconductor 3.4.
So say I have a file GSM3921.cel.gz. In my script I do:
    library(affy)
    library(frma)
    abatch <- ReadAffy(filenames="GSM3921.cel.gz")
    aobj <- frma(abatch)
    aeset <- exprs(aobj)
This gives me the expression values for the probes. However, what I really want is expression values at the Entrez gene level rather than the probe level. It looks like at one time you could simply do:
    library(affy)
    library(frma)
    library(hgu133afrmavecs)
    library(hgu133ahsentrezgcdf)
    abatch <- ReadAffy(filenames="GSM3921.cel.gz", cdfname="HGU133A_HS_ENTREZG")
    aobj <- frma(abatch, input.vecs=hgu133ahsentrezgfrmavecs)
    aeset <- exprs(aobj)
However, when I try to install the hgu133ahsentrezgcdf package, I get an error saying this package is not available for R/3.3.2. I can't really find any documentation on this package. Is there a way to overcome this installation problem, or is there any other way to convert the probe-level expression values to Entrez gene IDs? I also need this to work in a similar way for HGU133PLUS2 files.
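To make the "any other way" part of the question concrete, here is a minimal sketch of one alternative: map probe IDs to Entrez IDs with an annotation table, then average the probes that belong to the same gene. This is a different technique from the custom CDF route above (it collapses already-summarized probe sets rather than re-defining them), and averaging is only one possible choice. The function name collapse_to_entrez, the probe IDs, the Entrez IDs, and the expression values are all invented for illustration. In a real run the mapping table would presumably come from an annotation package such as hgu133a.db (or hgu133plus2.db for HGU133PLUS2), assuming it installs under your Bioconductor version.

```r
# Toy probe-level matrix standing in for exprs(aobj); all values are invented.
aeset <- matrix(
  c(6, 8, 5, 7),
  ncol = 1,
  dimnames = list(c("probe_A", "probe_B", "probe_C", "probe_D"), "GSM3921")
)

# Toy probe-to-Entrez table with hypothetical IDs. With an annotation package
# installed (an assumption), it could instead be built with something like:
#   library(hgu133a.db)
#   probe2entrez <- AnnotationDbi::select(hgu133a.db, keys = rownames(aeset),
#                                         columns = "ENTREZID", keytype = "PROBEID")
probe2entrez <- data.frame(
  PROBEID  = c("probe_A", "probe_B", "probe_C", "probe_D"),
  ENTREZID = c("1001", "1001", "1002", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: average all probes that map to the same Entrez ID.
collapse_to_entrez <- function(expr, map) {
  # Drop unmapped probes and probes that are not in the expression matrix.
  keep <- !is.na(map$ENTREZID) & map$PROBEID %in% rownames(expr)
  map <- unique(map[keep, c("PROBEID", "ENTREZID")])

  # Keep only probes that map to exactly one gene, to avoid ambiguity.
  genes_per_probe <- table(map$PROBEID)
  map <- map[map$PROBEID %in% names(genes_per_probe)[genes_per_probe == 1], ]

  # Sum the probes of each gene, then divide by the probe count per gene.
  sums <- rowsum(expr[map$PROBEID, , drop = FALSE], group = map$ENTREZID)
  counts <- table(map$ENTREZID)[rownames(sums)]
  sums / as.vector(counts)
}

gene_expr <- collapse_to_entrez(aeset, probe2entrez)
print(gene_expr)
```

The result is a matrix with one row per Entrez ID and one column per sample; probes without an Entrez mapping (probe_D in the toy data) are dropped.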
One more note: from reading posts about this, it seems like anyone using library(hgu133ahsentrezgcdf) is doing so with R/3.0.2 or older. I tried installing this package with R/3.0.1 and got the same "this package is not available for R/3.3.2" error, so maybe I'm not downloading it correctly? To download it I do:
    source("http://bioconductor.org/biocLite.R")
    biocLite()
    biocLite("hgu133ahsentrezgcdf")
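As far as I can tell, hgu133ahsentrezgcdf is a BrainArray custom CDF package rather than something hosted in the Bioconductor repositories, which would explain why biocLite() cannot find it under any R version. If that is right, one thing to try is installing it from a source tarball downloaded by hand. The sketch below assumes such a tarball already sits in the working directory; the file name and the helper install_local_cdf are hypothetical, and on a cluster a personal library path may also be needed.

```r
# Hypothetical file name: a source tarball of the custom CDF package that has
# already been downloaded manually (real tarballs usually carry a version).
tarball <- "hgu133ahsentrezgcdf.tar.gz"

# Hypothetical helper: install a package from a local source tarball,
# or report that the tarball is missing instead of failing.
install_local_cdf <- function(path) {
  if (!file.exists(path)) {
    message("Tarball not found: ", path)
    return(invisible(FALSE))
  }
  # repos = NULL tells install.packages to treat 'path' as a local file.
  install.packages(path, repos = NULL, type = "source")
  invisible(TRUE)
}

installed <- install_local_cdf(tarball)
```

If the install succeeds, library(hgu133ahsentrezgcdf) should load without going through biocLite() at all.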
Any help getting this Entrez conversion figured out would be great.