Jillian Rowe
Hello all,
I am very new to R and am confused by the GSEABase package.
I am running an analysis on a group of CEL files from two treatment groups.
(I'm sorry this is so long. I just wanted to make sure there were lots of
comments and no block o' text.)
# loads all required libraries for this assignment
source("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/clusterIndex.R")
source("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/my.colorFct.R")
source("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/dendroCol.R")
## Libraries
library( hu6800cdf )
library( hu6800.db )
library( affy )
library( genefilter )
library( multtest )
library( siggenes )
library( annaffy )
library(affyPLM)
library(multtest)
library(cluster)
library(pvclust)
library(gplots)
library(ALL)
library(hgu95av2.db)
library(KEGG.db)
library(GSEABase)
library(affyPLM)
library(GO.db)
# set the working directory so that R points to the directory
# containing the CEL files
setwd("/home/jill/Desktop/Microarray/Assignments/CEL")
# read the semicolon-separated sample sheet (with header) into pd
# using read.AnnotatedDataFrame
pd <- read.AnnotatedDataFrame( "treat.txt", header=TRUE, row.names=1,
                               sep=";" )
# check the phenoData ( sample information )
pData(pd)
##Diabetic
##GSM391693.CEL a
##GSM391694.CEL a
##GSM391695.CEL a
##GSM391696.CEL a
##GSM391697.CEL a
##GSM391702.CEL b
##GSM391703.CEL b
##GSM391704.CEL b
##GSM391705.CEL b
##GSM391706.CEL b
# read the files using ReadAffy
# if you do not supply any argument, it will read all CEL files from
# your working directory
expression_data <- ReadAffy( filenames = rownames (pData(pd)) )
# QC
library(affyPLM);
dataPLM = fitPLM(expression_data);
pdf("nuse_plot.pdf");
boxplot(dataPLM, main="NUSE", ylim=c(0.95,1.1), outline= FALSE,
col="lightblue", las=3, whisklty = 0 , staplelty = 0);
dev.off()
##pdf 2
# performs RMA background correction, normalization and summarization
# (justRMA runs RMA, not GCRMA, despite the variable names below)
esetGCrma <- justRMA(filenames=rownames(pData(pd)))
# keeps only genes from the RMA normalized set with an IQR greater
# than 0.5
selector<-function(x) (IQR(x)> 0.5);
a1 <- filterfun(selector);
gcrma_filtered<- genefilter(esetGCrma, a1);
sum(gcrma_filtered);
##[1] 2985
gcrma_data_selected <- esetGCrma[gcrma_filtered,];
cl <-as.numeric(pd$Diabetic=="a");
# nonspecific filter: remove genes that do not
## show much variation across samples
#Load KEGG and GSEABase package.
#Do a GeneSetCollection for KEGG.
#Construct an incidence matrix.
####End of working code!
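To make the last step above concrete, here is a minimal base-R sketch of what I understand an incidence matrix to be: one row per gene set, one column per probe, with a 1 where the probe belongs to the set. The names `gene_sets`, `all_probes` and `incidence`, and all pathway and probe identifiers, are made up for illustration; none of this comes from my CEL data or from KEGG.

```r
# Hypothetical gene sets: pathway name -> probe IDs (all identifiers are made up)
gene_sets <- list(
  pathwayA = c("probe1", "probe2"),
  pathwayB = c("probe2", "probe3", "probe4")
)

# All probes that appear in at least one set
all_probes <- sort(unique(unlist(gene_sets)))

# Rows are gene sets, columns are probes; 1 means the probe belongs to the set
incidence <- t(sapply(gene_sets, function(ids) as.integer(all_probes %in% ids)))
colnames(incidence) <- all_probes

print(incidence)
```

The rows-as-sets layout is just my choice for this sketch; the real matrix would be built from the KEGG gene set collection rather than from a hand-written list.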
Now, what I think I should do is:
gsc <- GeneSetCollection(gcrma_data_selected,
setType=KEGGCollection())
But I get this error:
Error in as.list(getAnnMap("PATH2PROBE", annotation(idType))) :
  error in evaluating the argument 'x' in selecting a method for function 'as.list'
I have some sample data that uses data from the ALL library, and it
inputs a data object from the genefilter function as well.
Any help would be very, very appreciated.
Thanks!
Jillian