Torbjörn Klatt
Hello Bioconductor followers,
I'm quite new to R and Bioconductor. As a biomathematics student taking
a course in computational biology, I have been asked to identify
significantly expressed genes with a SAM analysis. We are working on the
raw data set provided by Wang [1] at GEO [2] and are trying to reproduce
their procedure and analysis for practical training.
For our first attempts we worked on one of the university's computers
running MS Windows XP Professional 32-bit with SP3 and R 2.9.2
(2009-08-24). After reading the help file for the sam() command we
experimented a little and used the arguments "delta" and "p0", among
others. It worked well and we were rather satisfied with the results.
Afterwards I installed R and Bioconductor on my own computer at home,
running openSUSE 11.2 (64-bit) with kernel 2.6.31.5-0.1 and R 2.10.1
(2009-12-14), with additional Bioconductor packages installed directly
from the repository on 2010-01-05. I configured and compiled R myself
using gcc 4.4.1 [3] and its accompanying Fortran compiler [4] with no
additional parameters.
I tried to run the same script we wrote on the university's computers,
and it failed with several error messages.
One of the errors was caused by the rma() command. My professor was able
to fix this one; it seems that there have been some changes in the affy
package. The fixed command (see file sam_analysis.r further down) works
on the older university computers as well.
But the sam() command still did not work with the parameters "delta"
and "p0": I got an error message saying that there are "unused
parameters 'delta', 'p0'".
Although we found a way to manage without them, I would like to know
since when and why these parameters are no longer supported by sam(). I
could not find a very detailed change log for the siggenes package other
than this one (http://fgc.lsi.umich.edu/cgi-bin/blosxom.cgi/siggenes),
so I'm writing this email.
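One way to check from within R which argument names a function declares
is to inspect its formals. The following is only a minimal sketch in
base R: unsupported_args() and toy_sam() are hypothetical names
introduced for illustration, and toy_sam() has a made-up signature that
is not the real sam() signature.

```r
# Hypothetical helper (not part of siggenes): report which of the
# requested argument names a function does not declare in its signature.
unsupported_args <- function(fun, arg_names) {
  declared <- names(formals(fun))
  # A function with '...' accepts arbitrary named arguments and may pass
  # them on, so nothing can be flagged from the signature alone.
  if ("..." %in% declared) {
    return(character(0))
  }
  setdiff(arg_names, declared)
}

# Toy stand-in for an analysis function; NOT the real sam() signature.
toy_sam <- function(data, cl, method, B = 100, rand = NA) NULL

# Names requested by the caller that toy_sam() does not declare.
print(unsupported_args(toy_sam, c("delta", "p0", "B")))
```

In a session with siggenes loaded, calling args(sam) on each machine
should show the two real signatures for comparison.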
Cheers,
Torbjoern Klatt
PS: If this email gets through to the list, it seems that it is not
possible to send PGP-signed mails. Is that right?
--- file: sam_analysis.r ---
#################################################
# R-Script
# author: Torbjoern Klatt
# subject: Bioinformatik
# project: SAM analysis
# date created: 2010-01-05
# date edited last: 2010-01-06
#################################################
# set the working directory
setwd("/home/myself/Documents/Wissen/Uni/rac/0910_ws/Bioinformatik/Praktikum/wd")
# load required libraries
library(affy) # for Affymetrix chips
library(siggenes) # identifying significant genes
library(hgu133a2.db) # to map the affy probe names to gene names
library(hopach) # clustering
# read data about the phenotype (here dasatinib sensitivity)
cell.lines <-
read.csv("sensitivity.csv",row.names=1,header=TRUE,sep=",")
pheno <- as.data.frame(cell.lines[,2],row.names=row.names(cell.lines))
names(pheno) <- c("sensitivity")
# read the cell files and assign phenotype information
wangData <- ReadAffy(phenoData=pheno)
## old version of previous line
### wangData <- ReadAffy()
### wangData@phenoData <- as(pheno, "AnnotatedDataFrame")
# background correction, generation of expression values and
# normalization
wangExpr <- rma(wangData)
# look at the expression data
sampleNames(wangExpr)
featureNames(wangExpr)
description(wangExpr)
pData(phenoData(wangExpr))
dim(exprs(wangExpr))
head(exprs(wangExpr))
# here one should do some more quality control, but this is omitted
# for now
# add the analysis here
# with a 'rand' value of '123' the p0 in the SAM analysis will be 0.5
sam.out <- sam(exprs(wangExpr), pData(phenoData(wangExpr))[,1],
               method=d.stat, B=500, rand=123)
## old version of previous line
### sam.out <- sam(exprs(wangExpr), pData(phenoData(wangExpr))[,1],
###                method=d.stat, delta=seq(from=1.0, to=2.0, by=0.1),
###                p0=0.5, B=500)
delta <- findDelta(sam.out,fdr=0.05)
genes <- list.siggenes(sam.out,delta[1,1])
--- END: file ---
--- sessionInfo() on my linux machine ---
R version 2.10.1 (2009-12-14)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] hgu133a2cdf_2.5.0 hopach_2.6.0 cluster_1.12.1
[4] hgu133a2.db_2.3.5 org.Hs.eg.db_2.3.6 RSQLite_0.8-0
[7] DBI_0.2-5 AnnotationDbi_1.8.1 siggenes_1.20.0
[10] multtest_2.2.0 affy_1.24.2 Biobase_2.6.1
loaded via a namespace (and not attached):
[1] affyio_1.14.0 MASS_7.3-5 preprocessCore_1.8.0
[4] survival_2.35-8 tools_2.10.1
--- END: sessionInfo() ---
--- sessionInfo() on the WinXP machine ---
R version 2.9.2 (2009-08-24)
i386-pc-mingw32
locale:
LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] hopach_2.6.0 cluster_1.12.0 hgu133a2.db_2.2.11
[4] RSQLite_0.8-0 DBI_0.2-5 AnnotationDbi_1.6.1
[7] siggenes_1.18.0 multtest_2.2.0 affy_1.22.1
[10] Biobase_2.4.1
loaded via a namespace (and not attached):
[1] affyio_1.12.0 MASS_7.2-48 preprocessCore_1.6.0
[4] survival_2.35-4
--- END: sessionInfo() ---
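The two listings above differ in several package versions. As a rough
sketch, they can be compared in base R with compareVersion(). The
version strings are copied from the listings; the variable names
(linux, windows, cmp, older_on_windows) are illustrative only.

```r
# Package versions copied from the two sessionInfo() listings above.
linux   <- c(siggenes = "1.20.0", affy = "1.24.2", Biobase = "2.6.1")
windows <- c(siggenes = "1.18.0", affy = "1.22.1", Biobase = "2.4.1")

# compareVersion() returns -1, 0 or 1; -1 means the first version is
# older than the second.
cmp <- mapply(utils::compareVersion, windows, linux)

# Packages that are older on the Windows machine than on the Linux one.
older_on_windows <- names(cmp)[cmp < 0]
print(older_on_windows)
```

Any package listed here was updated between the two installations and
is a candidate for the change in behaviour.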
--- references ---
[1] Xi-De Wang, Karen Reeves, Feng R Luo, Li-An Xu, Francis Lee,
    Edwin Clark, Fei Huang (2007). Identification of candidate
    predictive and surrogate molecular markers for dasatinib in
    prostate cancer: rationale for patient selection and efficacy
    monitoring. Genome Biology 8 (11), R255.
    http://www.ncbi.nlm.nih.gov/pubmed/18047674
[2] http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE9633
[3] extracted from config.log: gcc version 4.4.1 [gcc-4_4-branch
    revision 150839] (SUSE Linux)
[4] extracted from config.log: GNU Fortran (SUSE Linux) 4.4.1
    [gcc-4_4-branch revision 150839]