Entering edit mode
Ahmet ZEHIR
▴
50
@ahmet-zehir-5159
Last seen 10.2 years ago
Hello,
I've been using the survival package to determine whether a gene's
expression can be used in prediction of survival probabilities a while
now
using the CDGSR package to pull TCGA data. However, now that I've been
thinking about using multiple genes' expression to stratify patients,
I'm a
bit stuck.
I've written some code in R to do just that based on a paper I've read
but
can't remember the name anymore where the basic idea is to score each
patient 1 or -1 based on whether they express a gene more than the
mean
expression of that gene in the population. Afterwards, the scores are
added
and each patient end up with a PIR score (I remember the original
paper
I've read used the PIR score but can't remember what it stands for
now,
I've just kept it PIR in case I run into that paper again), I then
stratify
the patients based on the median of PIR scores and perform the
survival
analysis.
Would anyone recommend a better way of doing things? I'm pasting the
code
below - which is probably very crude as I'm not too experienced with R
yet.
Any suggestion/critique/idea would be greatly appreciated.
library(cgdsr, survival, reshape, plyr)
expression <- data.frame(rep(1, 563)) #Create a dataframe to put the
date in
for (i in 1:length(chr.list)) { #chr.list is a list of genes
x <- getProfileData(mycgds, chr.list[i], "ov_tcga_mrna_median",
"ov_tcga_all")
if(length(table(x)) != 0 ) { #Check to make sure the gene is
represented
and expression values are obtained
expression <- cbind(expression, x)
}
} #get the expression data from TCGA database
d.frame <- na.omit(expression[, 2:length(expression)]) #Remove missing
values
d.frame$patient <- rownames(d.frame) # add a column with patient IDs
d.frame.melt <- melt(d.frame, id = "patient") # melt the data
d.frame.melt <- ddply(d.frame.melt, .(variable), transform, mean =
mean(value)) # create a new column, with mean value for each gene in
the
group of patients
d.frame.melt <- ddply(d.frame.melt, .(patient), transform, PIR =
ifelse(value > mean, 1, -1)) #For each gene, if the expression is
higher
td.framen the mean, then the PIR score is 1, else it is -1
d.frame.PIR <- ddply(d.frame.melt, .(patient), colwise(sum, "PIR")) #
Sum
the PIR values
d.frame.PIR <- ddply(d.frame.PIR, .(patient), transform, survival =
ifelse(PIR > median(d.frame.PIR$PIR), 1, 0)) #stratify patients based
on
the PIR value
clinical <- getClinicalData(mycgds, "ov_tcga_all") #Get clinical data
clinical$patient <- rownames(clinical)
d.frame.PIR <- join(d.frame.PIR, clinical, type = "inner") #merge
clinical
information with PIR stratified patient data frame
d.frame.PIR$status <- d.frame.PIR$overall_survival_status=="DECEASED"
d.frame.surv <- Surv(d.frame.PIR$overall_survival_months,
d.frame.PIR$status) # run the survival analysis and draw a graph
d.frame.diff <- survdiff(d.frame.surv ~ d.frame.PIR$survival)
d.frame.fit <- survfit(d.frame.surv ~ d.frame.PIR$survival)
p <- 1-pchisq(d.frame.diff$chisq, length(d.frame.diff$n) -1)
plot(d.frame.fit, lty=1:2, mark.time=F)
legend(10, .2, c("HOXB1-13 high expression", "HOXB1-13 low
expression"),
lty = 1:2)
title(main=paste("Survival Probablity of TCGA Ovarian Patients based
on
HOXB Cluster (HOXB1-13) expression (p = ", round(p, digits = 4), ")",
sep=""), xlab="Months", ylab="Probability")
> sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C/en_US.UTF-8/C/C/C/C
attached base packages:
[1] splines graphics grDevices utils datasets grid stats
methods base
other attached packages:
[1] survival_2.36-12 reshape2_1.2.1 reshape_0.8.4
plyr_1.7.1
cgdsr_1.1.19 R.oo_1.8.3 R.methodsS3_1.2.2
RSQLite_0.11.1
[9] DBI_0.2-5 knitr_0.3 gplots_2.10.1
KernSmooth_2.23-7 caTools_1.12 bitops_1.0-4.1 gdata_2.8.2
gtools_2.6.2
[17] RColorBrewer_1.0-5 ggplot2_0.9.0
loaded via a namespace (and not attached):
[1] MASS_7.3-17 Rcpp_0.9.10 codetools_0.2-8
colorspace_1.1-1
dichromat_1.2-4 digest_0.5.1 evaluate_0.4.1 formatR_0.3-4
highlight_0.3.1
[10] memoise_0.1 munsell_0.3 parser_0.0-14 proto_0.3-9.2
scales_0.2.0 stringr_0.6 tools_2.14.2
--
*Ahmet Z.*
[[alternative HTML version deleted]]