Question: survival prediction from multiple genes
gravatar for Ahmet ZEHIR
7.7 years ago by
Ahmet ZEHIR50
Ahmet ZEHIR50 wrote:
Hello, I've been using the survival package to determine whether a gene's expression can be used in prediction of survival probabilities a while now using the CDGSR package to pull TCGA data. However, now that I've been thinking about using multiple genes' expression to stratify patients, I'm a bit stuck. I've written some code in R to do just that based on a paper I've read but can't remember the name anymore where the basic idea is to score each patient 1 or -1 based on whether they express a gene more than the mean expression of that gene in the population. Afterwards, the scores are added and each patient end up with a PIR score (I remember the original paper I've read used the PIR score but can't remember what it stands for now, I've just kept it PIR in case I run into that paper again), I then stratify the patients based on the median of PIR scores and perform the survival analysis. Would anyone recommend a better way of doing things? I'm pasting the code below - which is probably very crude as I'm not too experienced with R yet. Any suggestion/critique/idea would be greatly appreciated. library(cgdsr, survival, reshape, plyr) expression <- data.frame(rep(1, 563)) #Create a dataframe to put the date in for (i in 1:length(chr.list)) { #chr.list is a list of genes x <- getProfileData(mycgds, chr.list[i], "ov_tcga_mrna_median", "ov_tcga_all") if(length(table(x)) != 0 ) { #Check to make sure the gene is represented and expression values are obtained expression <- cbind(expression, x) } } #get the expression data from TCGA database d.frame <- na.omit(expression[, 2:length(expression)]) #Remove missing values d.frame$patient <- rownames(d.frame) # add a column with patient IDs d.frame.melt <- melt(d.frame, id = "patient") # melt the data d.frame.melt <- ddply(d.frame.melt, .(variable), transform, mean = mean(value)) # create a new column, with mean value for each gene in the group of patients d.frame.melt <- ddply(d.frame.melt, .(patient), transform, PIR = ifelse(value > mean, 1, -1)) #For each gene, if the expression is higher td.framen the mean, then the PIR score is 1, else it is -1 d.frame.PIR <- ddply(d.frame.melt, .(patient), colwise(sum, "PIR")) # Sum the PIR values d.frame.PIR <- ddply(d.frame.PIR, .(patient), transform, survival = ifelse(PIR > median(d.frame.PIR$PIR), 1, 0)) #stratify patients based on the PIR value clinical <- getClinicalData(mycgds, "ov_tcga_all") #Get clinical data clinical$patient <- rownames(clinical) d.frame.PIR <- join(d.frame.PIR, clinical, type = "inner") #merge clinical information with PIR stratified patient data frame d.frame.PIR$status <- d.frame.PIR$overall_survival_status=="DECEASED" d.frame.surv <- Surv(d.frame.PIR$overall_survival_months, d.frame.PIR$status) # run the survival analysis and draw a graph d.frame.diff <- survdiff(d.frame.surv ~ d.frame.PIR$survival) <- survfit(d.frame.surv ~ d.frame.PIR$survival) p <- 1-pchisq(d.frame.diff$chisq, length(d.frame.diff$n) -1) plot(, lty=1:2, mark.time=F) legend(10, .2, c("HOXB1-13 high expression", "HOXB1-13 low expression"), lty = 1:2) title(main=paste("Survival Probablity of TCGA Ovarian Patients based on HOXB Cluster (HOXB1-13) expression (p = ", round(p, digits = 4), ")", sep=""), xlab="Months", ylab="Probability") > sessionInfo() R version 2.14.2 (2012-02-29) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) locale: [1] C/en_US.UTF-8/C/C/C/C attached base packages: [1] splines graphics grDevices utils datasets grid stats methods base other attached packages: [1] survival_2.36-12 reshape2_1.2.1 reshape_0.8.4 plyr_1.7.1 cgdsr_1.1.19 R.oo_1.8.3 R.methodsS3_1.2.2 RSQLite_0.11.1 [9] DBI_0.2-5 knitr_0.3 gplots_2.10.1 KernSmooth_2.23-7 caTools_1.12 bitops_1.0-4.1 gdata_2.8.2 gtools_2.6.2 [17] RColorBrewer_1.0-5 ggplot2_0.9.0 loaded via a namespace (and not attached): [1] MASS_7.3-17 Rcpp_0.9.10 codetools_0.2-8 colorspace_1.1-1 dichromat_1.2-4 digest_0.5.1 evaluate_0.4.1 formatR_0.3-4 highlight_0.3.1 [10] memoise_0.1 munsell_0.3 parser_0.0-14 proto_0.3-9.2 scales_0.2.0 stringr_0.6 tools_2.14.2 -- *Ahmet Z.* [[alternative HTML version deleted]]
survival ovarian • 1.5k views
ADD COMMENTlink written 7.7 years ago by Ahmet ZEHIR50
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 396 users visited in the last hour