Dear group,
How can expression data for a group of genes be correlated with
survival covariate data using a Cox model, and then plotted as a
Kaplan-Meier curve?
Say I have a subset of data from an M x N matrix (M genes and N
samples). I take the expression values for Y x N (Y is a subset of the
M genes, and the N samples are a collection of cancer and normal), use
recurrence time or survival time, and check whether the Y genes are
significant under a Cox model for recurrence. If they are significant,
I plot them using a Kaplan-Meier curve. I want to be able to use the
coxph and survfit functions. I do not know how to use both the
expression data and the survival covariate data to see whether a set of
genes is significant.
Thanks,
Adrian

Hi Adrian,
An example using the sample.ExpressionSet dataset.
## load packages
> library("survival")
> library("Biobase")
> data(sample.ExpressionSet)
## fake up some survival time - there are 26 observations
## let's say we have survival time <= 36 months for all patients
## with some amount of censoring
> surv.time <- Surv(sample(1:36, 26, replace=TRUE),
+                   sample(0:1, 26, replace=TRUE))
> surv.time
 [1]  6+ 35  17+ 18  11  35  15  15+  7+ 14+ 31  12+ 15+  1+ 14+ 24  30  19+  8+
[20] 25+ 22   4+ 21+  3  23  18+
## fit model with first gene
> mod <- coxph(surv.time~exprs(sample.ExpressionSet)[1,])
> summary(mod)
Call:
coxph(formula = surv.time ~ exprs(sample.ExpressionSet)[1, ])
n= 26
                                    coef exp(coef) se(coef)     z   p
exprs(sample.ExpressionSet)[1, ] 0.00656      1.01  0.00782 0.839 0.4

                                 exp(coef) exp(-coef) lower .95 upper .95
exprs(sample.ExpressionSet)[1, ]      1.01      0.993     0.991      1.02
Rsquare= 0.026 (max possible= 0.793 )
Likelihood ratio test= 0.68 on 1 df, p=0.411
Wald test = 0.7 on 1 df, p=0.402
Score (logrank) test = 0.72 on 1 df, p=0.396
##OK, so not significant - let's plot anyway
> plot(survfit(mod))
You can just wrap this up in a call to apply to do all genes. In
addition, you could pull out the LR test statistic/p-value as a first
pass to see which genes are significant, and then go back and just
plot those genes.
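A minimal sketch of the apply idea described above, not a definitive implementation. To keep it self-contained it uses a simulated expression matrix (the name expr, the gene labels, and the random data are assumptions for illustration); with a real ExpressionSet you would pass exprs(...) instead. The Benjamini-Hochberg adjustment at the end is an added suggestion for testing many genes at once, not part of the original example.

```r
library(survival)

set.seed(1)  # reproducible fake data
n.genes <- 20
n.samples <- 26

# Hypothetical expression matrix: genes in rows, samples in columns
expr <- matrix(rnorm(n.genes * n.samples), nrow = n.genes,
               dimnames = list(paste0("gene", seq_len(n.genes)), NULL))

# Fake survival times (months) and event indicator (1 = event, 0 = censored)
surv.time <- Surv(sample(1:36, n.samples, replace = TRUE),
                  sample(0:1, n.samples, replace = TRUE))

# Fit one Cox model per gene and keep the likelihood ratio test p-value
lr.pvalue <- apply(expr, 1, function(x) {
  fit <- coxph(surv.time ~ x)
  summary(fit)$logtest[["pvalue"]]
})

# Assumption: adjust for multiple testing before picking genes to plot
adj.pvalue <- p.adjust(lr.pvalue, method = "BH")
sig.genes <- names(adj.pvalue)[adj.pvalue < 0.05]

# Smallest raw p-values first
head(sort(lr.pvalue))
```

With random data like this, sig.genes will usually be empty; on real data it gives the list of genes to go back and plot.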
Best,
Jim
--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-5646
734-936-8662

On Fri, Nov 14, 2008 at 2:04 AM, Adrian Johnson
<oriolebaltimore@gmail.com> wrote:
> Dear group,
> How can expression data for a group of genes be correlated with
> survival covariate data using a Cox model, and then plotted as a
> Kaplan-Meier curve?
> Say I have a subset of data from an M x N matrix (M genes and N
> samples). I take the expression values for Y x N (Y is a subset of the
> M genes, and the N samples are a collection of cancer and normal), use
> recurrence time or survival time, and check whether the Y genes are
> significant under a Cox model for recurrence. If they are significant,
> I plot them using a Kaplan-Meier curve. I want to be able to use the
> coxph and survfit functions. I do not know how to use both the
> expression data and the survival covariate data to see whether a set of
> genes is significant.
>
You might want to look at the survival package.
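As one possible starting point with the survival package, here is a rough sketch of a Kaplan-Meier plot for a single gene. It assumes the samples are split into "high" and "low" groups at the median expression value, which is a common convention but only one choice among several; the variable names and the simulated data are made up for illustration.

```r
library(survival)

set.seed(2)  # reproducible fake data
n.samples <- 26

# Hypothetical expression values for one gene across all samples
x <- rnorm(n.samples)

# Fake follow-up time (months) and event indicator (1 = event, 0 = censored)
time <- sample(1:36, n.samples, replace = TRUE)
status <- sample(0:1, n.samples, replace = TRUE)

# Assumption: dichotomize at the median expression value
group <- factor(ifelse(x > median(x), "high", "low"),
                levels = c("low", "high"))

# Kaplan-Meier estimate per group; strata follow the factor level order
km <- survfit(Surv(time, status) ~ group)
plot(km, lty = 1:2, xlab = "Months", ylab = "Survival probability")
legend("bottomleft", legend = levels(group), lty = 1:2)

# Log-rank test comparing the two curves
lr <- survdiff(Surv(time, status) ~ group)
lr.p <- 1 - pchisq(lr$chisq, df = length(lr$n) - 1)
```

Note that calling survfit on a formula like this gives Kaplan-Meier curves for the two groups, whereas calling survfit on a fitted coxph object gives a survival curve predicted from the Cox model.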
Sean