Hello everyone!
I used the RTCGA package for survival analysis in different types of cancer from TCGA. I stratified the patients based on high or low expression of two genes. Using the code written by MarcinKosinski, I obtained nice survival curves (https://github.com/RTCGA/RTCGA/issues/97#categorize-variables). The R code he suggests is straightforward and very useful!
Now I would like to stratify the patients by breast cancer subtype (Luminal A, Luminal B, etc.) to see whether the genes I considered affect survival in particular subtypes.
Do you have any suggestions for this?
How can I get the clinical and RNA data for each subtype?
Best,
Giulia
Hello @giulia_m, can you provide the code so we can update it for your request?
Here is the code I used:
library(RTCGA.clinical)
library(RTCGA.rnaseq)
library(tidyverse)
library(survminer)
survivalTCGA(BRCA.clinical) -> BRCA.surv
expressionsTCGA(
  BRCA.rnaseq,
  extract.cols = c("CDCA2|157313", "AURKB|9212")
) -> BRCA.rnaseq
dim(BRCA.surv); dim(BRCA.rnaseq)
head(BRCA.surv); head(BRCA.rnaseq)
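BRCA.rnaseq <- BRCA.rnaseq %>%
  rename(cohort = dataset,
         CDCA2 = `CDCA2|157313`,
         AURKB = `AURKB|9212`) %>%
  # keep primary tumor samples only (sample type code "01")
  filter(substr(bcr_patient_barcode, 14, 15) == "01") %>%
  # shorten the sample barcode to the 12-character patient barcode
  mutate(bcr_patient_barcode = substr(bcr_patient_barcode, 1, 12))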
head(BRCA.rnaseq)
BRCA.surv %>%
left_join(BRCA.rnaseq,
by = "bcr_patient_barcode") ->
BRCA.surv_rnaseq
head(BRCA.surv_rnaseq)
table(BRCA.surv_rnaseq$cohort, useNA = "always")
BRCA.surv_rnaseq <- BRCA.surv_rnaseq %>%
filter(!is.na(cohort))
dim(BRCA.surv_rnaseq)
BRCA.surv_rnaseq.cut <- surv_cutpoint(
BRCA.surv_rnaseq,
time = "times",
event = "patient.vital_status",
variables = c("CDCA2", "AURKB")
)
summary(BRCA.surv_rnaseq.cut)
BRCA.surv_rnaseq.cat <- surv_categorize(BRCA.surv_rnaseq.cut)
head(BRCA.surv_rnaseq.cat)
BRCA.surv_rnaseq.cat <- BRCA.surv_rnaseq.cat %>%
mutate(cohort = BRCA.surv_rnaseq$cohort)
head(BRCA.surv_rnaseq.cat)
library(survival)
fit <- survfit(Surv(times, patient.vital_status) ~ CDCA2 + AURKB,
data = BRCA.surv_rnaseq.cat)
Thanks!
@giulia_m is any column name meaningful if you run
colnames(BRCA.clinical)
? If so, you can specify it in extract.cols in survivalTCGA.
You can also try to detect whether any variable has
lum
in the name of any of its factor levels; however, I can't find luminal.
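One way to run that kind of search is sketched below. It is only an illustration: the data frame is a tiny toy stand-in for BRCA.clinical, the column patient.hypothetical_subtype is made up, and using grepl for the matching is an assumption rather than necessarily the function meant above.

```r
# Toy stand-in for BRCA.clinical; all values and the
# "patient.hypothetical_subtype" column are invented for illustration.
clinical <- data.frame(
  patient.bcr_patient_barcode  = c("tcga-xx-0001", "tcga-xx-0002"),
  patient.histological_type    = c("infiltrating ductal carcinoma",
                                   "infiltrating lobular carcinoma"),
  patient.hypothetical_subtype = c("luminal a", "basal-like"),
  stringsAsFactors = FALSE
)

# For every column, check whether any value contains "lum" (case-insensitive).
has_pattern <- vapply(
  clinical,
  function(col) any(grepl("lum", as.character(col), ignore.case = TRUE)),
  logical(1)
)

# Names of the columns with at least one matching value.
matches <- names(has_pattern)[has_pattern]
matches
```

Replacing clinical with BRCA.clinical should run the same check on the real data; an empty result would agree with luminal not being recorded there.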
Thank you Marcin!
Yes I tried...
I cannot find any column that indicates the subtypes...
I could only check via: http://tumorsurvival.org/TCGA/Breast_TCGA_BRCA/index.html
Here you can select Luminal A, Luminal B etc.
For example here are the results: http://tumorsurvival.org/TCGA/Breast_TCGA_BRCA/process.php
It is possible to download the Excel file, but how should I deal with it then?
Is there a way to download the subtype annotation together with the bcr_patient_barcode? If yes, then one can join it to BRCA.clinical with dplyr::left_join
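A minimal sketch of such a join, assuming a subtype table with one barcode column and one subtype column is available. Both tables below are toy data, and the column names barcode and subtype are hypothetical; a real downloaded file would have to be read in first and may use different names and barcode formats.

```r
library(dplyr)

# Toy stand-in for a survival table keyed by 12-character patient barcode.
surv_rnaseq <- data.frame(
  bcr_patient_barcode  = c("TCGA-XX-0001", "TCGA-XX-0002", "TCGA-XX-0003"),
  times                = c(100, 200, 300),
  patient.vital_status = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical subtype annotation, e.g. read from a downloaded spreadsheet.
subtypes <- data.frame(
  barcode = c("tcga-xx-0001-01", "TCGA-XX-0002-01"),
  subtype = c("Luminal A", "Luminal B"),
  stringsAsFactors = FALSE
)

# Normalize the key: upper case, first 12 characters (patient level).
subtypes <- subtypes %>%
  mutate(bcr_patient_barcode = toupper(substr(barcode, 1, 12))) %>%
  select(bcr_patient_barcode, subtype)

# Keep every patient; those without an annotation get NA as subtype.
surv_subtype <- left_join(surv_rnaseq, subtypes, by = "bcr_patient_barcode")

# Check how many patients ended up in each subtype, including unmatched ones.
table(surv_subtype$subtype, useNA = "always")
```

Once the subtype column is attached, the existing survfit call could be run on filter(surv_subtype, subtype == "Luminal A") and so on, one subtype at a time.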