Question: Missing clinical data at TCGAbiolinks data
Talip Zengin (Mugla, Turkiye) wrote, 7 months ago:

Hello, I use TCGAbiolinks routinely to analyse molecular abnormalities with heatmaps and survival plots, using the commands below. For the past two days, TCGAanalyze_survival has failed with the error below because the days_to_death column is missing from the expression data downloaded and prepared with GDCquery, GDCdownload and GDCprepare. What has changed in the last two days, and how can I solve this problem?

query_exp2 <- GDCquery(project = paste0("TCGA-", cancer),
                       data.category = "Transcriptome Profiling",
                       data.type = "Gene Expression Quantification", 
                       workflow.type = "HTSeq - Counts",
                       sample.type = "Primary solid Tumor",
                       barcode = uniq_tsb_exp)

GDCdownload(query_exp2, files.per.chunk = 100)

GeneExp_paired2 <- GDCprepare(query_exp2, save = TRUE, save.filename = paste0(cancer, "_GeneExp_paired2.rda"))

TCGAanalyze_survival(data = colData(GeneExp_paired2),
                     clusterCol = "subtype_iCluster.Group",
                     main = "TCGA Kaplan-Meier Survival Plot for Consensus Clusters",
                     legend = "RNA Group",
                     height = 10,
                     risk.table = FALSE,
                     conf.int = FALSE,
                     color = c("black","red","blue","green3"),
                     filename = paste0(cancer, "_survival_expression_subtypes0.png"))

Error in TCGAanalyze_survival(data = colData(GeneExp_paired2), clusterCol = "subtype_iCluster.Group", : Columns vital_status, days_to_death and days_to_last_follow_up should be in data frame

I tried the same code with two versions of TCGAbiolinks, but both gave the same error.

[1] "2.9.2"

[1] "2.10.5"

I tried to get the days_to_death column from the clinical data, but it is missing there too. The code I used is below:

clinic <- GDCquery_clinic(project = paste0("TCGA-", cancer), type = "clinical")

Thanks in advance.

modified 7 months ago by Tiago Chedraoui Silva • written 7 months ago by Talip Zengin

The API is no longer returning days to death. It also seems to have been removed from some of the documentation, although year of death was added. I sent an email to GDC to check, but for the moment the function will not work without that information.

Thanks for noticing the problem.

Reply written 7 months ago by Tiago Chedraoui Silva
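
While the field is absent from the API response, it may help to check for the required columns before calling TCGAanalyze_survival, so the gap is reported up front. The following is a minimal base-R sketch, not part of TCGAbiolinks: the helper name `missing_survival_cols` and the toy data frame are hypothetical, and the column names are the ones the error message asks for.

```r
# Columns that the TCGAanalyze_survival error message asks for.
required_cols <- c("vital_status", "days_to_death", "days_to_last_follow_up")

# Hypothetical helper: returns the required columns that are absent from `df`.
missing_survival_cols <- function(df, required = required_cols) {
  setdiff(required, colnames(df))
}

# Toy stand-in for colData(GeneExp_paired2) or the GDCquery_clinic() result.
toy <- data.frame(
  vital_status = c("alive", "dead"),
  days_to_last_follow_up = c(420, NA)
)

missing <- missing_survival_cols(toy)
if (length(missing) > 0) {
  message("Missing survival columns: ", paste(missing, collapse = ", "))
}
```

In a real session the same helper could be applied to `colData(GeneExp_paired2)` or to the `clinic` data frame before plotting.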

I wonder whether "Days To Last Followup" could serve the same purpose as days to death when paired with "Vital status", i.e. for a patient who has passed away, days to last follow-up = days to death.

Reply modified 7 months ago • written 7 months ago by Anh N Tran
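
The substitution suggested in the reply above could be sketched roughly as follows. This is only an illustration under a strong assumption: that days_to_last_follow_up is populated for deceased patients and coincides with the date of death, which the data may not guarantee. The helper name `fill_days_to_death` and the toy data are hypothetical.

```r
# Hypothetical helper: fill a missing days_to_death from days_to_last_follow_up
# for deceased patients only. Assumes the last follow-up of a deceased patient
# coincides with the date of death (an assumption, not a GDC guarantee).
fill_days_to_death <- function(df) {
  # Create the column if the API did not return it at all.
  if (!"days_to_death" %in% colnames(df)) {
    df$days_to_death <- NA_real_
  }
  is_dead <- grepl("dead|deceased", df$vital_status, ignore.case = TRUE)
  needs_fill <- is_dead & is.na(df$days_to_death)
  df$days_to_death[needs_fill] <- df$days_to_last_follow_up[needs_fill]
  df
}

# Toy clinical table; the third patient has no follow-up value to borrow.
toy <- data.frame(
  vital_status = c("alive", "dead", "dead"),
  days_to_last_follow_up = c(420, 310, NA)
)
filled <- fill_days_to_death(toy)
print(filled)
```

Deceased patients without a follow-up value stay NA, so it is worth counting how many rows remain unfilled before trusting the resulting survival curve.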
Answer: Missing clinical data at TCGAbiolinks data
Tiago Chedraoui Silva (Brazil - University of São Paulo / Los Angeles - Cedars-Sinai Medical Center) wrote, 7 months ago:

There was a bug in the GDC API. They were making changes and the data was removed. They have fixed it, and the metadata should now be correct.

written 7 months ago by Tiago Chedraoui Silva

Thanks very much for your quick reply and effort. It is much appreciated :)

Reply written 6 months ago by Talip Zengin