Question: Missing clinical data at TCGAbiolinks data
Talip Zengin (Mugla, Turkiye) wrote, 5 weeks ago:

Hello, I use TCGAbiolinks routinely to analyse molecular abnormalities with heatmaps and survival plots, using the commands below. For the past two days, TCGAanalyze_survival has raised the error below because the days_to_death column is missing from the expression data downloaded and prepared with GDCquery, GDCdownload and GDCprepare. What has changed in the last two days, and how can I solve this problem?

query_exp2 <- GDCquery(project = paste0("TCGA-", cancer),
                       data.category = "Transcriptome Profiling",
                       data.type = "Gene Expression Quantification", 
                       workflow.type = "HTSeq - Counts",
                       sample.type = "Primary solid Tumor",
                       barcode = uniq_tsb_exp)

GDCdownload(query_exp2, files.per.chunk = 100)

GeneExp_paired2 <- GDCprepare(query_exp2, save = TRUE, save.filename = paste0(cancer, "_GeneExp_paired2.rda"))

TCGAanalyze_survival(data = colData(GeneExp_paired2),
                     clusterCol = "subtype_iCluster.Group",
                     main = "TCGA Kaplan-Meier Survival Plot for Consensus Clusters",
                     legend = "RNA Group",
                     height = 10,
                     risk.table = FALSE,
                     conf.int = FALSE,
                     color = c("black","red","blue","green3"),
                     filename = paste0(cancer, "_survival_expression_subtypes0.png"))

Error in TCGAanalyze_survival(data = colData(GeneExp_paired2), clusterCol = "subtype_iCluster.Group", : Columns vital_status, days_to_death and days_to_last_follow_up should be in data frame

I tried the same code with two versions of TCGAbiolinks, and both gave the same error.

package.version("TCGAbiolinks")
[1] "2.9.2"
GeneExp_paired2$days_to_death
NULL

package.version("TCGAbiolinks")
[1] "2.10.5"
GeneExp_paired2$days_to_death
NULL

I also tried to get the days_to_death column from the clinical data, but it is missing there too. The code I used is below:

clinic <- GDCquery_clinic(project = paste0("TCGA-", cancer), type = "clinical")
clinic$days_to_death
NULL

Thanks in advance.


The GDC API is no longer returning days_to_death. It also seems to have been removed from some of the documentation (https://gdc.cancer.gov/clinical-data-elements), although a year of death field has been added. I have emailed the GDC to check, but for the moment the function will not work without that information.

Thanks for noticing the problem.

Comment written 5 weeks ago by Tiago Chedraoui Silva

I wonder whether "Days To Last Followup" could serve the same purpose as days to death when paired with "Vital status". That is, for a patient who has passed away, days to the last follow-up = days to death.

Comment modified 5 weeks ago, written 5 weeks ago by Anh N Tran
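
The suggestion in the comment above can be sketched in base R as follows. This is only an illustration on an invented toy table, not something TCGAbiolinks does itself: the helper `derive_survival` is hypothetical, and it assumes that the last follow-up of a deceased patient was recorded at the time of death, which may not hold for every record.

```r
# Toy clinical table. The column names follow the GDC fields named in the
# error message; the values are invented for illustration only.
clin <- data.frame(
  vital_status           = c("Dead", "Alive", "Dead", "Alive"),
  days_to_death          = c(420, NA, NA, NA),
  days_to_last_follow_up = c(NA, 900, 310, 1500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: use days_to_death for deceased patients when it is
# available, otherwise fall back to days_to_last_follow_up.
derive_survival <- function(df) {
  dead <- tolower(df$vital_status) == "dead"
  # Tolerate a table in which the days_to_death column is absent altogether.
  dtd <- if ("days_to_death" %in% names(df)) {
    df$days_to_death
  } else {
    rep(NA_real_, nrow(df))
  }
  time <- ifelse(dead & !is.na(dtd), dtd, df$days_to_last_follow_up)
  data.frame(time = time, event = as.integer(dead))
}

surv <- derive_survival(clin)
print(surv)
```

The third row shows the fallback: the patient is recorded as dead but has no days_to_death value, so the follow-up time is used instead. Whether that substitution is acceptable depends on how follow-up was recorded for the cohort.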
Answer: Missing clinical data at TCGAbiolinks data
Tiago Chedraoui Silva (Brazil - University of São Paulo / Los Angeles - Cedars-Sinai Medical Center) wrote, 5 weeks ago:

There was a bug in the GDC API: they were making changes and the data was removed. They have fixed it, and the metadata should now be correct.
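
One way to confirm that the metadata is back is to check for the three columns named in the error message before calling TCGAanalyze_survival. The sketch below uses base R only; `missing_survival_cols` is a hypothetical helper, and the two small tables are invented stand-ins for the real clinical data.

```r
# Column names taken from the TCGAanalyze_survival error message.
required_cols <- c("vital_status", "days_to_death", "days_to_last_follow_up")

# Hypothetical helper: return the required columns that are absent from df.
missing_survival_cols <- function(df, required = required_cols) {
  setdiff(required, colnames(df))
}

# Invented stand-ins for the clinical table while the API bug was present
# and after it was fixed.
before_fix <- data.frame(vital_status = "Alive", days_to_last_follow_up = 100)
after_fix  <- data.frame(vital_status = "Dead", days_to_death = 50,
                         days_to_last_follow_up = NA)

missing_survival_cols(before_fix)  # "days_to_death"
missing_survival_cols(after_fix)   # character(0)
```

With real data the same check could presumably be run on `colData(GeneExp_paired2)` or on the result of `GDCquery_clinic`. An object that was saved with `save = TRUE` while the column was missing would likely need to be regenerated with GDCprepare first.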


Thanks very much for your quick reply and effort. It is very appreciated :)

Comment written 4 weeks ago by Talip Zengin