How to obtain clinical data from TCGA via Bioconductor GenomicDataCommons
@0df7ded5
Dear community,
I am completely new to TCGA and Bioconductor, and I am confused about how to obtain more clinical data (e.g. survival information, gender, RNA-seq read counts, ...) for some cases I have. For every "patient" I have:
gdc_file_uuid (e.g. 52F6329C-CDC6-4196-A4A0-58952332905C)
filename (e.g. UNCID_1552290.d6b7779f-a245-48ee-b9a8-2570c023a531.sorted_genome_alignments.bam)
case_uuid (e.g. 2be42cc2-9b97-4821-afc2-d1e42eb3932d)
How can I use these identifiers with the R package GenomicDataCommons to get more clinical data?
I would be grateful for any help!
Kind regards,
Hashirama
TCGA
GenomicDataCommons
@sean-davis-490
Given a set of case UUIDs, the GenomicDataCommons package can return quite a bit of clinical detail. See available_expand(cases()) for the types of data that can be returned. Here is some code to get you started:
library(GenomicDataCommons)

# Query the cases endpoint for one case UUID and expand the nested clinical records
cases() %>%
  expand(c('diagnoses', 'demographic', 'diagnoses.pathology_details')) %>%
  GenomicDataCommons::filter(case_id %in% c("2be42cc2-9b97-4821-afc2-d1e42eb3932d")) %>%
  results() %>%
  tibble::as_tibble() %>%
  dplyr::glimpse()
Results:
Rows: 1
Columns: 22
$ id <chr> "2be42cc2-9b97-4821-afc2-d1e42eb3932d"
$ slide_ids <named list> <"9a182c4a-6085-4829-a3d0-c46114f0875b", "4236…
$ submitter_slide_ids <named list> <"TCGA-HZ-7926-01Z-00-DX1", "TCGA-HZ-79…
$ disease_type <chr> "Ductal and Lobular Neoplasms"
$ analyte_ids <named list> <"05fce9a0-fa4d-4a30-ad33-a4f04bf84abf"…
$ submitter_id <chr> "TCGA-HZ-7926"
$ submitter_analyte_ids <named list> <"TCGA-HZ-7926-01A-11R", "TCGA-HZ-7926-10A-01W…
$ aliquot_ids <named list> <"1925e7c2-1730-48a4-8257-772fc4448d9b"…
$ submitter_aliquot_ids <named list> <"TCGA-HZ-7926-10A-01D-2153-01", "TCGA-HZ-7926…
$ diagnoses <named list> [<data.frame[1 x 28]>]
$ diagnosis_ids <named list> "f172c483-6888-5e06-9e5c-0b2bb4be64dd"
$ created_datetime <lgl> NA
$ sample_ids <named list> <"8b7bd592-74f0-48e3-9e21-8005ab8d419e"…
$ demographic <df[,14]> <data.frame[1 x 14]>
$ submitter_sample_ids <named list> <"TCGA-HZ-7926-01A", "TCGA-HZ-7926-10A"…
$ submitter_diagnosis_ids <named list> "TCGA-HZ-7926_diagnosis"
$ primary_site <chr> "Pancreas"
$ updated_datetime <chr> "2019-08-06T14:42:37.317113-05:00"
$ case_id <chr> "2be42cc2-9b97-4821-afc2-d1e42eb3932d"
$ portion_ids <named list> <"de913076-84e6-4ed7-8f2f-16cdd2a7f7b0"…
$ state <chr> "released"
$ submitter_portion_ids <named list> <"TCGA-HZ-7926-01A-11", "TCGA-HZ-7926-1…
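The diagnoses and demographic columns above are nested data frames, so they need one more step before they are usable for something like survival analysis. As a rough sketch only: the package also provides gdc_clinical(), which returns the clinical records as separate tables. The helper survival_table() below is hypothetical (not part of the package), and it assumes that each table carries a case_id column and that the fields vital_status, days_to_death, gender and days_to_last_follow_up are present for your cases. They may be missing or NA, so check the returned tables first.

```r
# Sketch only: column names are assumptions based on the GDC data dictionary
# and may be absent or NA for some cases.

# Fetch the clinical tables for a vector of case UUIDs.
# Needs network access and the GenomicDataCommons package.
fetch_clinical <- function(case_ids) {
  GenomicDataCommons::gdc_clinical(case_ids)
}

# Hypothetical helper (not part of the package): one row per case with
# gender, an event indicator (1 = dead) and a time in days.
survival_table <- function(demographic, diagnoses) {
  # Longest recorded follow-up per case; a case may have several diagnoses.
  follow_up <- tapply(
    diagnoses$days_to_last_follow_up, diagnoses$case_id,
    function(x) if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
  )
  dead <- tolower(demographic$vital_status) == "dead"
  last_seen <- as.numeric(follow_up[demographic$case_id])
  data.frame(
    case_id = demographic$case_id,
    gender  = demographic$gender,
    event   = as.integer(dead),
    # Time to death for deceased cases, otherwise time to last follow-up.
    time    = ifelse(dead, as.numeric(demographic$days_to_death), last_seen),
    stringsAsFactors = FALSE
  )
}

# Usage (requires network):
# clin <- fetch_clinical("2be42cc2-9b97-4821-afc2-d1e42eb3932d")
# names(clin)  # inspect which tables came back
# survival_table(clin$demographic, clin$diagnoses)
```

The resulting time and event columns are in the shape that survival-analysis functions typically expect, but it is worth inspecting the raw tables to confirm which follow-up fields are actually filled in for your cases.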
cross-posted: https://www.biostars.org/p/9499402