António Miguel de Jesus Domingues
Hi all,
I am trying to use GenomicDataCommons to retrieve RPPA data and the associated patient clinical data for all cancer types in the GDC. I can find and download the relevant files with:
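library("GenomicDataCommons")
library("BiocParallel")
library("data.table")  # provides fread()
options(MulticoreParam=MulticoreParam(workers=4))
# make sure the GDC API is reachable
stopifnot(GenomicDataCommons::status()$status=="OK")
# manifest of all open-access RPPA protein expression files
ge_manifest <- files() %>%
    filter(type == 'protein_expression') %>%
    filter(access == "open") %>%
    filter(platform == "RPPA") %>%
    manifest()
head(ge_manifest)
# download the files serially, with a progress bar
fnames <- gdcdata(ge_manifest$id, progress=TRUE)
# same download in parallel, one gdcdata() call per file id
fnames <- bplapply(ge_manifest$id, gdcdata, progress=FALSE)
# read each downloaded file into a data.table
rppa_scores <- bplapply(fnames, fread)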
But I cannot figure out how to obtain the clinical data. ge_manifest only contains file IDs, which do not retrieve the patient data, and none of the columns in each _RPPA_data.tsv is an obvious patient ID:
AGID lab_id catalog_number set_id peptide_target protein_expression
Any ideas on how to retrieve the patient data for those samples?
