9 months ago by
Giuseppe20
Giuseppe20 wrote:

Hi all,

I'm writing a function that takes TCGA datasets as input. I understand that, for reproducibility, the inputs should be Bioconductor objects rather than text files.

My question is: is there a package that lets me download TCGA datasets as Bioconductor (S4) objects?

Thank you so much!

9 months ago by
Naples (Italy)
mario.zanfardino150 wrote:

Hi Giuseppe, try the curatedTCGAData package:

"This package provides publicly available data from The Cancer Genome Atlas (TCGA) Bioconductor MultiAssayExperiment class objects. These objects integrate multiple assays (e.g. RNA-seq, copy number, mutation, microRNA, protein, and others) with clinical / pathological data. The MultiAssayExperiment class links assay barcodes with patient IDs, enabling harmonized subsetting of rows (features) and columns (patients / samples) across the entire experiment."


I ran the following call to download the HumanMethylation450 dataset: curatedTCGAData("SKCM", "Methylation", F). However, the output is not what I want.

I have never used this package. Can you help me resolve this problem?


This runs in my environment:

SKCM <- curatedTCGAData(diseaseCode = "SKCM", assays = "Methylation", dry.run = FALSE)

The data are returned in MultiAssayExperiment format, a Bioconductor object-oriented S4 class (https://bioconductor.org/packages/release/bioc/vignettes/MultiAssayExperiment/inst/doc/MultiAssayExperiment.html).


Thank you very much for your support! I have one last question. I ran this:

> experiments(SKCM)

ExperimentList class object of length 1:

[1] SKCM_Methylation-20160128: SummarizedExperiment with 485577 rows and 475 columns

I want to convert the ExperimentList object to a data frame. How can I do that?


You probably don't want to convert this to a data frame, but rather learn to use the SummarizedExperiment class; see for instance here and the package vignette here.
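In case a concrete starting point helps, here is a minimal sketch of working with the SummarizedExperiment directly. It uses a small toy object instead of the real SKCM download, so the matrix values, the assay name "Methylation", and the age column are all made up for illustration.

```r
library(SummarizedExperiment)
library(MultiAssayExperiment)

# Toy stand-in for a methylation assay: 4 probes x 3 samples (made-up values)
beta <- matrix(
  c(0.1, 0.8, 0.5, 0.3,
    0.2, 0.7, 0.4, 0.9,
    0.6, 0.1, 0.3, 0.2),
  nrow = 4,
  dimnames = list(paste0("cg", 1:4), paste0("sample", 1:3))
)
se <- SummarizedExperiment(assays = list(beta = beta))

# Hypothetical clinical data; row names must match the assay column names
clin <- DataFrame(age = c(61, 45, 70), row.names = paste0("sample", 1:3))
mae <- MultiAssayExperiment(experiments = list(Methylation = se), colData = clin)

# Pull one experiment out of the MultiAssayExperiment by name
meth <- mae[["Methylation"]]

dim(meth)            # features x samples
assay(meth)[1:2, ]   # the measurement matrix, subset like an ordinary matrix
colData(mae)         # patient-level data stored on the MultiAssayExperiment
```

With the real object, the same pattern would presumably be SKCM[["SKCM_Methylation-20160128"]] followed by assay() on the result.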

df <- as.data.frame(wideFormat(SKCM[1:10, 1:10, "SKCM_Methylation-20160128"], colDataCols = 1:10))

Here df is subset to 10 features, 10 patients, and 10 colData columns.